Hello. I'm the current maintainer of sparklyr (a R interface for Apache Spark) and a few sparklyr extensions such as sparklyr.flint.Sparklyr was fortunate to receive some contribution from Databricks folks, which enabled R users to run `spark_connect...
Hi @Hubert Dudek​ ​ , Just a friendly follow-up. Do you still need help, or @Prabakar Ammeappin​'s response help you to find the solution? Please let us know.
Hi @Borislav Blagoev​ , Just a friendly follow-up. Do you still need help, or does the above response help you to find the solution? Please let us know.
We have a streaming data written into delta. We will not write all the partitions every day. Hence i am thinking of running compact spark job, to run only on partitions that has been modified yesterday. Is it possible to query the partitionsValues wr...
Hi @Gnanasoundari Soundarajan​ , Just a friendly follow-up. Do you still need help, or @Deepak Bhutada​ 's response help you to find the solution? Please let us know.
I'm new to Spark and trying to understand how some of its components work.I understand that once the data is loaded into the memory of separate nodes, they process partitions in parallel, within their own memory (RAM).But I'm wondering whether the in...
Hi @Narek Margaryan​, Just a friendly follow-up. Do you still need help, or does the above response help you to find the solution? Please let us know.
Hi all,I currently have a cluster configured in databricks with spark-xml (version com.databricks:spark-xml_2.12:0.13.0) which was installed using Maven. The spark-xml library itself works fine with Pyspark when I am using it in a notebook within th...
I am using Azure Databricks Runtime (DBR) 8.3 ML with Python notebook and R cells together.I want to use "tidyverse" and one of the dependency is rlang >= 0.4.10 and the base DBR 8.3 ML provides rlang @ 0.4.9. I successfully upgraded the R package t...
I have setup a virtual environment inside my existing hadoop cluster. Since the current cluster does not have spark >3 , so i installed delta spark using virtual environment. While trying to access the hdfs which is kerberose one, Getting below error...
Hi,
I would like to do some web scrapping, however I am unable to import the libraries I traditionally use for that task
import requests
from bs4 import BeautifulSoup
Hi @Ikram Mecheri​ ​ , Just a friendly follow-up. Do you still need help, or do the above responses help you find the solution? Please let us know.
Hello, how do I run a scala script from a Terminal on Databricks - Web Terminal, or from a cell with %sh just doing `scala -nc script.scala` is not working.Thanks,
Hello!I am attempting to move some machine learning code from a databricks notebook into a mlflow git repository. I am utilizing the databricks feature store to load features that have been processed. Currently I cannot get the databricks library to ...
Hello everyone! First post on the forums, been stuck at this for awhile now and cannot seem to understand why this is happening. Basically, I have been using a seems to be premade Databricks notebook from Databricks themselves for a DNS Analytics exa...
Hi @NickGoodfella​ , Just a friendly follow-up. Do you still need help, or @Sean Owen​'s response help you to find the solution? Please let us know.
May I know any suggested way to handle different environment variables for the same code base? For example, the mount point of Data Lake for DEV, UAT, and PROD. Any recommendations or best practices? Moreover, how to handle Azure DevOps?