Data Engineering

Forum Posts

yitao
by New Contributor III
  • 1635 Views
  • 6 replies
  • 11 kudos

Resolved! How to make sparklyr extension work with Databricks runtime?

Hello. I'm the current maintainer of sparklyr (an R interface for Apache Spark) and a few sparklyr extensions such as sparklyr.flint. Sparklyr was fortunate to receive some contributions from Databricks folks, which enabled R users to run `spark_connect...

Latest Reply
Kaniz
Community Manager
  • 11 kudos

Hi @yitao, just a friendly follow-up. Do you still need help, or did the above response help you find the solution? Please let us know.

  • 11 kudos
5 More Replies
Hubert-Dudek
by Esteemed Contributor III
  • 1361 Views
  • 5 replies
  • 18 kudos

Resolved! Azure: Permanently purge cluster logs

Is there any way to purge logs via the API instead of clicking that option in the UI every day?

Latest Reply
Kaniz
Community Manager
  • 18 kudos

Hi @Hubert Dudek, just a friendly follow-up. Do you still need help, or did @Prabakar Ammeappin's response help you find the solution? Please let us know.

  • 18 kudos
4 More Replies
BorislavBlagoev
by Valued Contributor III
  • 2194 Views
  • 3 replies
  • 5 kudos

Resolved! Get package from Nexus repo.

I want to pull a package from a Nexus repo in both a notebook and a job. If anyone has experience with this, please answer here!

Latest Reply
Kaniz
Community Manager
  • 5 kudos

Hi @Borislav Blagoev, just a friendly follow-up. Do you still need help, or did the above response help you find the solution? Please let us know.

  • 5 kudos
2 More Replies
soundari
by New Contributor
  • 1115 Views
  • 3 replies
  • 1 kudos

Resolved! Identify the partitionValues written yesterday from delta

We have streaming data written into Delta. We do not write to all partitions every day, so I am thinking of running a compaction Spark job only on the partitions that were modified yesterday. Is it possible to query the partitionsValues wr...

Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @Gnanasoundari Soundarajan, just a friendly follow-up. Do you still need help, or did @Deepak Bhutada's response help you find the solution? Please let us know.

  • 1 kudos
2 More Replies
narek_margaryan
by New Contributor II
  • 1447 Views
  • 3 replies
  • 3 kudos

Resolved! Do Spark nodes read data from storage in a sequence?

I'm new to Spark and trying to understand how some of its components work. I understand that once the data is loaded into the memory of separate nodes, they process partitions in parallel, within their own memory (RAM). But I'm wondering whether the in...

Latest Reply
Kaniz
Community Manager
  • 3 kudos

Hi @Narek Margaryan, just a friendly follow-up. Do you still need help, or did the above response help you find the solution? Please let us know.

  • 3 kudos
2 More Replies
brendan-b
by New Contributor II
  • 6534 Views
  • 4 replies
  • 3 kudos

spark-xml not working with Databricks Connect and Pyspark

Hi all, I currently have a cluster configured in Databricks with spark-xml (version com.databricks:spark-xml_2.12:0.13.0), which was installed using Maven. The spark-xml library itself works fine with PySpark when I am using it in a notebook within th...

Latest Reply
Kaniz
Community Manager
  • 3 kudos

Hi @Brendan Banfield, this article describes how to read and write an XML file as an Apache Spark™ data source.

  • 3 kudos
3 More Replies
User16783855534
by New Contributor III
  • 5303 Views
  • 6 replies
  • 5 kudos
Latest Reply
Kaniz
Community Manager
  • 5 kudos

Hi @Neil Patel, just a friendly follow-up. Do you still need help, or did the above responses help you find the solution? Please let us know.

  • 5 kudos
5 More Replies
dataslicer
by Contributor
  • 1779 Views
  • 3 replies
  • 2 kudos

Resolved! upgraded R package rlang to 0.4.11 on DBR 8.3 SC, but sessionInfo() still shows rlang as 0.4.9

I am using Azure Databricks Runtime (DBR) 8.3 ML with a Python notebook and R cells together. I want to use "tidyverse", and one of its dependencies is rlang >= 0.4.10, but the base DBR 8.3 ML provides rlang 0.4.9. I successfully upgraded the R package t...

Latest Reply
Kaniz
Community Manager
  • 2 kudos

Hi @Jim Huang, just a friendly follow-up. Do you still need help, or did the above responses help you find the solution? Please let us know.

  • 2 kudos
2 More Replies
delta_lake
by New Contributor
  • 1066 Views
  • 3 replies
  • 1 kudos

Delta Lake Python

I have set up a virtual environment inside my existing Hadoop cluster. Since the current cluster does not have Spark > 3, I installed delta spark in a virtual environment. While trying to access HDFS, which is Kerberized, I get the error below...

Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @Vasanth P, just a friendly follow-up. Do you still need help, or did the above responses help you find the solution? Please let us know.

  • 1 kudos
2 More Replies
IkramMecheri
by New Contributor II
  • 7256 Views
  • 5 replies
  • 2 kudos

ImportError: No module named 'bs4'

Hi, I would like to do some web scraping, but I am unable to import the libraries I traditionally use for that task: `import requests` and `from bs4 import BeautifulSoup`.

Latest Reply
Kaniz
Community Manager
  • 2 kudos

Hi @Ikram Mecheri, just a friendly follow-up. Do you still need help, or did the above responses help you find the solution? Please let us know.

  • 2 kudos
4 More Replies
User16868770416
by Contributor
  • 978 Views
  • 4 replies
  • 2 kudos
Latest Reply
Kaniz
Community Manager
  • 2 kudos

Hi @Will Block, just a friendly follow-up. Do you still need help, or did the above responses help you find the solution? Please let us know.

  • 2 kudos
3 More Replies
Zen
by New Contributor III
  • 2296 Views
  • 9 replies
  • 2 kudos

Resolved! How do I run a scala script from the Terminal

Hello, how do I run a Scala script from a terminal on Databricks (the Web Terminal), or from a cell with %sh? Just running `scala -nc script.scala` is not working. Thanks,

Latest Reply
Kaniz
Community Manager
  • 2 kudos

Hi @Zen, just a friendly follow-up. Do you still need help, or did @DARSHAN BARGAL's response help you find the solution? Please let us know.

  • 2 kudos
8 More Replies
Alex_G
by New Contributor II
  • 1106 Views
  • 3 replies
  • 5 kudos

Resolved! Databricks Feature Store in MLFlow run CLI command

Hello! I am attempting to move some machine learning code from a Databricks notebook into an MLflow Git repository. I am using the Databricks Feature Store to load features that have been processed. Currently I cannot get the databricks library to ...

Latest Reply
Kaniz
Community Manager
  • 5 kudos

Hi @Alex Graff, just a friendly follow-up. Do you still need help, or did @Sean Owen's response help you find the solution? Please let us know.

  • 5 kudos
2 More Replies
NickGoodfella
by New Contributor
  • 909 Views
  • 2 replies
  • 1 kudos

DNS_Analytics Notebook Problems

Hello everyone! First post on the forums. I have been stuck on this for a while now and cannot understand why it is happening. Basically, I have been using what seems to be a premade notebook from Databricks themselves for a DNS Analytics exa...

Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @NickGoodfella, just a friendly follow-up. Do you still need help, or did @Sean Owen's response help you find the solution? Please let us know.

  • 1 kudos
1 More Replies
EricOX
by New Contributor
  • 2746 Views
  • 3 replies
  • 3 kudos

Resolved! How to handle configuration for different environment (e.g. DEV, PROD)?

Is there a suggested way to handle different environment variables for the same code base? For example, the Data Lake mount point for DEV, UAT, and PROD. Any recommendations or best practices? Also, how should this be handled with Azure DevOps?

Latest Reply
Kaniz
Community Manager
  • 3 kudos

Hi @Eric Yeung, just a friendly follow-up. Do you still need help, or did the above responses help you find the solution? Please let us know.

  • 3 kudos
2 More Replies