Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

mattjones
by New Contributor II
  • 906 Views
  • 0 replies
  • 1 kudos

Hi all, Matt Jones here. I’m on the Data Streaming team at Databricks and wanted to share a few takeaways from last week’s Current 2022 data streaming event (formerly Kafka Summit) in Austin. By far the most common question we got at the booth was ho...

Ross
by New Contributor II
  • 2095 Views
  • 1 replies
  • 0 kudos

Failed R install package of survminer in Databricks 10.4 LTS

I am trying to install the survminer package but I get a non-zero exit status. It may be due to the jpeg package, which is a prerequisite, but that also fails when installed independently. install.packages("survminer", repos = "https://cran.microsoft....

Latest Reply
shan_chandra
Databricks Employee
  • 0 kudos

@Ross Hamilton - Please follow the steps below in the given order. Run the init script below in an isolated notebook, then add the init script to the affected cluster under Advanced options > Init Scripts. %python dbutils.fs.put("/tmp/test/init_script.sh",""" #...

Dave_Nithio
by Contributor II
  • 2283 Views
  • 3 replies
  • 0 kudos

Resolved! Data Engineering with Databricks Module 6.3L Error: Autoload CSV

I am currently taking the Data Engineering with Databricks course and have run into an error. I also tried this with my own data and got a similar error. In the lab, we are using Auto Loader to read a Spark stream of CSV files saved in the DB...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

As a small aside, you don't need the third argument in the StructField definitions (nullable defaults to True).

2 More Replies
Sudd
by New Contributor II
  • 2113 Views
  • 1 replies
  • 1 kudos

Permanent UDF in Databricks using Python Wheel

I have a simple Python program that takes an integer as input and returns a string as output. I have created a wheel file for this Python code. Then I uploaded it in the Wheel section of the Databricks cluster. After this I want to create a perma...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 1 kudos

First, you will need to onboard to Unity Catalog and sign up for the Python UDF preview: https://www.databricks.com/blog/2022/07/22/power-to-the-sql-people-introducing-python-udfs-in-databricks-sql.html But I doubt it will be possible to use a wheel (but who...

Soma
by Valued Contributor
  • 4347 Views
  • 3 replies
  • 3 kudos

Resolved! Unable to create Key Vault secrets scope with NPIP Workspace

Hi team, for a secure connection we created a secured cluster with an NPIP (https://learn.microsoft.com/en-us/azure/databricks/security/secure-cluster-connectivity) workspace hosted in a private VNet. We had a hub VNet with a private endpoint for Key Vault. We pe...

Latest Reply
Debayan
Databricks Employee
  • 3 kudos

Hi @somanath Sankaran, did you face any error? If so, could you please paste a screenshot of the error here?

2 More Replies
Sascha
by New Contributor III
  • 5937 Views
  • 4 replies
  • 2 kudos

Resolved! Unable to connect to Confluent from Databricks

I'm facing the same issue as this post: https://community.databricks.com/s/question/0D58Y00009DE82zSAD/databricks-kafka-read-not-connecting In my case I'm connecting to Confluent Cloud. I'm able to ping the bootstrap server, I'm able to netstat suc...

Latest Reply
Sascha
New Contributor III
  • 2 kudos

Hi @Debayan Mukherjee, no I haven't. But with the help of Confluent I changed the statement to the one below, and somehow this solved it. inputDF = (spark .readStream .format("kafka") .option("kafka.bootstrap.servers", host) .option("kafka.ssl.en...
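The reply above is cut off, so the exact options that fixed the connection are not shown here. As a rough sketch only, a Confluent Cloud connection from Databricks is commonly configured with SASL_SSL and the PLAIN mechanism. The helper name `confluent_kafka_options` and its parameters are hypothetical; the option keys are standard Kafka source options, and the `kafkashaded` class prefix assumes the Kafka client bundled with the Databricks Runtime.

```python
def confluent_kafka_options(bootstrap_servers, api_key, api_secret):
    """Build a dict of Kafka source options for a SASL_SSL/PLAIN cluster.

    Hypothetical helper for illustration; it is not the statement from the
    truncated reply above.
    """
    # The "kafkashaded." prefix is an assumption that holds on Databricks;
    # plain Apache Spark would use org.apache.kafka... without the prefix.
    jaas = (
        "kafkashaded.org.apache.kafka.common.security.plain.PlainLoginModule "
        f'required username="{api_key}" password="{api_secret}";'
    )
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "kafka.security.protocol": "SASL_SSL",
        "kafka.sasl.mechanism": "PLAIN",
        "kafka.sasl.jaas.config": jaas,
    }


# Possible usage inside a notebook where `spark` already exists:
#   opts = confluent_kafka_options(host, key, secret)
#   inputDF = (spark.readStream.format("kafka")
#              .options(**opts)
#              .option("subscribe", "my_topic")   # placeholder topic name
#              .load())
```

Keeping the options in one dict makes it easier to compare them against what Confluent support recommends, and to load the key and secret from a secret scope rather than hard-coding them.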

3 More Replies
Stita
by New Contributor II
  • 3343 Views
  • 1 replies
  • 2 kudos

Resolved! How do we pass the row tags dynamically while reading an XML file into a dataframe?

I have a set of XML files where the row tags change dynamically. How can we handle this scenario in Databricks? df1 = spark.read.format('xml').option('rootTag','XRoot').option('rowTag','PL1PLLL').load("dbfs:/FileStore/tables/ins/") We need to pass a val...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

If the tag is dynamic but the same for the whole file, you can just use a variable:
tag = 'PL1PLLL'
df1 = spark.read.format('xml').option('rootTag', 'XRoot').option('rowTag', tag).load("dbfs:/FileStore/tables/ins/file.xml")
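If the tag is not known before the read, one possible approach (an assumption, not something suggested in this thread) is to look at the file first and pick the most frequent direct child of the root element. The function name `infer_row_tag` is hypothetical, and the sketch uses only the Python standard library, so it suits files small enough to parse on the driver.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def infer_row_tag(xml_text):
    """Return the most common tag among the root element's direct children."""
    root = ET.fromstring(xml_text)
    counts = Counter(child.tag for child in root)
    if not counts:
        raise ValueError("root element has no child elements")
    # most_common(1) yields [(tag, count)] for the most frequent tag.
    return counts.most_common(1)[0][0]


# Possible usage in a notebook (placeholder path; `spark` must already exist):
#   with open("/dbfs/FileStore/tables/ins/file.xml") as fh:
#       tag = infer_row_tag(fh.read())
#   df1 = spark.read.format('xml').option('rowTag', tag).load(...)
```

Note that ElementTree reports namespaced elements as `{namespace}tag`, so files that use XML namespaces would need the prefix stripped before the value is passed as `rowTag`.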

Taha_Hussain
by Databricks Employee
  • 2585 Views
  • 2 replies
  • 8 kudos

Register for Databricks Office Hours
October 12: 8:00 - 9:00 AM PT | 3:00 - 4:00 PM GMT
October 26: 11:00 AM - 12:00 PM PT | 6:00 - 7:00 PM GMT
Databricks Office Hours connects you directly with experts to answer all your Databricks questions. Join us to...

Latest Reply
Taha_Hussain
Databricks Employee
  • 8 kudos

Here are some of the questions and answers from the 10/12 Office Hours (note: certain questions and answers have been condensed for reposting purposes): Q: What is the best approach for moving data from on-prem S3 storage into cloud blob storage into ...

1 More Replies
Carlton
by Contributor
  • 5574 Views
  • 8 replies
  • 1 kudos

Resolved! How to Use the CharIndex with Databricks SQL

When applying the following T-SQL I don't get any errors on MS SQL Server: SELECT DISTINCT * FROM dbo.account LEFT OUTER JOIN dbo.crm2cburl_lookup ON account.Id = CRM2CBURL_Lookup.[Key] LEFT OUTER JOIN dbo.organizations ON CRM2CBURL_Lookup.CB_UR...
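For readers porting CHARINDEX calls, Spark SQL offers locate(substr, str[, pos]) and instr(str, substr), and Databricks SQL also documents a charindex function (worth confirming for your runtime). All of these are 1-based and return 0 when there is no match. The Python function below is only an illustration of that indexing convention, not any engine's implementation. It is case-sensitive, whereas T-SQL depends on collation, and it rejects an empty search string because engines differ on that case.

```python
def charindex(substr, text, start=1):
    """Illustrative 1-based search: position of substr in text, or 0 if absent."""
    if not substr:
        raise ValueError("empty search string is engine-specific; not modelled here")
    # Treat start values below 1 as 1, following T-SQL's behaviour.
    begin = max(start, 1) - 1
    # str.find returns -1 when nothing is found, so adding 1 yields 0.
    return text.find(substr, begin) + 1
```

For example, `charindex("a", "banana", 3)` skips the first "a" and reports the second one at position 4.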

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

CROSS APPLY is not supported in Databricks SQL.

7 More Replies
Sulfikkar
by Contributor
  • 16317 Views
  • 4 replies
  • 3 kudos

Resolved! Install a custom Python package from Azure DevOps Artifacts on a Databricks cluster

I am trying to install a package, which was uploaded to an Azure DevOps Artifacts feed, onto a Databricks cluster by using pip.conf. Below are the steps I followed. (Step 1: install in local IDE) Uploaded the package to the Azure DevOps feed using ...

Latest Reply
Sulfikkar
Contributor
  • 3 kudos

Thanks for your time @Debayan Mukherjee and @Kaniz Fatma. Together with the infra team we figured out that we had to whitelist the public IP of the Databricks clusters in Azure. I have checked the IP address from the Spark cluster U...

3 More Replies
joselita
by New Contributor III
  • 28400 Views
  • 4 replies
  • 8 kudos

AnalysisException: is not a Delta table.

Hello, I changed the DBR from 7.2 to 10.4 and I now receive the following error: AnalysisException: is not a Delta table. The table is created using DELTA, so it is certainly a Delta table, even though I read that from version 8 all tables are De...

Latest Reply
Ryan_Chynoweth
Esteemed Contributor
  • 8 kudos

Hi @JOSELITA MOLTISANTI, can you run the following commands and share the output?
table_name = "stg_data_load"
path = spark.sql(f"describe detail {table_name}").select("location").collect()[0][0].replace('dbfs:', '')
dbutils.fs.ls(path)
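For context, a Delta table's storage location contains a `_delta_log` directory that holds the transaction log, so its presence or absence in the listing is a useful first clue. The sketch below performs the same check programmatically on a local path. The function name `looks_like_delta_table` is hypothetical, and using it on Databricks assumes the table location is reachable through the driver's local `/dbfs` mount.

```python
import os


def looks_like_delta_table(path):
    """Return True if the directory at `path` contains a _delta_log folder."""
    # Only the presence of the transaction log directory is checked;
    # the log contents are not validated.
    return os.path.isdir(os.path.join(path, "_delta_log"))


# Possible usage on a cluster (assumes the /dbfs mount is available):
#   looks_like_delta_table("/dbfs" + path)
```

If the directory is missing, the location may hold plain Parquet files or may have been written by a different process, which would be consistent with the "is not a Delta table" error.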

3 More Replies
kfoster
by Contributor
  • 1982 Views
  • 1 replies
  • 0 kudos

Resolved! DLT Pipelines call same table

Orchestration of when DLT runs is handled by Azure Data Factory. There are scenarios where a table within a DLT pipeline needs to run on a different schedule. Is there a pipeline configuration option that allows the same table to be run by two diff...

Latest Reply
Vivian_Wilfred
Databricks Employee
  • 0 kudos

Hi @Kristian Foster​ , It should not be possible. Every pipeline owns its table and multiple pipelines cannot write to the same table.

StephanieAlba
by Databricks Employee
  • 10543 Views
  • 6 replies
  • 9 kudos

Resolved! How do I kick off Azure Data Factory from within Databricks?

I want to kick off ingestion in ADF from Databricks. When ADF ingestion is done, my DBX bronze-silver-gold pipeline follows within DBX. I see it is possible to call Databricks notebooks from ADF. Can I also go the other way? I want to start the ingest...
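One way to go in that direction is the Azure Data Factory REST API, which exposes a `createRun` operation on a pipeline. The sketch below only builds the HTTP request and does not send it. The function name and all arguments are placeholders, and obtaining the Azure AD bearer token (for example through a service principal) is assumed to happen elsewhere.

```python
import json
import urllib.request


def build_create_run_request(subscription_id, resource_group, factory_name,
                             pipeline_name, bearer_token, parameters=None):
    """Build (but do not send) a POST request that starts an ADF pipeline run."""
    url = (
        f"https://management.azure.com/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}/providers/Microsoft.DataFactory"
        f"/factories/{factory_name}/pipelines/{pipeline_name}/createRun"
        "?api-version=2018-06-01"
    )
    # Pipeline parameters, if any, are passed as a JSON object in the body.
    body = json.dumps(parameters or {}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {bearer_token}",
            "Content-Type": "application/json",
        },
    )


# Sending it from a notebook would look roughly like:
#   with urllib.request.urlopen(req) as resp:
#       run_id = json.load(resp)["runId"]
```

The response carries a run ID, which a notebook could poll before starting the bronze-silver-gold steps. Whether polling or a callback from ADF fits better depends on how long the ingestion usually takes.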

Latest Reply
KKo
Contributor III
  • 9 kudos

Are you looking to pass the output of a Databricks notebook to ADF?

5 More Replies
