Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Raagavi
by New Contributor
  • 2862 Views
  • 1 reply
  • 1 kudos

Is there a way to automatically read CSV files from on-premises network locations and write back to them from Databricks?

Latest Reply
Debayan
Databricks Employee
  • 1 kudos

Hi @Raagavi Rajagopal, you can access files on mounted object storage (as one example); please refer to https://docs.databricks.com/files/index.html#access-files-on-mounted-object-storage. And in DBFS, CSV files can be read and written fr...
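
For readers landing here, a minimal sketch of that pattern, assuming an existing mount point (the mount name and folder paths below are hypothetical):

# Read CSV files from a hypothetical mount backed by the network location.
df = (spark.read
      .format("csv")
      .option("header", "true")
      .option("inferSchema", "true")
      .load("/mnt/onprem/incoming/"))

# ...transform df as needed...

# Write the result back to the same mounted location.
(df.write
   .format("csv")
   .option("header", "true")
   .mode("overwrite")
   .save("/mnt/onprem/outgoing/"))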

mattjones
by New Contributor II
  • 1040 Views
  • 0 replies
  • 1 kudos

Hi all - Matt Jones here, I’m on the Data Streaming team at Databricks and wanted to share a few takeaways from last week’s Current 2022 data streamin...

Hi all - Matt Jones here, I'm on the Data Streaming team at Databricks and wanted to share a few takeaways from last week's Current 2022 data streaming event (formerly Kafka Summit) in Austin. By far the most common question we got at the booth was ho...

Ross
by New Contributor II
  • 2272 Views
  • 1 reply
  • 0 kudos

Failed R install package of survminer in Databricks 10.4 LTS

I am trying to install the survminer package but I get a non-zero exit status. It may be due to the jpeg package, which is a prerequisite but also fails when installed independently: install.packages("survminer", repos = "https://cran.microsoft....

Latest Reply
shan_chandra
Databricks Employee
  • 0 kudos

@Ross Hamilton - Please follow the below steps in the given order. Run the below init script in an isolated notebook and add the init script to the affected cluster under Advanced options > Init Scripts: %python dbutils.fs.put("/tmp/test/init_script.sh",""" #...
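
The gist of that approach, as a hedged sketch: write an init script to DBFS that installs the system libraries the R jpeg package compiles against, then attach it to the cluster. The script path and exact package list are assumptions:

# Write a cluster init script to DBFS (path is hypothetical).
dbutils.fs.put(
    "/tmp/test/init_script.sh",
    """#!/bin/bash
apt-get update
apt-get install -y libjpeg-dev libpng-dev
""",
    True,  # overwrite
)
# Then add dbfs:/tmp/test/init_script.sh under the cluster's
# Advanced options > Init Scripts and restart the cluster.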

Dave_Nithio
by Contributor II
  • 2492 Views
  • 3 replies
  • 0 kudos

Resolved! Data Engineering with Databricks Module 6.3L Error: Autoload CSV

I am currently taking the Data Engineering with Databricks course and have run into an error. I have also attempted this with my own data and had a similar error. In the lab, we are using Auto Loader to read a Spark stream of CSV files saved in the DB...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

As a small aside, you don't need the third argument in the StructFields.
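
For context: the third StructField argument is nullable, which defaults to True in PySpark, so it can simply be omitted. A minimal sketch of the lab's Auto Loader pattern with an explicit schema (paths and column names are hypothetical):

from pyspark.sql.types import StructType, StructField, StringType, DoubleType

# The third argument (nullable) defaults to True, so two arguments suffice.
schema = StructType([
    StructField("id", StringType()),
    StructField("amount", DoubleType()),
])

df = (spark.readStream
      .format("cloudFiles")                 # Auto Loader
      .option("cloudFiles.format", "csv")
      .schema(schema)
      .load("dbfs:/mnt/lab/raw/"))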

2 More Replies
Sudd
by New Contributor II
  • 2268 Views
  • 1 reply
  • 1 kudos

Permanent UDF in Databricks using Python Wheel

I have a simple Python program which takes an integer as input and returns a string as output. I have created the wheel file for this Python code. Then I uploaded it in the wheel section of the Databricks cluster. After this I want to create a perma...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 1 kudos

First, you will need to onboard Unity Catalog and sign up for the Python UDF preview: https://www.databricks.com/blog/2022/07/22/power-to-the-sql-people-introducing-python-udfs-in-databricks-sql.html. But I doubt it will be possible to use a wheel (but who...
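
For what it's worth, a sketch of what a permanent Python UDF looks like under that preview, created with SQL from a notebook. The catalog, schema, and function body are assumptions, and importing from a wheel may not be supported:

# Hypothetical catalog/schema; requires Unity Catalog and the Python UDF preview.
spark.sql("""
CREATE OR REPLACE FUNCTION main.default.int_to_label(x INT)
RETURNS STRING
LANGUAGE PYTHON
AS $$
return f"label_{x}"
$$
""")

# Usage:
spark.sql("SELECT main.default.int_to_label(42)").show()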

Soma
by Valued Contributor
  • 4734 Views
  • 3 replies
  • 3 kudos

Resolved! Unable to create Key Vault secrets scope with NPIP Workspace

Hi team, for a secure connection we created a secured cluster with NPIP (https://learn.microsoft.com/en-us/azure/databricks/security/secure-cluster-connectivity), with the workspace hosted in a private VNet. We had a hub VNet with a private endpoint for Key Vault. We pe...

Latest Reply
Debayan
Databricks Employee
  • 3 kudos

Hi @somanath Sankaran, did you face any error? If yes, could you please paste the error snapshot here?

2 More Replies
Sascha
by New Contributor III
  • 6237 Views
  • 4 replies
  • 2 kudos

Resolved! Unable to connect to Confluent from Databricks

I'm facing the same issue as this post: https://community.databricks.com/s/question/0D58Y00009DE82zSAD/databricks-kafka-read-not-connecting. In my case I'm connecting to Confluent Cloud. I'm able to ping the bootstrap server, I'm able to netstat suc...

Latest Reply
Sascha
New Contributor III
  • 2 kudos

Hi @Debayan Mukherjee, no I haven't. But with the help of Confluent I changed the statement to the below, and somehow this solved it: inputDF = (spark.readStream.format("kafka").option("kafka.bootstrap.servers", host).option("kafka.ssl.en...
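
For anyone hitting the same wall, the usual shape of a Confluent Cloud read from Structured Streaming looks roughly like this (a sketch; host, api_key, api_secret, and the topic name are assumptions):

# Confluent Cloud uses SASL_SSL with an API key/secret pair (hypothetical here).
# On Databricks the Kafka classes are shaded, hence the kafkashaded prefix.
jaas = (
    "kafkashaded.org.apache.kafka.common.security.plain.PlainLoginModule "
    f'required username="{api_key}" password="{api_secret}";'
)

inputDF = (spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", host)        # e.g. pkc-....confluent.cloud:9092
    .option("kafka.security.protocol", "SASL_SSL")
    .option("kafka.sasl.mechanism", "PLAIN")
    .option("kafka.sasl.jaas.config", jaas)
    .option("subscribe", "my_topic")                # hypothetical topic
    .option("startingOffsets", "earliest")
    .load())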

3 More Replies
Stita
by New Contributor II
  • 3570 Views
  • 1 reply
  • 2 kudos

Resolved! How do we pass the row tags dynamically while reading a XML file into a dataframe?

I have a set of XML files where the row tags change dynamically. How can we achieve this scenario in Databricks? df1=spark.read.format('xml').option('rootTag','XRoot').option('rowTag','PL1PLLL').load("dbfs:/FileStore/tables/ins/") We need to pass a val...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

If it is dynamic for the whole file, you can just use a variable: tag = 'PL1PLLL'; df1 = spark.read.format('xml').option('rootTag','XRoot').option('rowTag', tag).load("dbfs:/FileStore/tables/ins/file.xml")
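
If instead the tag varies per file, one hedged way to handle it is to keep a mapping of file to row tag and union the results (the file names, tags, and the spark-xml library being attached to the cluster are assumptions):

from functools import reduce

# Hypothetical mapping of each XML file to its row tag.
tags_by_path = {
    "dbfs:/FileStore/tables/ins/file1.xml": "PL1PLLL",
    "dbfs:/FileStore/tables/ins/file2.xml": "PL2PLLL",
}

dfs = [
    spark.read.format("xml")
         .option("rootTag", "XRoot")
         .option("rowTag", tag)
         .load(path)
    for path, tag in tags_by_path.items()
]

# Union by column name; the per-file schemas must be compatible.
df_all = reduce(lambda a, b: a.unionByName(b, allowMissingColumns=True), dfs)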

Taha_Hussain
by Databricks Employee
  • 2783 Views
  • 2 replies
  • 8 kudos

Register for Databricks Office Hours. October 12: 8:00 - 9:00 AM PT | 3:00 - 4:00 PM GMT. October 26: 11:00 AM - 12:00 PM PT | 6:00 - 7:00 PM GMT. Databric...

Register for Databricks Office Hours. October 12: 8:00 - 9:00 AM PT | 3:00 - 4:00 PM GMT. October 26: 11:00 AM - 12:00 PM PT | 6:00 - 7:00 PM GMT. Databricks Office Hours connects you directly with experts to answer all your Databricks questions. Join us to...

Latest Reply
Taha_Hussain
Databricks Employee
  • 8 kudos

Here are some of the questions and answers from the 10/12 Office Hours (note: certain questions and answers have been condensed for reposting). Q: What is the best approach for moving data from on-prem S3 storage into cloud blob storage into ...

1 More Reply
Carlton
by Contributor
  • 5980 Views
  • 8 replies
  • 1 kudos

Resolved! How to Use the CharIndex with Databricks SQL

When applying the following T-SQL I don't get any errors on MS SQL Server: SELECT DISTINCT * FROM dbo.account LEFT OUTER JOIN dbo.crm2cburl_lookup ON account.Id = CRM2CBURL_Lookup.[Key] LEFT OUTER JOIN dbo.organizations ON CRM2CBURL_Lookup.CB_UR...

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

CROSS APPLY is not a function in Databricks SQL.
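
As a hedged pointer for the CHARINDEX part of the question: Spark SQL's locate() (or position()) returns the 1-based position of a substring, 0 if absent. A sketch with hypothetical table and column names:

# T-SQL: CHARINDEX('@', email)  ->  Spark SQL: locate('@', email)
spark.sql("""
    SELECT id,
           locate('@', email) AS at_position
    FROM   accounts
""").show()
# For CROSS APPLY itself, Spark SQL typically reaches for
# LATERAL VIEW explode(...) instead.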

7 More Replies
Sulfikkar
by Contributor
  • 17191 Views
  • 4 replies
  • 3 kudos

Resolved! install a custom python package from azure devops artifact to databricks cluster

I am trying to install a package that was uploaded to an Azure DevOps artifact feed onto the Databricks cluster by using pip.conf. Below are the steps I followed. (Step 1: install in the local IDE.) Uploaded the package to the Azure DevOps feed using ...

Latest Reply
Sulfikkar
Contributor
  • 3 kudos

Thanks for your time @Debayan Mukherjee and @Kaniz Fatma. We figured out the issue together with the infra team: we had to whitelist the public IPs of the Databricks clusters in Azure. I have checked the IP address from the Spark cluster U...
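
For anyone retracing those steps, a sketch of the pip.conf wiring itself (the organization, feed, secret scope, and key names are all hypothetical, and the PAT should come from a secret rather than be hard-coded):

# Fetch a DevOps personal access token from a hypothetical secret scope.
pat = dbutils.secrets.get(scope="devops", key="artifacts-pat")

# Point pip at the Azure DevOps Artifacts feed on the driver.
dbutils.fs.put(
    "file:/etc/pip.conf",
    f"""[global]
extra-index-url=https://build:{pat}@pkgs.dev.azure.com/myorg/_packaging/myfeed/pypi/simple/
""",
    True,  # overwrite
)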

3 More Replies
joselita
by New Contributor III
  • 29438 Views
  • 4 replies
  • 8 kudos

AnalysisException: is not a Delta table.

Hello, I changed the DBR from 7.2 to 10.4 and I receive the following error: AnalysisException: is not a Delta table. The table is created using DELTA, so it certainly is a Delta table, even though I read that from version 8 all tables are De...

Latest Reply
Ryan_Chynoweth
Esteemed Contributor
  • 8 kudos

Hi @JOSELITA MOLTISANTI, can you run the following commands and share the output?

table_name = "stg_data_load"
path = spark.sql(f"describe detail {table_name}").select("location").collect()[0][0].replace('dbfs:', '')
dbutils.fs.ls(path)

3 More Replies
kfoster
by Contributor
  • 2154 Views
  • 1 replies
  • 0 kudos

Resolved! DLT Pipelines call same table

Orchestration of when DLT runs is handled by Azure Data Factory. There are scenarios where a table within a DLT pipeline needs to run on a different schedule. Is there a pipeline configuration option that allows the same table to be run by two diff...

Latest Reply
Vivian_Wilfred
Databricks Employee
  • 0 kudos

Hi @Kristian Foster, it should not be possible: every pipeline owns its tables, and multiple pipelines cannot write to the same table.

