cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Forum Posts

ramankr48
by Contributor II
  • 11153 Views
  • 3 replies
  • 6 kudos
  • 11153 Views
  • 3 replies
  • 6 kudos
Latest Reply
Anonymous
Not applicable
  • 6 kudos

Hi @Raman Gupta​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you.Thanks...

  • 6 kudos
2 More Replies
parulpaul
by New Contributor III
  • 2815 Views
  • 2 replies
  • 7 kudos
  • 2815 Views
  • 2 replies
  • 7 kudos
Latest Reply
parulpaul
New Contributor III
  • 7 kudos

No solution found

  • 7 kudos
1 More Replies
gud4eve
by New Contributor III
  • 6089 Views
  • 5 replies
  • 5 kudos

Resolved! Why is Databricks on AWS cluster start time less than 5 mins and EMR cluster start time is 15 mins?

We are migrating from AWS EMR to Databricks. One thing that we have noticed during the POCs is that Databricks cluster of same size and instance type takes much lesser time to start compared to EMR.My understanding is Databricks also would be request...

  • 6089 Views
  • 5 replies
  • 5 kudos
Latest Reply
karthik_p
Esteemed Contributor
  • 5 kudos

@gud4eve​ what kind of cluster you are using, have you configured pools. if not as @Werner Stinckens​ said there might be chance Databricks worked hard to get provisioning of instances in faster way

  • 5 kudos
4 More Replies
Raagavi
by New Contributor
  • 2785 Views
  • 1 replies
  • 1 kudos

Is there a way to read the CSV files automatically from on-premises network locations and write back to the same from Databricks?

Is there a way to read the CSV files automatically from on-premises network locations and write back to the same from Databricks? 

  • 2785 Views
  • 1 replies
  • 1 kudos
Latest Reply
Debayan
Databricks Employee
  • 1 kudos

Hi @Raagavi Rajagopal​ , you can access files on mounted object storage (just an example) or files, please refer: https://docs.databricks.com/files/index.html#access-files-on-mounted-object-storageAnd in the DBFS , CSV files can be read and write fr...

  • 1 kudos
mattjones
by New Contributor II
  • 989 Views
  • 0 replies
  • 1 kudos

Hi all - Matt Jones here, I’m on the Data Streaming team at Databricks and wanted to share a few takeaways from last week’s Current 2022 data streamin...

Hi all - Matt Jones here, I’m on the Data Streaming team at Databricks and wanted to share a few takeaways from last week’s Current 2022 data streaming event (formerly Kafka Summit) in Austin.By far the most common question we got at the booth was ho...

Current 2022 Banner Image
  • 989 Views
  • 0 replies
  • 1 kudos
Ross
by New Contributor II
  • 2212 Views
  • 1 replies
  • 0 kudos

Failed R install package of survminer in Databricks 10.4 LTS

I am trying to install the survminer package but I get a non-zero exit status. It may be due to the jpeg package which is a pre-requisite but this also fails when installing independently.install.packages("survminer", repos = "https://cran.microsoft....

  • 2212 Views
  • 1 replies
  • 0 kudos
Latest Reply
shan_chandra
Databricks Employee
  • 0 kudos

@Ross Hamilton​ - Please follow the below steps in the given orderRun the below init script in an isolated notebook and add the init script to the issue cluster > Advanced options > Init Scripts%python dbutils.fs.put("/tmp/test/init_script.sh",""" #...

  • 0 kudos
Dave_Nithio
by Contributor II
  • 2417 Views
  • 3 replies
  • 0 kudos

Resolved! Data Engineering with Databricks Module 6.3L Error: Autoload CSV

I am currently taking the Data Engineering with Databricks course and have run into an error. I have also attempted this with my own data and had a similar error. In the lab, we are using autoloader to read a spark stream of csv files saved in the DB...

  • 2417 Views
  • 3 replies
  • 0 kudos
Latest Reply
Anonymous
Not applicable
  • 0 kudos

As a small aside, you don't need the third argument in the structfields

  • 0 kudos
2 More Replies
Sudd
by New Contributor II
  • 2203 Views
  • 1 replies
  • 1 kudos

Permanent UDF in Databricks using Python Wheel

I have a simple Python Program, which takes a Integer as a input and gives a string as a output.I have created the wheel file for this Python code.Then I have uploaded it in the Wheel section of Databricks cluster.After this I want to create a perma...

  • 2203 Views
  • 1 replies
  • 1 kudos
Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 1 kudos

First, you will need to onboard the unity catalog and sign for Python UDF preview https://www.databricks.com/blog/2022/07/22/power-to-the-sql-people-introducing-python-udfs-in-databricks-sql.htmlBut I doubt it will be possible to use a wheel (but who...

  • 1 kudos
Soma
by Valued Contributor
  • 4606 Views
  • 3 replies
  • 3 kudos

Resolved! Unable to create Key Vault secrets scope with NPIP Workspace

Hi Team for secure connection we created secured cluster withNPIP(https://learn.microsoft.com/en-us/azure/databricks/security/secure-cluster-connectivity) WORKSPACE hosted in a private VNET.We had a hub vnet with private endpoint for key vault ,We pe...

  • 4606 Views
  • 3 replies
  • 3 kudos
Latest Reply
Debayan
Databricks Employee
  • 3 kudos

Hi @somanath Sankaran​ , did you face any error? if yes, could you please paste the error snapshot here?

  • 3 kudos
2 More Replies
Sascha
by New Contributor III
  • 6104 Views
  • 4 replies
  • 2 kudos

Resolved! Unable to connect to Confluent from Databricks

I'm facing the same issue as this post: https://community.databricks.com/s/question/0D58Y00009DE82zSAD/databricks-kafka-read-not-connecting   In my case I'm connecting to Confluent Cloud. I'm able to ping the bootstrap server, I'm able to netstat suc...

  • 6104 Views
  • 4 replies
  • 2 kudos
Latest Reply
Sascha
New Contributor III
  • 2 kudos

Hi @Debayan Mukherjee​ , no I haven't.But with the help of Confluent I changed the statement to the below, and somehow this solved it.inputDF = (spark .readStream .format("kafka") .option("kafka.bootstrap.servers", host) .option("kafka.ssl.en...

  • 2 kudos
3 More Replies
Liza
by New Contributor
  • 765 Views
  • 0 replies
  • 0 kudos

Work that involves Shift work, difficulties with sleep, and varying circumstances It is possible that this Modalert guide may not cover all of the pos...

Work that involves Shift work, difficulties with sleep, and varying circumstances It is possible that this Modalert guide may not cover all of the possible applications for Modalert 200.Modafinil is included in the formulation known as Modalert 200 T...

  • 765 Views
  • 0 replies
  • 0 kudos
Stita
by New Contributor II
  • 3474 Views
  • 1 replies
  • 2 kudos

Resolved! How do we pass the row tags dynamically while reading a XML file into a dataframe?

I have a set of xml files where the row tags change dynamically. How can we achieve this scenario in databricks.df1=spark.read.format('xml').option('rootTag','XRoot').option('rowTag','PL1PLLL').load("dbfs:/FileStore/tables/ins/")We need to pass a val...

  • 3474 Views
  • 1 replies
  • 2 kudos
Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

If it is dynamically for the whole file, you can just use variabletag = 'PL1PLLL' df1=spark.read.format('xml').option('rootTag','XRoot').option('rowTag' ,tag).load("dbfs:/FileStore/tables/ins/file.xml")

  • 2 kudos
Taha_Hussain
by Databricks Employee
  • 2683 Views
  • 2 replies
  • 8 kudos

Register for Databricks Office HoursOctober 12: 8:00 - 9:00 AM PT | 3:00 - 4:00 PM GMTOctober 26: 11:00 AM - 12:00 PM PT | 6:00 - 7:00 PM GMT Databric...

Register for Databricks Office HoursOctober 12: 8:00 - 9:00 AM PT | 3:00 - 4:00 PM GMTOctober 26: 11:00 AM - 12:00 PM PT | 6:00 - 7:00 PM GMTDatabricks Office Hours connects you directly with experts to answer all your Databricks questions.Join us to...

  • 2683 Views
  • 2 replies
  • 8 kudos
Latest Reply
Taha_Hussain
Databricks Employee
  • 8 kudos

Here are some of the Questions and Answers from the 10/12 Office Hours (note: certain questions and answers have been condensed for reposting purposes):Q: What is the best approach for moving data from on-prem S3 storage into cloud blob storage into ...

  • 8 kudos
1 More Replies
Carlton
by Contributor
  • 5822 Views
  • 8 replies
  • 1 kudos

Resolved! How to Use the CharIndex with Databricks SQL

When applying the following T-SQL I don't get any errors on MS SQL ServerSELECT DISTINCT *   FROM dbo.account LEFT OUTER JOIN dbo.crm2cburl_lookup ON account.Id = CRM2CBURL_Lookup.[Key] LEFT OUTER JOIN dbo.organizations ON CRM2CBURL_Lookup.CB_UR...

  • 5822 Views
  • 8 replies
  • 1 kudos
Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

cross apply is not a function in databricks sql.

  • 1 kudos
7 More Replies

Join Us as a Local Community Builder!

Passionate about hosting events and connecting people? Help us grow a vibrant local community—sign up today to get started!

Sign Up Now
Labels