Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

William_Scardua
by Valued Contributor
  • 14846 Views
  • 7 replies
  • 3 kudos

uuid in Merge

Hi guys, I'm trying to use uuid in the merge but I always get an error...

import uuid

( df_events.alias("events").merge(
    source = df_updates.alias("updates"),
    condition = "events.cod = updates.cod and events.num = updates.num"
).whenMatch...

Latest Reply
Anonymous
Not applicable

Hi @William Scardua, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Th...
6 More Replies
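A common cause of this error is passing a Python `uuid.UUID` object where the Delta merge API expects a SQL expression string or Column. A minimal sketch of the two usual fixes, assuming the DeltaTable-style merge from the post (the column names `batch_id`/`row_id` are hypothetical):

```python
import uuid

# A Python UUID is not a Spark Column, so it can't go directly into the
# merge's update/insert maps. Either embed it as a quoted string literal
# (one value for the whole run) or use Spark SQL's built-in uuid() function
# (a fresh value per row).
run_id_literal = f"'{uuid.uuid4()}'"   # same uuid for every matched row
per_row_uuid = "uuid()"                # Spark SQL builtin, evaluated per row

# On a cluster, these maps would be passed to .whenMatchedUpdate(set=...)
# or .whenNotMatchedInsert(values=...) on the merge builder.
update_set = {"events.batch_id": run_id_literal, "events.row_id": per_row_uuid}
print(update_set)
```

The literal form is handy for tagging all rows touched by one run; `uuid()` is the right choice when each row needs its own identifier.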
sm1
by New Contributor III
  • 5713 Views
  • 5 replies
  • 3 kudos

New Visualization Tools

How do I add a new visualization tool option to my Databricks workspace? I don't see a plus sign that lets you choose "Visualization" in my display command results :( Please help.

[Attachment: Capture]
Latest Reply
Anonymous
Not applicable

Hi @Suky Muliadikara, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. T...
4 More Replies
ramankr48
by Contributor II
  • 4277 Views
  • 2 replies
  • 3 kudos

Issue with identity key column in databricks?

For the identity key I've used both GENERATED ALWAYS AS IDENTITY (START WITH 1 INCREMENT BY 1) and GENERATED BY DEFAULT AS IDENTITY (START WITH 1 INCREMENT BY 1), but in both cases, if I run my script once then it is fine (the identity key is working as...

Latest Reply
lizou
Contributor III

Yes, the BY DEFAULT option allows duplicate values by design. I would avoid it and use only GENERATED ALWAYS AS IDENTITY. The BY DEFAULT option is worse than not using an identity column at all: with BY DEFAULT, if I forget to set the starting value, the ID...
1 More Replies
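The reply's point is that GENERATED ALWAYS rejects user-supplied ids outright, while GENERATED BY DEFAULT silently accepts them and therefore permits duplicates. A minimal DDL sketch (the table and column names are hypothetical; the statement would be run via `spark.sql` on a Databricks cluster with Delta):

```python
# Hypothetical table: GENERATED ALWAYS makes the engine the only source of
# id values, so explicit (possibly duplicate) ids are rejected at insert time.
ddl = """
CREATE TABLE IF NOT EXISTS events (
  id   BIGINT GENERATED ALWAYS AS IDENTITY (START WITH 1 INCREMENT BY 1),
  name STRING
) USING DELTA
"""
# On a cluster:
#   spark.sql(ddl)
#   INSERT INTO events (name) VALUES ('a')         -- id assigned automatically
#   INSERT INTO events (id, name) VALUES (7, 'b')  -- fails with GENERATED ALWAYS
print(ddl.strip().splitlines()[0])
```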
bozhu
by Contributor
  • 2217 Views
  • 1 replies
  • 4 kudos

DLT DataPlaneException

I created an Azure Databricks workspace with my Visual Studio Subscription; so far everything has been working as expected, although I have requested a CPU core limit increase once. I am now getting this "DataPlaneException" error in the DLT pipeline during "Wa...

Latest Reply
karthik_p
Esteemed Contributor

@Bo Zhu, can you share more of the error log? It looks like a quota limit was exceeded. Did you get a chance to check the quota in the Azure portal and see whether enough cores exist for the config you selected? Try selecting another cluster config and validate.

AmineHY
by Contributor
  • 3681 Views
  • 1 replies
  • 4 kudos

My DLT pipeline returns ACL Verification Failed

Python command:

df = spark.read.format('csv').option('sep', ';').option("recursiveFileLookup", "true").load('dbfs:/***/data_files/PREVISIONS/')

Here is the content of the folder. Each folder contains the following files: Full log: org.apache.spark.sql.stre...

[Attachment: image.png]
Latest Reply
AmineHY
Contributor

Yes, some of the files I don't have the right to access (mistakenly). In this case, how do you think I can tell DLT to handle this exception and ignore those files, since I can read some files but not all?

Retko
by Contributor
  • 2181 Views
  • 1 replies
  • 2 kudos

How to jump back to latest positions in the Notebook

Hi, when developing I often need to jump around the notebook to fix and run things. It would be really helpful if I could jump back to the several latest positions (cells), similar to SHIFT+F5 in Office Word. Is there a way to do this in Databricks now? Than...

Latest Reply
karthik_p
Esteemed Contributor

@Retko Okter, go to any notebook and click Help --> Keyboard Shortcuts; it will show all the shortcuts available.

db-avengers2rul
by Contributor II
  • 2589 Views
  • 2 replies
  • 3 kudos

course code - 'ACAD-INTRO-DELTALAKE' Notebook has errors

Dear DB Team, while following a course from DB Academy (course code 'ACAD-INTRO-DELTALAKE') I noticed the notebooks have errors. Can you please check? I have also attached the notebook. Regards, Rakesh

Latest Reply
Anonymous
Not applicable

Hi @Rakesh Reddy Gopidi, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from yo...
1 More Replies
BradSheridan
by Valued Contributor
  • 4487 Views
  • 3 replies
  • 4 kudos

Resolved! dropDuplicates

Afternoon, Community!! I've done some research today and found multiple great approaches to accomplish what I'm trying to do, but I'm having trouble understanding exactly which is best suited for my use case. Suppose you're running Auto Loader on S3 and u...

Latest Reply
AmanSehgal
Honored Contributor III

If your records are partitioned to narrow down the search, can you try writing upsert logic after the Auto Loader code? The upsert logic will insert, update, or drop rows per your conditions.
2 More Replies
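The accepted suggestion (upsert logic after the Auto Loader code) is typically wired up with `foreachBatch`. A sketch under stated assumptions: it presumes a Databricks cluster where `spark` is available and `target` is a `delta.tables.DeltaTable`; the key column `id` and the paths are hypothetical.

```python
# Dedupe each micro-batch on its key, then merge it into the Delta target.
# `target` is assumed to be a delta.tables.DeltaTable bound on the cluster.
def upsert_batch(target, micro_batch_df, batch_id):
    deduped = micro_batch_df.dropDuplicates(["id"])
    (target.alias("t")
           .merge(deduped.alias("s"), "t.id = s.id")
           .whenMatchedUpdateAll()
           .whenNotMatchedInsertAll()
           .execute())

# Wiring into the Auto Loader stream (cluster only; paths hypothetical):
# (spark.readStream.format("cloudFiles")
#        .option("cloudFiles.format", "json")
#        .load("s3://bucket/landing/")
#        .writeStream
#        .foreachBatch(lambda df, bid: upsert_batch(target, df, bid))
#        .start())
```

Deduplicating within the batch before the merge matters: Delta's MERGE fails if multiple source rows match the same target row.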
kkumar
by New Contributor III
  • 24408 Views
  • 3 replies
  • 7 kudos

Resolved! can we update a Parquet file??

I have copied a table into a Parquet file; now can I update a row or a column in the Parquet file without rewriting all the data (the data is huge), using Databricks or ADF? Thank you

Latest Reply
youssefmrini
Databricks Employee

You can only append data with Parquet; that's why you need to convert your Parquet table to Delta. It will be much easier.
2 More Replies
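The reply's suggestion can be sketched as two SQL statements: convert the existing Parquet directory in place to Delta, then update rows without rewriting the whole dataset. The path and column names below are hypothetical; the statements would be run via `spark.sql` on a Databricks cluster.

```python
# CONVERT TO DELTA rewrites only metadata (a Delta log over the existing
# Parquet files), after which row-level UPDATE becomes possible.
convert_sql = "CONVERT TO DELTA parquet.`/mnt/raw/my_table`"
update_sql = "UPDATE delta.`/mnt/raw/my_table` SET amount = 0 WHERE id = 42"

# On a cluster:
#   spark.sql(convert_sql)
#   spark.sql(update_sql)
for stmt in (convert_sql, update_sql):
    print(stmt)
```

Note the update still rewrites the files containing matching rows, but not the rest of the table.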
Anonymous
by New Contributor III
  • 13384 Views
  • 5 replies
  • 5 kudos

Resolved! Override and Merge mode write using AutoLoader in Databricks

We are reading files using Auto Loader in Databricks. The source system gives a full snapshot of the complete data in its files, so we want to read the data and write it to a Delta table in overwrite mode, so that all old data is replaced by the new data. Similarly, for oth...

Latest Reply
-werners-
Esteemed Contributor III

@Ranjeet Jaiswal, AFAIK merge is supported: https://docs.databricks.com/_static/notebooks/merge-in-streaming.html. This link does some aggregation, but that can be omitted, of course. The interesting parts here are outputMode("update") and the foreachBat...
4 More Replies
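For the full-snapshot case in the question, one common pattern (a sketch, not the thread's verbatim answer) is to overwrite the Delta table on each micro-batch via `foreachBatch`, so the table always reflects the latest snapshot. The table name is hypothetical and this assumes a Databricks cluster.

```python
# Each Auto Loader micro-batch replaces the table contents entirely,
# matching a source that delivers full snapshots rather than increments.
def overwrite_batch(micro_batch_df, batch_id):
    (micro_batch_df.write
                   .format("delta")
                   .mode("overwrite")
                   .saveAsTable("bronze_snapshot"))

# Wiring (cluster only; options hypothetical):
# (spark.readStream.format("cloudFiles")
#        .option("cloudFiles.format", "csv")
#        .load("s3://bucket/snapshots/")
#        .writeStream
#        .foreachBatch(overwrite_batch)
#        .trigger(availableNow=True)
#        .start())
```

For incremental sources, the merge-in-streaming approach linked in the reply replaces the overwrite with a Delta MERGE inside the same `foreachBatch` hook.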
Oliver_Floyd
by Contributor
  • 3861 Views
  • 4 replies
  • 6 kudos

Where to find documentation about : spark.databricks.driver.strace.enabled

Hello, for a support request, Microsoft support asked me to add spark.databricks.driver.strace.enabled true to my cluster configuration. MS was not able to send me a link to the documentation, and I did not find it on the Databricks website. Can someone he...

Latest Reply
Oliver_Floyd
Contributor

Yes, no problem. I have a Python program, called "post ingestion", that runs on a Databricks job cluster during the night and consists of: inserting data into a Delta Lake table, executing an OPTIMIZE command on that table, executing a VACUUM command on that t...
3 More Replies
Dusko
by New Contributor III
  • 2926 Views
  • 2 replies
  • 3 kudos

Resolved! Not receiving password reset email

Hi, our admin created a new user in https://accounts.cloud.databricks.com/ with my email dusan.vystrcil@datasentics.com, but I didn't receive any confirmation email. When I try to sign in and click on "reset password", I still don't receive any emai...

Latest Reply
Anonymous
Not applicable

Hi @karthik p, thank you for reaching out, and we're sorry to hear about this log-in issue! We have a Community Edition login troubleshooting post on Community. Please take a look and follow the troubleshooting steps. If the steps do not resolve ...
1 More Replies
siva_thiru
by Contributor
  • 1642 Views
  • 0 replies
  • 6 kudos

Happy to share that #WAVICLE was able to do a hands-on workshop on Databricks Notebook, Databricks SQL, and Databricks Cluster fundamentals...

Happy to share that #WAVICLE was able to do a hands-on workshop on Databricks Notebook, Databricks SQL, and Databricks Cluster fundamentals with KCT College, Coimbatore, India.

[Attachment: Workshop Standee]
Deiry
by New Contributor III
  • 2045 Views
  • 1 replies
  • 3 kudos

Hi, I'm Deiry. I'm 25 (almost 26) years old and I'm a Databricks expert... or at least that's my goal. I work at Celerik...

Hi, I'm Deiry. I'm 25 (almost 26) years old and I'm a Databricks expert... or at least that's my goal. I work at Celerik. My goal is to be a certified Machine Learning professional, so here we go!

Latest Reply
NhatHoang
Valued Contributor II

Very confident, go ahead. :D
Mado
by Valued Contributor II
  • 3316 Views
  • 3 replies
  • 1 kudos

Resolved! When should I use STREAM() when defining a DLT table?

Hi, I am a little confused about when I should use STREAM() when defining a table based on a DLT table. There is a pattern explained in the documentation:

CREATE OR REFRESH STREAMING LIVE TABLE streaming_bronze
AS SELECT * FROM cloud_files(
  "s3://p...

Latest Reply
Mado
Valued Contributor II

Thanks @Landan George. Since "streaming_silver" is a streaming live table, I expected the last line of the code to be: AS SELECT count(*) FROM STREAM(LIVE.streaming_silver) GROUP BY user_id. But, as you can see, "live_gold" is defined by: AS SELECT c...
2 More Replies
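The distinction the thread lands on can be sketched as follows (hypothetical table names, following the docs pattern the question quotes): STREAM() is used when reading another streaming live table incrementally, row by row; the gold-level aggregate instead performs a complete recomputation over the silver table, so it reads it without STREAM().

```python
# Silver reads bronze incrementally: each update processes only new rows,
# hence STREAM(LIVE.streaming_bronze).
silver_sql = """
CREATE OR REFRESH STREAMING LIVE TABLE streaming_silver
AS SELECT * FROM STREAM(LIVE.streaming_bronze) WHERE amount > 0
"""

# Gold is a complete aggregation: each update recomputes counts over the
# whole silver table, so it is a (materialized) live table without STREAM().
gold_sql = """
CREATE OR REFRESH LIVE TABLE live_gold
AS SELECT count(*) AS user_count FROM LIVE.streaming_silver GROUP BY user_id
"""

print("silver reads incrementally:", "STREAM(" in silver_sql)
print("gold reads completely:", "STREAM(" not in gold_sql)
```

In short: STREAM() marks an incremental read of a streaming source; leaving it off asks for a full read, which is why the documented `live_gold` omits it.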