Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

AmineHY
by Contributor
  • 3695 Views
  • 1 replies
  • 4 kudos

My DLT pipeline returns ACL Verification Failed

Python command: df = spark.read.format('csv').option('sep', ';').option("recursiveFileLookup", "true").load('dbfs:/***/data_files/PREVISIONS/')
Here is the content of the folder. Each folder contains the following files. Full log: org.apache.spark.sql.stre...

Latest Reply
AmineHY
Contributor

Yes, some of the files I don't have the right to access (mistakenly). In this case, how do you think I can tell DLT to handle this exception and ignore the file, since I can read some files but not all?
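One possibility (an assumption to verify against your runtime, not a confirmed fix) is to ask Spark to skip files it cannot read via its file-skipping settings, e.g. in the pipeline's Spark configuration:

```
spark.sql.files.ignoreCorruptFiles true
spark.sql.files.ignoreMissingFiles true
```

Note these settings are documented for corrupt or deleted files; whether they also swallow ACL/permission failures during listing is worth testing on a small folder first. Fixing the ACLs on the inaccessible subfolders is the more reliable path.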

Retko
by Contributor
  • 2193 Views
  • 1 replies
  • 2 kudos

How to jump back to latest positions in the Notebook

Hi, when developing I often need to jump around the notebook to fix and run things. It would be really helpful if I could jump back to the several latest positions (cells), similar to SHIFT+F5 in Office Word. Is there a way to do this now in Databricks? Than...

Latest Reply
karthik_p
Databricks Partner

@Retko Okter go to any notebook and click Help --> Keyboard shortcuts; it will show all the options you need.

db-avengers2rul
by Contributor II
  • 2597 Views
  • 2 replies
  • 3 kudos

course code - 'ACAD-INTRO-DELTALAKE' Notebook has errors

Dear DB Team, while following a course from DB Academy (course code 'ACAD-INTRO-DELTALAKE'), I noticed the notebooks have errors. Can you please check? I have also attached the notebook. Regards, Rakesh

Latest Reply
Anonymous
Not applicable

Hi @Rakesh Reddy Gopidi​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from yo...

1 More Replies
BradSheridan
by Databricks Partner
  • 4520 Views
  • 3 replies
  • 4 kudos

Resolved! dropDuplicates

Afternoon Community!! I've done some research today and found multiple great approaches to accomplish what I'm trying to do, but I'm having trouble understanding exactly which is best suited for my use case. Suppose you're running Auto Loader on S3 and u...

Latest Reply
AmanSehgal
Honored Contributor III

If your records are partitioned to narrow down your search, can you try writing upsert logic after the Auto Loader code? The upsert logic will insert, update, or delete rows per your conditions.
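A minimal sketch of what such upsert logic could look like in Delta SQL (table, key, and flag names here are hypothetical, not from the thread):

```sql
MERGE INTO target_table t
USING staged_updates s
  ON t.id = s.id
WHEN MATCHED AND s.op = 'delete' THEN DELETE   -- drop rows flagged for removal
WHEN MATCHED THEN UPDATE SET *                 -- refresh existing rows
WHEN NOT MATCHED THEN INSERT *                 -- add new rows
```

The partition/key columns used in the `ON` clause should match how the records are partitioned so the merge can prune files.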

2 More Replies
kkumar
by Databricks Partner
  • 24459 Views
  • 3 replies
  • 7 kudos

Resolved! Can we update a Parquet file?

I have copied a table into a Parquet file. Can I update a row or a column in the Parquet file without rewriting all the data? The data is huge. Using Databricks or ADF. Thank you.

Latest Reply
youssefmrini
Databricks Employee

With Parquet you can only append data; that's why you need to convert your Parquet table to Delta. It will be much easier.
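For reference, an existing Parquet directory can be converted to Delta in place, after which row-level updates become possible (the path and column names below are hypothetical; a partitioned table also needs a `PARTITIONED BY` clause on the convert):

```sql
-- one-time, in-place conversion of the Parquet files
CONVERT TO DELTA parquet.`/mnt/data/my_table`;

-- now a targeted update no longer requires rewriting everything by hand
UPDATE delta.`/mnt/data/my_table` SET status = 'closed' WHERE id = 42;
```

Delta still rewrites the affected data files under the hood, but it handles that transactionally and only for files containing matching rows.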

2 More Replies
Anonymous
by New Contributor III
  • 13441 Views
  • 5 replies
  • 5 kudos

Resolved! Override and Merge mode write using AutoLoader in Databricks

We are reading files using Auto Loader in Databricks. The source system gives a full snapshot of the complete data in files, so we want to read the data and write it to a Delta table in override mode so all old data is replaced by the new data. Similarly, for oth...

Latest Reply
-werners-
Esteemed Contributor III

@Ranjeet Jaiswal, AFAIK merge is supported: https://docs.databricks.com/_static/notebooks/merge-in-streaming.html This link does some aggregation, but that can be omitted, of course. The interesting part here is outputMode("update") and the foreachBat...

4 More Replies
Oliver_Floyd
by Contributor
  • 3898 Views
  • 4 replies
  • 6 kudos

Where to find documentation about : spark.databricks.driver.strace.enabled

Hello, for a support request, Microsoft support asked me to add spark.databricks.driver.strace.enabled true to my cluster configuration. MS was not able to send me a link to the documentation, and I did not find it on the Databricks website. Can someone he...

Latest Reply
Oliver_Floyd
Contributor

Yes, no problem. I have a Python program, called "post ingestion", that runs on a Databricks job cluster during the night and consists of: inserting data into a Delta Lake table, executing an OPTIMIZE command on that table, executing a VACUUM command on that t...

3 More Replies
Dusko
by Databricks Partner
  • 2942 Views
  • 2 replies
  • 3 kudos

Resolved! Not receiving password reset email

Hi, our admin created a new user in https://accounts.cloud.databricks.com/ with my email dusan.vystrcil@datasentics.com, but I didn't receive any confirmation email. When I try to sign in and click "reset password", I still don't receive any emai...

Latest Reply
Anonymous
Not applicable

Hi @karthik p​ Thank you for reaching out, and we’re sorry to hear about this log-in issue! We have this Community Edition login troubleshooting post on Community. Please take a look, and follow the troubleshooting steps. If the steps do not resolve ...

1 More Replies
siva_thiru
by Databricks Partner
  • 1656 Views
  • 0 replies
  • 6 kudos

Happy to share that #WAVICLE​  was able to do a hands-on workshop on #[Databricks notebook]​ #[Databricks SQL]​ #[Databricks cluster]​ Fundamentals wi...

Happy to share that #WAVICLE​  was able to do a hands-on workshop on #[Databricks notebook]​ #[Databricks SQL]​ #[Databricks cluster]​ Fundamentals with KCT College, Coimbatore, India.

Workshop Standee
Deiry
by Databricks Partner
  • 2058 Views
  • 1 replies
  • 3 kudos

Hi I'm Deiry! I'm 25 (almost 26) years old, I'm a Databricks expert... Or at least that's my goal. I work at Celerik....

Hi, I'm Deiry! I'm 25 (almost 26) years old, and I'm a Databricks expert... or at least that's my goal. I work at Celerik. My goal is to be a certified Machine Learning professional, so here we go!

Latest Reply
NhatHoang
Valued Contributor II

Very confident, go ahead. :D​

Mado
by Valued Contributor II
  • 3352 Views
  • 3 replies
  • 1 kudos

Resolved! When should I use STREAM() when defining a DLT table?

Hi, I am a little confused about when I should use STREAM() when we define a table based on a DLT table. There is a pattern explained in the documentation: CREATE OR REFRESH STREAMING LIVE TABLE streaming_bronze AS SELECT * FROM cloud_files("s3://p...

Latest Reply
Mado
Valued Contributor II

Thanks @Landan George. Since "streaming_silver" is a streaming live table, I expected the last line of the code to be: AS SELECT count(*) FROM STREAM(LIVE.streaming_silver) GROUP BY user_id. But, as you can see, "live_gold" is defined by: AS SELECT c...
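For context, the documented pattern (with hypothetical table names, not the exact ones from the thread) uses STREAM() when the reading table is itself a streaming live table, and a plain LIVE reference when the reading table is a complete (fully recomputed) live table:

```sql
-- streaming consumer: reads the upstream table incrementally
CREATE OR REFRESH STREAMING LIVE TABLE streaming_silver
AS SELECT * FROM STREAM(LIVE.streaming_bronze);

-- complete/materialized consumer: recomputes from a full read, no STREAM()
CREATE OR REFRESH LIVE TABLE live_gold
AS SELECT count(*) AS user_count FROM LIVE.streaming_silver GROUP BY user_id;
```

So the choice of STREAM() is driven by how the consuming table reads its input, not by whether the input table happens to be streaming.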

2 More Replies
kkumar
by Databricks Partner
  • 2282 Views
  • 2 replies
  • 2 kudos

ADLS Gen 2 Delta Tables memory allocation

If I mount one ADLS Gen2 account (ADLS1) to another Gen2 account (ADLS2) and create a Delta table on ADLS2, will it copy the data or just create something like an external table? I don't want to duplicate the data.

Latest Reply
Pat
Esteemed Contributor

Hi @keerthi kumar, so basically you can CREATE EXTERNAL TABLES on top of the data stored somewhere - in your case ADLS. Data won't be copied; it will stay where it is. By creating external tables you are actually storing the metadata in your metasto...
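A sketch of such an external (unmanaged) table over existing ADLS data, assuming a direct abfss path (all names below are hypothetical):

```sql
-- registers metadata only; the Delta files stay where they are in ADLS
CREATE TABLE my_db.my_external_table
USING DELTA
LOCATION 'abfss://mycontainer@mystorageaccount.dfs.core.windows.net/path/to/delta';
```

Dropping such a table removes only the metastore entry, not the underlying files.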

1 More Replies
Michael42
by New Contributor III
  • 2529 Views
  • 2 replies
  • 1 kudos

Would like to start a discussion regarding techniques for joining two relatively large tables of roughly equal size on a daily basis. I realize this may be a bit of a conundrum with Databricks, but please review the details.

Input data:
One batch load of a daily dataset, roughly 10 million transaction items a day.
Another daily batch load of roughly the same size.
Each row in one dataset should have a corresponding row in the other dataset.
Problem to solve: The problem i...
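One hedged sketch for matching two daily batches of similar size and surfacing rows missing a partner on either side (table and key names are hypothetical, since the post is truncated before the schema details):

```sql
-- full outer join on the matching key; keep only unmatched rows
SELECT a.txn_id AS a_id, b.txn_id AS b_id
FROM daily_batch_a a
FULL OUTER JOIN daily_batch_b b
  ON a.txn_id = b.txn_id
WHERE a.txn_id IS NULL OR b.txn_id IS NULL;
```

At ~10 million rows per side, Spark's sort-merge join handles this comfortably; partitioning both tables by load date keeps each day's join pruned to that day's files.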

Latest Reply
Lennart
New Contributor II

I've dealt with something similar in the past. There was an order system whose order items were supposed to be matched against corresponding products in another system that acted as a master and handled invoicing. As for unique considerations...

1 More Replies
Yaswanth
by New Contributor III
  • 20651 Views
  • 2 replies
  • 12 kudos

Resolved! How can the Delta table protocol version be downgraded from a higher version to a lower one: the table properties minReader from 2 to 1 and maxWriter from 5 to 3?

Is there a possibility to downgrade the Delta table protocol versions minReader from 2 to 1 and maxWriter from 5 to 3? I have set the TBLPROPERTIES to 2 and 5 and column mapping mode to rename the columns in the Delta table, but the other users are rea...

Latest Reply
youssefmrini
Databricks Employee

Unfortunately, you can't downgrade the version; it's an irreversible operation.
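Since the protocol can't be downgraded in place, one common workaround (an assumption to validate, not part of the thread; note that column-mapping renames will not carry over to the copy) is to rewrite the data into a fresh table created at the lower protocol:

```sql
-- new table pinned to the lower reader/writer versions
CREATE TABLE my_table_v1
TBLPROPERTIES (
  'delta.minReaderVersion' = '1',
  'delta.minWriterVersion' = '3'
)
AS SELECT * FROM my_table;
```

Readers on the older runtime can then be pointed at the new table, which can be renamed into place once verified.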

1 More Replies