Data Engineering

Forum Posts

by VD10 • New Contributor

06-27-2023 5:58:23 PM

951 Views
1 replies
0 kudos

Data Engineering Professional Certificate

I'm working toward the certificate. Any preparation tips would be appreciated! Thanks!

Latest Reply

dplante
Contributor II

07-23-2023 9:26:18 PM

0 kudos

Disclaimer: I haven't taken this exam yet. A couple of suggestions (from this forum, Google searches, etc.): check out this blog post: https://medium.com/@sjrusso/passing-the-databricks-professional-data-engineer-exam-115cccc90aba#:~:text=I%20recent...

by KKo • Contributor III

03-28-2022 12:47:46 PM

9523 Views
4 replies
2 kudos

Resolved! Union Multiple dataframes in loop, with different schema

Within a loop I create a few dataframes. I can union them without an issue if they have the same schema, using df_unioned = reduce(DataFrame.unionAll, df_list). Now my problem is how to union them if one of the dataframes in df_list has different nu...

Latest Reply

anoopunni
New Contributor II

07-23-2023 8:47:55 PM

2 kudos

Hi, I have come across the same scenario. Using reduce() and unionByName, we can implement the solution as below:
val lstDF: List[DataFrame] = List(df1, df2, df3, df4, df5)
val combinedDF = lstDF.reduce((df1, df2) => df1.unionByName(df2, allowMissingColumns = tru...
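For readers working in PySpark, as the original question was, the same idea can be sketched in Python. This is a minimal sketch, not the reply author's code: the helper name `union_all_by_name` is hypothetical, and it assumes every item in the list is a PySpark DataFrame on Spark 3.1 or later, where `unionByName` accepts `allowMissingColumns`.

```python
from functools import reduce


def union_all_by_name(dataframes):
    """Union a list of DataFrames by column name.

    Assumption: each item exposes PySpark's DataFrame.unionByName.
    With allowMissingColumns=True (Spark 3.1+), columns missing from
    one side are filled with nulls instead of raising an error.
    """
    if not dataframes:
        raise ValueError("need at least one DataFrame to union")
    return reduce(
        lambda left, right: left.unionByName(right, allowMissingColumns=True),
        dataframes,
    )
```

With the list from the question, this would be called as `df_unioned = union_all_by_name(df_list)`. Columns are matched by name rather than by position, which is why the schemas do not need to line up.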

by VikashKumar • New Contributor

07-23-2023 1:26:31 PM

6605 Views
0 replies
0 kudos

Is there any way to convert delta share short-lived presigned URLs to CSV files at Client End

Hello all, I have a requirement where I need to make the data available at the client end, and they are supposed to access the data in CSV format. I am planning to use Delta Sharing integrated with Unity Catalog. As we know, according to Delta Sharing protoc...

by 180122 • New Contributor II

07-19-2023 1:02:53 PM

1557 Views
3 replies
1 kudos

Data Engineering Professional - Practice exam?

Hi, when will we get practice exams for the Data Engineering Professional Certification Exam? It seems like we already have them for a good number of the associate exams, and this Professional exam seems more difficult than the associate ones, so ...

Latest Reply

Anonymous
Not applicable

07-22-2023 9:55:20 PM

1 kudos

Hi @180122 Hope you are well. Just wanted to see if you were able to find an answer to your question and would you like to mark an answer as best? It would be really helpful for the other members too. Cheers!

by bradleyjamrozik • New Contributor III

07-20-2023 1:03:58 PM

2053 Views
3 replies
3 kudos

Resolved! Questions about Lineage and DLT

Hey there!
1. Does column lineage work across multiple catalogs and schemas?
2. Do Delta Live Tables support lineage? If yes, does that work across multiple pipelines or only with a single one?

Latest Reply

Anonymous
Not applicable

07-22-2023 9:39:25 PM

3 kudos

Hi @bradleyjamrozik We haven't heard from you since the last response from @Vinay_M_R and @erigaud, and I was checking back to see if their suggestions helped you. Otherwise, if you have any solution, please share it with the community, as it can be he...

by YS1 • New Contributor III

07-20-2023 3:24:22 PM

1312 Views
3 replies
1 kudos

Updating tables from SQL Server to Databricks

Hi, I have SQL Server tables that are the primary location where all live transactions happen. Currently I read them through PySpark as dataframes and overwrite them every day to have the latest copy of them in Databricks. The problem is it takes lon...

Latest Reply

Anonymous
Not applicable

07-22-2023 9:37:42 PM

1 kudos

Hi @YS1 Hope you are well. Just wanted to see if you were able to find an answer to your question and would you like to mark an answer as best? It would be really helpful for the other members too. Cheers!

by samuraidjakk • New Contributor II

07-21-2023 4:08:28 AM

1239 Views
2 replies
1 kudos

Resolved! Lineage from Unity Catalog on GCP

We are in the process of trying to do a PoC of our pipelines using DLT. Normally, we use another tool, and we have created a custom program to extract lineage. We want to try to get and display lineage using Unity Catalog. But we are on GCP, and it see...

Latest Reply

Anonymous
Not applicable

07-22-2023 9:12:38 PM

1 kudos

Hi @samuraidjakk Just wanted to check in if you were able to resolve your issue. If yes, would you be happy to mark an answer as best? If not, please tell us so we can help you. Thanks!

by Hein • New Contributor III

07-21-2023 7:22:05 AM

396 Views
0 replies
0 kudos

SAT - Security Analysis Tool - Workspaces not displaying on the dashboard

Hi, After following the instructions and video, I still don't see my other workspaces on the dashboard. I double-checked my permissions on the service principal and still have the same issue. Please can you assist? Thank you.

by SRK • Contributor III

12-21-2022 8:36:58 AM

4683 Views
6 replies
5 kudos

Resolved! How to deploy Databricks SQL queries and SQL Alerts from lower environment to higher environment?

We are using Databricks SQL Alerts to handle one scenario. We have written the queries for it, and we have also created the SQL Alert. However, I was looking for the best way to deploy it to higher environments like Pre-Production and Production. I ...

Latest Reply

valeryuaba
New Contributor III

07-21-2023 5:15:00 AM

5 kudos

Thanks!

by erigaud • Honored Contributor

07-19-2023 2:39:42 AM

6112 Views
7 replies
4 kudos

Resolved! Autoloader Excel Files

Hello, I looked at the documentation but could not find what I wanted. Is there a way to load Excel files using Auto Loader and, if yes, what options should be given to specify format, sheet name, etc.? Thank you, friends!

Latest Reply

Hemant
Valued Contributor II

07-19-2023 4:43:40 AM

4 kudos

Unfortunately, Databricks Auto Loader doesn't support Excel file types for incrementally loading new files. Link: https://docs.databricks.com/ingestion/auto-loader/options.html If your Excel file contains a single sheet, then there is a workaround.

by sumit23 • New Contributor

07-20-2023 10:55:52 PM

947 Views
0 replies
0 kudos

[Error] [SECRET_FUNCTION_INVALID_LOCATION]: While running secret function with create or replace

Hi, we recently upgraded our Databricks warehouse, transitioning from SQL Classic to SQL Pro. However, we started encountering the following error message when attempting to execute the "CREATE or REPLACE" table query with the secret functio...

by BasavarajAngadi • Contributor

03-13-2022 9:30:52 PM

2391 Views
4 replies
1 kudos

Resolved! Question on Transaction logs and versioning in data bricks ?

Hi experts, no doubt Databricks supports ACID properties. But when it comes to versioning, how many versions will Databricks capture? For example: if I do any DML operations on top of Delta tables, every time I do so it captures the tran...

Latest Reply

stefnhuy
New Contributor III

07-20-2023 4:15:14 AM

1 kudos

Hey, As a data enthusiast myself, I find this topic quite intriguing. Databricks indeed does a fantastic job in supporting ACID properties, ensuring data integrity, and allowing for versioning. To address BasavarajAngadi's question, Databricks effici...

by PrithwisMukerje • New Contributor II

06-05-2017 3:23:12 AM

78614 Views
5 replies
4 kudos

Resolved! How to download a file from dbfs to my local computer filesystem?

I have run the WordCount program and have saved the output into a directory as follows:
counts.saveAsTextFile("/users/data/hobbit-out1")
Subsequently, I check that the output directory contains the expected number of files:
%fs ls /users/data/hobbit-ou...

Latest Reply

Kaniz_Fatma
Community Manager

07-20-2023 12:03:30 AM

4 kudos

@PrithwisMukerje, To download a file from DBFS to your local computer filesystem, you can use the Databricks CLI command databricks fs cp. Here are the steps:
1. Open a terminal or command prompt on your local computer.
2. Run the follow...
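The `databricks fs cp` command named in the reply can also be driven from a script. The sketch below is an illustration, not the reply's elided steps: the helper names are hypothetical, and it assumes the Databricks CLI is installed, already authenticated, and supports the `--recursive` flag for copying a whole directory.

```python
import subprocess


def dbfs_cp_command(src, dst, recursive=False):
    """Build the argument list for `databricks fs cp`.

    Assumption: the Databricks CLI is on PATH and configured.
    """
    cmd = ["databricks", "fs", "cp"]
    if recursive:
        # Needed when src is a directory, such as a saveAsTextFile output.
        cmd.append("--recursive")
    cmd += [src, dst]
    return cmd


def download_from_dbfs(src, dst, recursive=False):
    """Run the copy and raise CalledProcessError if the CLI reports failure."""
    subprocess.run(dbfs_cp_command(src, dst, recursive), check=True)
```

For the output directory in the question, a call might look like `download_from_dbfs("dbfs:/users/data/hobbit-out1", "./hobbit-out1", recursive=True)`, where the local destination path is only an example.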

by hamzatazib96 • New Contributor III

08-18-2021 9:11:46 AM

52434 Views
28 replies
12 kudos

Resolved! Read file from dbfs with pd.read_csv() using databricks-connect

Hello all, As described in the title, here's my problem:
1. I'm using databricks-connect in order to send jobs to a Databricks cluster
2. The "local" environment is an AWS EC2
3. I want to read a CSV file that is in DBFS (Databricks) with pd.read_cs...

Latest Reply

so16
New Contributor II

07-19-2023 1:13:17 PM

12 kudos

Please, guys, I need your help. I still have the same issue after reading all your comments. I am using databricks-connect (version 13.1) in PyCharm and trying to load files that are on DBFS storage.
spark = DatabricksSession.builder.remote( host=host...

by dataengineer17 • New Contributor II

07-16-2023 9:01:08 AM

8913 Views
6 replies
3 kudos

Databricks execution failed with error state: InternalError, error message: failed to update run

I am receiving this error: Databricks execution failed with error state: InternalError, error message: failed to update run GlobalRunId(xx,RunId(yy)). This appears as an error message in Azure Data Factory when I use it to schedule a Databricks noteb...

Latest Reply

saipujari_spark
Valued Contributor

07-19-2023 12:52:45 PM

3 kudos

@dataengineer17 It could be coming from the internal jobs service. If the issue persists, I would recommend creating a support ticket.
