Data Engineering

Forum Posts

by osoucy • New Contributor II

09-08-2022 10:10:56 AM

1769 Views
0 replies
1 kudos

Is it possible to join two aggregated streams of data?

Objective: Within the context of a Delta Live Table, I'm trying to join two stream aggregations, but I'm running into challenges. Is it possible to achieve such a join? Context: Assume - table trades stores a list of trades with their associated time stamps - tabl...

by Aran_Oribu • New Contributor II

09-08-2022 3:43:52 AM

7019 Views
5 replies
2 kudos

Resolved! Create and update a csv/json file in ADLSG2 with Eventhub in Databricks streaming

Hello, this is my first post here and I am a total beginner with Databricks and Spark. Working on an IoT cloud project with Azure, I'm looking to set up continuous stream processing of data. A current architecture already exists thanks to Stream Ana...

Latest Reply

-werners-
Esteemed Contributor III

09-08-2022 3:48:23 AM

2 kudos

So the event hub creates files (json/csv) on ADLS. You can read those files into Databricks with the spark.read.csv/json method. If you want to read many files in one go, you can use wildcards, e.g. spark.read.json("/mnt/datalake/bronze/directory/*/*...
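
A minimal sketch of the wildcard read described in this reply, assuming an existing SparkSession named `spark` (Databricks notebooks provide one). The helper names `wildcard_path` and `read_json_tree` and the example base directory are hypothetical and only illustrate the idea; the path in the reply itself is truncated and is not reproduced here.

```python
def wildcard_path(base_dir: str, depth: int) -> str:
    """Append `depth` single-level wildcards to base_dir, e.g. base/*/*."""
    if depth < 0:
        raise ValueError("depth must be non-negative")
    return base_dir.rstrip("/") + "/*" * depth


def read_json_tree(spark, base_dir: str, depth: int = 2):
    """Read every JSON file `depth` levels below base_dir into one DataFrame.

    `spark` is an existing SparkSession; spark.read.json accepts glob
    patterns in the path, so one call covers many files.
    """
    return spark.read.json(wildcard_path(base_dir, depth))


# Hypothetical usage inside a notebook:
# df = read_json_tree(spark, "/mnt/example/bronze", depth=2)
```

Building the pattern in a separate function keeps the path logic easy to check without a cluster.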

by jacob1 • Databricks Partner

08-09-2022 10:10:27 AM

1821 Views
1 replies
1 kudos

I passed my DE associate exam, but unable to see/download my certificate on credentials.databricks.com. Can someone help download the certificate - this is time sensitive

I passed my DE associate exam, but I am unable to see or download my certificate on credentials.databricks.com. I am using the same email as the one on Kryterion on webassessor.com/databricks. I can log in to Kryterion and see that I have passed the exam.

Latest Reply

Vidula
Databricks Partner

09-08-2022 3:33:29 AM

1 kudos

Hi @jacob stallone, thank you for reaching out! Let us look into this for you, and we will get back to you.

by PChan • New Contributor II

09-07-2022 10:29:16 PM

1647 Views
1 replies
0 kudos

www.googleapis.com

It happens after Databricks deleted my cluster: { "protoPayload": { "@type": "type.googleapis.com/google.cloud.audit.AuditLog", "status": {}, "serviceName": "container.googleapis.com", "methodName": "google.container.v1.ClusterMa...

Latest Reply

PChan
New Contributor II

09-07-2022 10:33:09 PM

0 kudos

attached the error log.

by Anonymous • Not applicable

09-07-2022 12:47:31 PM

2913 Views
1 replies
5 kudos

www.linkedin.com

September 2022 Featured Member Interview. Aman Sehgal - @AmanSehgal. Pronouns: He, Him. Company: CyberCX. Job Title: Senior Data Engineer. Could you give a brief description of your professional journey to date? A. I started my career as software develope...

Latest Reply

AmanSehgal
Honored Contributor III

09-07-2022 8:46:33 PM

5 kudos

Thank you @Lindsay Olson and @Christy Seto for interviewing me and nominating me as this month's featured member. It's a pleasure to be a member of the Databricks community and I'm looking forward to contributing more in the future. To all the community members...

by Vickyster • New Contributor II

09-07-2022 8:33:36 PM

1875 Views
0 replies
0 kudos

Column partitioning is not working in delta live table when `columnMapping` table property is enabled.

I'm trying to create a Delta Live Table on top of JSON files placed in Azure Blob storage. The JSON files contain white spaces in column names. Instead of renaming the columns, I tried the `columnMapping` table property, which let me create the table with spaces, but the column ...

by bblakey • New Contributor II

08-23-2022 2:19:03 PM

2909 Views
1 replies
1 kudos

Recommendations for loading table from two different folder paths using Autoloader and DLT

I have a new (bronze) table that I want to write to - the initial table load (refresh) csv file is placed in folder a, the incremental changes (inserts/updates/deletes) csv files are placed in folder b. I've written a notebook that can load one OR t...

by akdm • Contributor

09-02-2022 8:20:57 AM

4576 Views
3 replies
1 kudos

Resolved! FileNotFoundError when using sftp to write to disk within jobs

When I try to convert a notebook into a job I frequently run into an issue with writing to the local filesystem. For this particular example, I did all my notebook testing with a bytestream for small files. When I tried to run as a job, I used the me...

Latest Reply

akdm
Contributor

09-07-2022 9:10:42 AM

1 kudos

I was able to fix it. It was an issue with the nested files on the SFTP. I had to ensure that the parent folders were being created as well. Splitting out the local path and file made it easier to ensure that it existed with os.path.exists() and os.m...
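
A minimal sketch of the fix described in this reply: split the local path into folder and file, then make sure the parent folders exist before writing. The reply is cut off after `os.m...`, so the use of `os.makedirs` here is an assumption, and the helper names `ensure_parent_dir` and `write_local_file` are hypothetical. The SFTP download itself is left out; `data` stands in for whatever bytes were fetched.

```python
import os


def ensure_parent_dir(local_path: str) -> str:
    """Create the parent folders of local_path if they are missing."""
    parent = os.path.dirname(local_path)
    if parent and not os.path.exists(parent):
        # exist_ok avoids an error if another task creates the folder first.
        os.makedirs(parent, exist_ok=True)
    return parent


def write_local_file(local_path: str, data: bytes) -> str:
    """Write data to local_path, creating nested parent folders first."""
    ensure_parent_dir(local_path)
    with open(local_path, "wb") as fh:
        fh.write(data)
    return local_path
```

Without the folder check, `open()` raises FileNotFoundError when the target sits in a nested directory that does not exist yet, which matches the error named in the thread title.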

by Pritesh1 • New Contributor II

08-04-2022 12:19:08 PM

6138 Views
3 replies
0 kudos

Resolved! Ganglia UI not showing visuals

Hello, I am trying to use Metrics and the Ganglia UI to better monitor the state of my clusters, but the visuals are not coming up. I have tried opening it in Chrome and Microsoft Edge, and both show the same result. Is there something that I need to inst...

Latest Reply

Pritesh1
New Contributor II

09-07-2022 7:48:58 AM

0 kudos

I don't know exactly what the issue was, but it seems to be related to some kind of network security. Apparently, my IT team had set up a separate VM and made the changes for that specific VM to be able to use Ganglia from there. I end up RDP into ...

by Sparks • New Contributor III

08-08-2022 8:14:33 PM

5588 Views
4 replies
1 kudos

Resolved! Delta Live Table - How to pass OPTION "ignoreChanges" using SQL?

I am running a Delta Live Pipeline that explodes JSON docs into small Delta Live Tables. The docs can receive multiple updates over the lifecycle of the transaction. I am curating the data via medallion architecture, when I run an API /update with {"...

Latest Reply

Vidula
Databricks Partner

09-07-2022 5:58:52 AM

1 kudos

Hey there @Danny Aguirre, does @Prabakar Ammeappin's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!

by dslin • Databricks Partner

08-08-2022 7:48:05 PM

3847 Views
3 replies
2 kudos

How to deploy a python script with dependencies by dbx?

Hi, I'm quite new here. I'm trying to deploy a Python file with the dbx command. The file depends on libraries that need to be installed. How may I deploy the file (together with its dependencies) to Databricks? Here are the commands I currently run: `db...

Latest Reply

Vidula
Databricks Partner

09-07-2022 5:57:28 AM

2 kudos

Hi @Di Lin, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!

by bindan • New Contributor II

08-07-2022 9:46:38 PM

8080 Views
3 replies
3 kudos

Bootstrap Timeout during cluster start - Azure Data bricks

When I create a cluster on a new Azure Databricks deployment, it does not start and gives the message below: "Bootstrap Timeout" Please try again later, Instance bootstrap Timeout Failure message: Bootstrap script took too long and timeout. please try a...

Latest Reply

Vidula
Databricks Partner

09-07-2022 5:49:36 AM

3 kudos

Hi @Bin Ep, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks!

by Anonymous • Not applicable

08-05-2022 10:17:50 AM

16464 Views
4 replies
0 kudos

Filter data by Date using where condition (< TargetDate) giving "Query returned no results"

The code works well if data greater than the target date (>) is selected: SELECT xyz.ID, xyz.Gender, xyz.geography, xyz.code, xyz.delivery_status, abc.department_code FROM v.table1 as xyz left join y.table2 as abc on xyz.ID = abc.ID AND xyz.code = abc.cod...

Latest Reply

Vidula
Databricks Partner

09-07-2022 5:15:38 AM

0 kudos

Hi @Rishabh Shankar, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Th...

by gazzyjuruj • Contributor II

08-05-2022 6:41:49 AM

12251 Views
2 replies
1 kudos

Error 503 first byte timeout

Hi, I'm receiving an error while logging in or signing up on CE. Error 503 first byte timeout. first byte timeout. Error 54113. Details: cache-bom4725-BOM 1659706650 579764974. Varnish cache server. Screenshot attached below. Thanks, any help is appreciated from t...

Latest Reply

Vidula
Databricks Partner

09-07-2022 5:13:56 AM

1 kudos

Hi @Ghazanfar Uruj, does @Prabakar Ammeappin's answer help? If it does, would you be happy to mark it as best? If it doesn't, please tell us so we can help you. We'd love to hear from you. Thanks!

by Wally • New Contributor II

08-05-2022 1:17:12 AM

2263 Views
2 replies
2 kudos

Databricks Sql schedule button not saving

The schedule button isn't saving my schedule information for a databricks sql query. After I hit save and open the schedule again it has reverted to 'Never'. The query itself according to the past executions pane is not running according to the sched...

Latest Reply

Vidula
Databricks Partner

09-07-2022 5:12:21 AM

2 kudos

Hi @Wally Plourde, does @Rohit Rajendran's response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? We'd love to hear from you. Thanks!
