Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

avinash_goje
by New Contributor II
  • 3586 Views
  • 2 replies
  • 2 kudos

How to send metrics from GCP Databricks to Grafana Cloud through Prometheus?

While connecting Databricks and Grafana, I have gone through the following approach: install the Grafana Agent on Databricks clusters from the Databricks console --> not working, since the system is not booted with systemd as the init system. Since Spark 3 has Pro...

Latest Reply
Hubert-Dudek
Databricks MVP
  • 2 kudos

There is a repo with a Prometheus gateway: https://gist.github.com/Lowess/3a71792d2d09e38bf8f524644bbf8349. In the community, we usually use DataDog, as the two play nicely together: https://docs.datadoghq.com/integrations/databricks/?tabs=driveronly

1 More Replies
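Since the Grafana Agent cannot run under the cluster's init system, metrics are often pushed out instead. The reply above points at a Pushgateway-based gist; the sketch below shows the same idea with only the standard library, assuming a hypothetical Pushgateway URL and metric names. The `prometheus_client` package offers the same thing via `push_to_gateway()` if you can install it on the cluster.

```python
# Minimal sketch: push a job metric to a Prometheus Pushgateway using only
# the standard library. The gateway URL, job name, and metric names are
# placeholders -- substitute your own.
import urllib.request

def format_metric(name, value, labels=None):
    """Render one metric line in the Prometheus text exposition format."""
    label_str = ""
    if labels:
        pairs = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        label_str = "{" + pairs + "}"
    return f"{name}{label_str} {value}\n"

def push_metric(gateway, job, body):
    """POST the metric body to the Pushgateway's per-job endpoint."""
    req = urllib.request.Request(
        f"{gateway}/metrics/job/{job}",
        data=body.encode(),
        method="POST",
        headers={"Content-Type": "text/plain"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status

if __name__ == "__main__":
    body = format_metric("job_rows_processed", 12345, {"cluster": "etl-prod"})
    # push_metric("http://pushgateway.example.com:9091", "databricks_etl", body)
    print(body)
```

Grafana Cloud's Prometheus can then scrape the Pushgateway, avoiding any agent on the cluster itself.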
MMMM
by New Contributor III
  • 1739 Views
  • 1 reply
  • 0 kudos

missing notebook from workshop

Hi, I was going through this session https://tinyurl.com/databrickshcare but on the slides there is a link to a notebook which is broken. Can you fix and share the link so I could try these notebooks? This is mentioned in the slides as the notebook link: https:...

thushar
by Databricks Partner
  • 6824 Views
  • 3 replies
  • 2 kudos

Can we use a variable to mention the path in the %run command

To compile the Python scripts in Azure notebooks, we are using the magic command %run. The first parameter for this command is the notebook path; is it possible to pass that path as a variable (we have to construct this path dynamically during the ...

Latest Reply
User16752242622
Databricks Employee
  • 2 kudos

@Thushar R​ I don't think it is possible to pass the notebook path in a variable and run it with %run. I believe you can make use of notebook workflows. Notebook workflows are a complement to %run: https://docs.databricks.com/notebooks/notebook-workfl...

2 More Replies
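The notebook-workflows alternative mentioned in the reply is `dbutils.notebook.run(path, timeout, args)`, which, unlike %run, accepts an ordinary Python string and so can take a dynamically built path. A minimal sketch, with the runner injected so it can be exercised outside Databricks (on a cluster you would pass `dbutils.notebook.run` itself); the base directory and notebook names are hypothetical:

```python
# Sketch: run a notebook whose path is constructed at runtime. %run cannot
# take a variable, but dbutils.notebook.run(path, timeout, args) can.
def run_dynamic_notebook(run_fn, base_dir, name, timeout_seconds=600, args=None):
    """Build the notebook path dynamically and invoke the injected runner."""
    path = f"{base_dir.rstrip('/')}/{name}"
    return run_fn(path, timeout_seconds, args or {})

# On Databricks (not runnable outside a notebook):
# result = run_dynamic_notebook(dbutils.notebook.run, "/Shared/etl",
#                               "load_orders", args={"run_date": "2022-06-01"})
```

Note that a notebook invoked this way runs in its own session, so unlike %run it does not share variables with the caller; values come back through its return value.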
Saurav
by New Contributor III
  • 6937 Views
  • 4 replies
  • 7 kudos

spark cluster monitoring and visibility

Hey. I'm working on a project where I'd like to be able to view and play around with the spark cluster metrics. I'd like to know what the utilization % and max values are for metrics like CPU, memory and network. I've tried using some open source sol...

Latest Reply
Saurav
New Contributor III
  • 7 kudos

Hey @Kaniz Fatma​, I appreciate the suggestions and will be looking into them. I haven't gotten to them yet, so I didn't want to say whether they worked for me or not. Since I'm looking to avoid solutions like DataDog, I'll be checking out the Prometh...

3 More Replies
irfanaziz
by Contributor II
  • 2854 Views
  • 2 replies
  • 3 kudos

How to make a string column with numeric and alphabet values use as partition?

So I have two partition columns defined for this delta table. One is year ('GJHAR'), containing year values, and the other is a string column ('BUKS') with around 124 unique values. However, there is one problem with the 2nd partition column ('BUKS'): the values ...

Latest Reply
-werners-
Esteemed Contributor III
  • 3 kudos

@nafri A​, so to make sure I understand correctly: if you partition the table with only numeric data in BUKS, new incoming data cannot be added if it contains a string, but the other way around it does work? Could it be that Spark has inferred the co...

1 More Replies
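The inference the reply suspects is real: when reading partitioned paths, Spark infers each partition value's type from the directory names, so a BUKS whose existing values are all numeric can come back as an integer column and then reject string values. The toy function below mimics that behavior (it is not Spark's actual implementation); the real remedy is the configuration `spark.sql.sources.partitionColumnTypeInference.enabled = false`, which keeps partition columns as strings.

```python
# Toy sketch of Spark's partition-value type inference: if every value in
# the partition directories parses as an integer, the column is inferred as
# an integer type; one non-numeric value keeps it a string.
def infer_partition_type(values):
    """Infer a common type for partition values, with string as fallback."""
    def one(v):
        try:
            int(v)
            return "int"
        except ValueError:
            return "string"
    return "int" if {one(v) for v in values} == {"int"} else "string"

# All-numeric BUKS directories -> inferred "int", so a later "A123" clashes:
print(infer_partition_type(["1001", "1002"]))   # int
print(infer_partition_type(["1001", "A123"]))   # string
```

With inference disabled (or the column explicitly cast to string before writing), mixed numeric and alphabetic BUKS values partition without conflict.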
marta_cdc
by New Contributor
  • 3812 Views
  • 2 replies
  • 0 kudos

Automate in code the launching of a sql script

Do you know how to automate in code the launching of a SQL script? Currently I do it by selection.

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

@Marta Vicente Sánchez​, what tool are you using here? And are we talking about Databricks SQL?

1 More Replies
Ramya
by New Contributor III
  • 22815 Views
  • 4 replies
  • 3 kudos

Resolved! Databricks Rest API

Hi, I am having an issue accessing the Databricks API 2.0/workspace/mkdirs through Python. I am using the below Azure method to generate the access token. I am not sure why I am getting a 404; any suggestions? token_credential = DefaultAzureCredential() sc...

Latest Reply
Ramya
New Contributor III
  • 3 kudos

Yes, that is correct! It worked. Thanks.

3 More Replies
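The thread is resolved but the fix isn't quoted, and a 404 on this endpoint usually means a mistyped URL path or workspace host rather than a bad token. A sketch of the call, with the request built in one place so it is easy to inspect: the workspace host is a placeholder, and `2ff814a6-3304-4ab8-85cb-cd0e6f879c1d` is the Azure Databricks application ID commonly used as the token scope (verify it against the Azure documentation for your environment).

```python
# Sketch: POST to the Databricks 2.0/workspace/mkdirs endpoint with a
# bearer token. Host and directory path below are hypothetical examples.
import json
import urllib.request

def build_mkdirs_request(host, workspace_path, token):
    """Return a ready-to-send request for the workspace mkdirs endpoint."""
    url = f"https://{host}/api/2.0/workspace/mkdirs"
    body = json.dumps({"path": workspace_path}).encode()
    headers = {"Authorization": f"Bearer {token}",
               "Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")

if __name__ == "__main__":
    # Token acquisition (requires the azure-identity package):
    # from azure.identity import DefaultAzureCredential
    # token = DefaultAzureCredential().get_token(
    #     "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d/.default").token
    req = build_mkdirs_request("adb-1234567890123456.7.azuredatabricks.net",
                               "/Shared/new_dir", "TOKEN")
    print(req.full_url)
    # urllib.request.urlopen(req)  # sends the request
```

Printing the assembled URL before sending makes the usual 404 culprits (wrong host, a stray segment in the path) easy to spot.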
Dineshkumar_Raj
by New Contributor
  • 3985 Views
  • 2 replies
  • 1 kudos

why the job running time and command execution time not matching in databricks

I have an Azure Databricks job that is triggered via ADF using an API call. I want to see why the job has been taking n minutes to complete its tasks. In the job execution results, the job execution time says 15 mins, but the individual cells/commands d...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hey there @DineshKumar​ Does @Prabakar Ammeappin​'s response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? Else please let us know if you need more help. Cheers!

1 More Replies
abaschkim
by New Contributor II
  • 4361 Views
  • 4 replies
  • 0 kudos

Delta Lake table: large volume due to versioning

I have set up a Spark standalone cluster and use Spark Structured Streaming to write data from Kafka to multiple Delta Lake tables - simply stored in the file system. So there are multiple writes per second. After running the pipeline for a while, I ...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hey there @Kim Abasch​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you....

3 More Replies
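No resolution is quoted in the thread, but with many small streaming writes per second the usual levers for reclaiming space are the VACUUM command, which deletes data files no longer referenced by the Delta log, and the two table retention properties that bound how much history is kept. A minimal sketch assuming a hypothetical table name; the helpers just build the SQL strings you would pass to `spark.sql(...)` on the cluster:

```python
# Sketch: statements for trimming a heavily versioned Delta table.
def vacuum_statement(table, retain_hours=168):
    """VACUUM removes unreferenced data files older than the retention window."""
    return f"VACUUM {table} RETAIN {retain_hours} HOURS"

def retention_properties_statement(table, log_days=30, deleted_file_days=7):
    """Bound how long transaction-log entries and deleted files are kept."""
    return (
        f"ALTER TABLE {table} SET TBLPROPERTIES ("
        f"'delta.logRetentionDuration' = 'interval {log_days} days', "
        f"'delta.deletedFileRetentionDuration' = 'interval {deleted_file_days} days')"
    )

# On Databricks:
# spark.sql(retention_properties_statement("events"))
# spark.sql(vacuum_statement("events"))
print(vacuum_statement("events"))
```

Lowering the retention window below the default 7 days trades away time travel over that period, so pick values that match how far back you actually need to restore.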
KumarShiv
by New Contributor III
  • 7872 Views
  • 5 replies
  • 11 kudos

Resolved! Databricks Issue:- assertion failed: Invalid shuffle partition specs:

I have a complex script which consumes more than 100 GB of data, does some aggregations on it, and at the end simply tries to write/display the data from a DataFrame. Then I get the issue (assertion failed: Invalid shuffle partition specs:). Please hel...

Latest Reply
Hubert-Dudek
Databricks MVP
  • 11 kudos

Please use display(df_FinalAction). Spark is lazily evaluated but "display" is not, so you can debug by displaying each dataframe at the end of each cell.

4 More Replies
Constantine
by Contributor III
  • 3179 Views
  • 2 replies
  • 3 kudos

Error when writing dataframe to s3 location using PySpark

I get an error when writing a dataframe to an s3 location: Found invalid character(s) among " ,;{}()\n\t=" in the column names of your... I have gone through all the columns and none of them have any special characters. Any idea how to fix this?

Latest Reply
Emilie
New Contributor II
  • 3 kudos

I got this error when I was running a query given to me, and the author didn't have aliases on aggregates. Something like:
sum(dollars_spent)
needed an alias:
sum(dollars_spent) as sum_dollars_spent

1 More Replies
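The reply explains the trap: an unaliased aggregate produces a column literally named `sum(dollars_spent)`, and the parentheses are among the characters the Parquet/Delta writer rejects. Beyond adding aliases, a defensive sketch that renames offending columns before writing; the replacement scheme (underscores, collapsed runs) is an assumption, adjust to taste:

```python
# Sketch: strip the characters " ,;{}()\n\t=" that the writer rejects
# from column names, collapsing runs of replacements into one mark.
INVALID_CHARS = ' ,;{}()\n\t='

def sanitize_column(name, replacement="_"):
    """Replace each invalid character and tidy up the result."""
    cleaned = "".join(replacement if ch in INVALID_CHARS else ch for ch in name)
    while replacement * 2 in cleaned:
        cleaned = cleaned.replace(replacement * 2, replacement)
    return cleaned.strip(replacement)

print(sanitize_column("sum(dollars_spent)"))  # sum_dollars_spent

# With PySpark (on a cluster):
# df = df.toDF(*[sanitize_column(c) for c in df.columns])
```

This is handy when the query text isn't yours to edit; otherwise an explicit `as` alias on each aggregate is the cleaner fix.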
Reza
by New Contributor III
  • 3002 Views
  • 2 replies
  • 1 kudos

Can we order the widgets in Databricks?

I am trying to order the way that widgets are shown in Databricks, but I cannot. For example, I have two text widgets (start date and end date). Databricks shows "end_date" before "start_date" on top, as the default order is alphabetical. Obviously, ...

Latest Reply
Prabakar
Databricks Employee
  • 1 kudos

Hi @Reza Rajabi​, this is a known issue and we have a feature request to fix it. I do not have an ETA on when this feature will be available. So for now, to avoid the widgets appearing in alphabetical order, you need to use a prefix like 1, 2, 3... or A, B...

1 More Replies
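The prefix workaround from the reply can be automated: since Databricks renders widgets alphabetically by name, numbering the names forces declaration order. A small sketch with example widget names; the helper only builds the prefixed names, and the commented-out `dbutils.widgets.text` calls are what you would run in a notebook:

```python
# Sketch: number widget names so alphabetical display order matches the
# order you declared them in.
def prefixed_widget_names(names):
    """Prefix each name with a zero-padded index (01_, 02_, ...)."""
    return [f"{i + 1:02d}_{name}" for i, name in enumerate(names)]

ordered = prefixed_widget_names(["start_date", "end_date"])
print(ordered)  # ['01_start_date', '02_end_date']

# In a Databricks notebook:
# for name in ordered:
#     dbutils.widgets.text(name, "")
```

Zero-padding matters: with plain `1_`, `10_` would sort before `2_` once you pass nine widgets.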
blakedwb
by New Contributor III
  • 7120 Views
  • 2 replies
  • 1 kudos

Resolved! How to Incorporate Historical Data in Delta Live Pipeline?

Now that delta live pipeline is GA we are looking to convert our existing processes to leverage it. One thing that remains unclear is how to populate new delta live tables with historical data? Currently we are looking to use CDC by leveraging create...

Latest Reply
blakedwb
New Contributor III
  • 1 kudos

@Kaniz Fatma​ Hello, sorry for the delayed response. The guide does not answer how to incorporate existing delta tables that contain historical data into a delta live pipeline. We ended up changing the source data to pull from the existing bronze t...

1 More Replies