Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

avinash_goje
by New Contributor II
  • 3586 Views
  • 2 replies
  • 2 kudos

How to send metrics from GCP Databricks to Grafana Cloud through Prometheus?

While connecting Databricks and Grafana, I have gone through the following approach: install the Grafana Agent on Databricks clusters from the Databricks console --> not working, since the system is not booted with systemd as the init system. Since Spark 3 has Pro...

Latest Reply
Hubert-Dudek
Databricks MVP
  • 2 kudos

There is a repo with a Prometheus gateway: https://gist.github.com/Lowess/3a71792d2d09e38bf8f524644bbf8349. In the community, we usually use DataDog, as the two play nicely together: https://docs.datadoghq.com/integrations/databricks/?tabs=driveronly

1 More Replies
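Since the Grafana Agent cannot run under the cluster's init system, metrics are often pushed out instead. The reply above points at a Pushgateway-based gist; the sketch below shows the same idea with only the standard library, assuming a hypothetical Pushgateway URL and metric names. The `prometheus_client` package offers the same thing via `push_to_gateway()` if you can install it on the cluster.

```python
# Minimal sketch: push a job metric to a Prometheus Pushgateway using only
# the standard library. The gateway URL, job name, and metric names are
# placeholders -- substitute your own.
import urllib.request

def format_metric(name, value, labels=None):
    """Render one metric line in the Prometheus text exposition format."""
    label_str = ""
    if labels:
        pairs = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        label_str = "{" + pairs + "}"
    return f"{name}{label_str} {value}\n"

def push_metric(gateway, job, body):
    """POST the metric body to the Pushgateway's per-job endpoint."""
    req = urllib.request.Request(
        f"{gateway}/metrics/job/{job}",
        data=body.encode(),
        method="POST",
        headers={"Content-Type": "text/plain"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status

if __name__ == "__main__":
    body = format_metric("job_rows_processed", 12345, {"cluster": "etl-prod"})
    # push_metric("http://pushgateway.example.com:9091", "databricks_etl", body)
    print(body)
```

Grafana Cloud's Prometheus can then scrape the Pushgateway, avoiding any agent on the cluster itself.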
MMMM
by New Contributor III
  • 1739 Views
  • 1 reply
  • 0 kudos

missing notebook from workshop

Hi, I was going through this session https://tinyurl.com/databrickshcare but on the slides there is a link to a notebook which is broken. Can you fix and share the link so I could try these notebooks? This is mentioned in the slides as the notebook link: https:...

thushar
by Databricks Partner
  • 6824 Views
  • 3 replies
  • 2 kudos

Can we use a variable to mention the path in the %run command

To compile the Python scripts in Azure notebooks, we are using the magic command %run. The first parameter for this command is the notebook path; is it possible to pass that path as a variable (we have to construct this path dynamically during the ...

Latest Reply
User16752242622
Databricks Employee
  • 2 kudos

@Thushar R​ I don't think it is possible to pass the notebook path in a variable and run it with %run. I believe you can make use of notebook workflows. Notebook workflows are a complement to %run: https://docs.databricks.com/notebooks/notebook-workfl...

2 More Replies
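The notebook-workflows alternative mentioned in the reply is `dbutils.notebook.run(path, timeout, args)`, which, unlike %run, accepts an ordinary Python string and so can take a dynamically built path. A minimal sketch, with the runner injected so it can be exercised outside Databricks (on a cluster you would pass `dbutils.notebook.run` itself); the base directory and notebook names are hypothetical:

```python
# Sketch: run a notebook whose path is constructed at runtime. %run cannot
# take a variable, but dbutils.notebook.run(path, timeout, args) can.
def run_dynamic_notebook(run_fn, base_dir, name, timeout_seconds=600, args=None):
    """Build the notebook path dynamically and invoke the injected runner."""
    path = f"{base_dir.rstrip('/')}/{name}"
    return run_fn(path, timeout_seconds, args or {})

# On Databricks (not runnable outside a notebook):
# result = run_dynamic_notebook(dbutils.notebook.run, "/Shared/etl",
#                               "load_orders", args={"run_date": "2022-06-01"})
```

Note that a notebook invoked this way runs in its own session, so unlike %run it does not share variables with the caller; values come back through its return value.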
Saurav
by New Contributor III
  • 6937 Views
  • 4 replies
  • 7 kudos

spark cluster monitoring and visibility

Hey. I'm working on a project where I'd like to be able to view and play around with the spark cluster metrics. I'd like to know what the utilization % and max values are for metrics like CPU, memory and network. I've tried using some open source sol...

Latest Reply
Saurav
New Contributor III
  • 7 kudos

Hey @Kaniz Fatma​, I appreciate the suggestions and will be looking into them. I haven't gotten to them yet, so I didn't want to say whether they worked for me or not. Since I'm looking to avoid solutions like DataDog, I'll be checking out the Prometh...

3 More Replies
irfanaziz
by Contributor II
  • 2854 Views
  • 2 replies
  • 3 kudos

How to make a string column with numeric and alphabet values use as partition?

So I have two partition columns defined for this delta table. One is year ('GJHAR'), containing year values, and the other is a string column ('BUKS') with around 124 unique values. However, there is one problem with the 2nd partition column ('BUKS'): the values ...

Latest Reply
-werners-
Esteemed Contributor III
  • 3 kudos

@nafri A​, so to make sure I understand correctly: if you partition the table with only numeric data in BUKS, new incoming data cannot be added if it contains a string, but the other way around it does work? Could it be that Spark has inferred the co...

1 More Replies
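The inference the reply suspects is real: when reading partitioned paths, Spark infers each partition value's type from the directory names, so a BUKS whose existing values are all numeric can come back as an integer column and then reject string values. The toy function below mimics that behavior (it is not Spark's actual implementation); the real remedy is the configuration `spark.sql.sources.partitionColumnTypeInference.enabled = false`, which keeps partition columns as strings.

```python
# Toy sketch of Spark's partition-value type inference: if every value in
# the partition directories parses as an integer, the column is inferred as
# an integer type; one non-numeric value keeps it a string.
def infer_partition_type(values):
    """Infer a common type for partition values, with string as fallback."""
    def one(v):
        try:
            int(v)
            return "int"
        except ValueError:
            return "string"
    return "int" if {one(v) for v in values} == {"int"} else "string"

# All-numeric BUKS directories -> inferred "int", so a later "A123" clashes:
print(infer_partition_type(["1001", "1002"]))   # int
print(infer_partition_type(["1001", "A123"]))   # string
```

With inference disabled (or the column explicitly cast to string before writing), mixed numeric and alphabetic BUKS values partition without conflict.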
marta_cdc
by New Contributor
  • 3812 Views
  • 2 replies
  • 0 kudos

Automate in code the launching of a sql script

Do you know how to automate in code the launching of a SQL script? Currently I do it by selection.

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

@Marta Vicente Sánchez​, what tool are you using here? And are we talking about Databricks SQL?

1 More Replies
Ramya
by New Contributor III
  • 22815 Views
  • 4 replies
  • 3 kudos

Resolved! Databricks Rest API

Hi, I am having an issue accessing the Databricks API 2.0/workspace/mkdirs through Python. I am using the below Azure method to generate the access token. I am not sure why I am getting a 404; any suggestions? token_credential = DefaultAzureCredential() sc...

Latest Reply
Ramya
New Contributor III
  • 3 kudos

Yes, that is correct! It worked. Thanks.

3 More Replies
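The thread is resolved but the fix isn't quoted, and a 404 on this endpoint usually means a mistyped URL path or workspace host rather than a bad token. A sketch of the call, with the request built in one place so it is easy to inspect: the workspace host is a placeholder, and `2ff814a6-3304-4ab8-85cb-cd0e6f879c1d` is the Azure Databricks application ID commonly used as the token scope (verify it against the Azure documentation for your environment).

```python
# Sketch: POST to the Databricks 2.0/workspace/mkdirs endpoint with a
# bearer token. Host and directory path below are hypothetical examples.
import json
import urllib.request

def build_mkdirs_request(host, workspace_path, token):
    """Return a ready-to-send request for the workspace mkdirs endpoint."""
    url = f"https://{host}/api/2.0/workspace/mkdirs"
    body = json.dumps({"path": workspace_path}).encode()
    headers = {"Authorization": f"Bearer {token}",
               "Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")

if __name__ == "__main__":
    # Token acquisition (requires the azure-identity package):
    # from azure.identity import DefaultAzureCredential
    # token = DefaultAzureCredential().get_token(
    #     "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d/.default").token
    req = build_mkdirs_request("adb-1234567890123456.7.azuredatabricks.net",
                               "/Shared/new_dir", "TOKEN")
    print(req.full_url)
    # urllib.request.urlopen(req)  # sends the request
```

Printing the assembled URL before sending makes the usual 404 culprits (wrong host, a stray segment in the path) easy to spot.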
Dineshkumar_Raj
by New Contributor
  • 3985 Views
  • 2 replies
  • 1 kudos

why the job running time and command execution time not matching in databricks

I have an Azure Databricks job that is triggered via ADF using an API call. I want to see why the job has been taking n minutes to complete its tasks. In the job execution results, the job execution time says 15 mins, but the individual cells/commands d...

Latest Reply
Anonymous
Not applicable
  • 1 kudos

Hey there @DineshKumar​ Does @Prabakar Ammeappin​'s response answer your question? If yes, would you be happy to mark it as best so that other members can find the solution more quickly? Else please let us know if you need more help. Cheers!

1 More Replies
abaschkim
by New Contributor II
  • 4361 Views
  • 4 replies
  • 0 kudos

Delta Lake table: large volume due to versioning

I have set up a Spark standalone cluster and use Spark Structured Streaming to write data from Kafka to multiple Delta Lake tables - simply stored in the file system. So there are multiple writes per second. After running the pipeline for a while, I ...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hey there @Kim Abasch​ Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you....

3 More Replies
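No resolution is quoted in the thread, but with many small streaming writes per second the usual levers for reclaiming space are the VACUUM command, which deletes data files no longer referenced by the Delta log, and the two table retention properties that bound how much history is kept. A minimal sketch assuming a hypothetical table name; the helpers just build the SQL strings you would pass to `spark.sql(...)` on the cluster:

```python
# Sketch: statements for trimming a heavily versioned Delta table.
def vacuum_statement(table, retain_hours=168):
    """VACUUM removes unreferenced data files older than the retention window."""
    return f"VACUUM {table} RETAIN {retain_hours} HOURS"

def retention_properties_statement(table, log_days=30, deleted_file_days=7):
    """Bound how long transaction-log entries and deleted files are kept."""
    return (
        f"ALTER TABLE {table} SET TBLPROPERTIES ("
        f"'delta.logRetentionDuration' = 'interval {log_days} days', "
        f"'delta.deletedFileRetentionDuration' = 'interval {deleted_file_days} days')"
    )

# On Databricks:
# spark.sql(retention_properties_statement("events"))
# spark.sql(vacuum_statement("events"))
print(vacuum_statement("events"))
```

Lowering the retention window below the default 7 days trades away time travel over that period, so pick values that match how far back you actually need to restore.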
KumarShiv
by New Contributor III
  • 7872 Views
  • 5 replies
  • 11 kudos

Resolved! Databricks Issue:- assertion failed: Invalid shuffle partition specs:

I have a complex script which consumes more than 100 GB of data, does some aggregations on it, and at the end simply tries to write/display the data from a DataFrame. Then I get the issue (assertion failed: Invalid shuffle partition specs:). Please hel...

Latest Reply
Hubert-Dudek
Databricks MVP
  • 11 kudos

Please use display(df_FinalAction). Spark is lazily evaluated but "display" is not, so you can debug by displaying each dataframe at the end of each cell.

4 More Replies
Constantine
by Contributor III
  • 3179 Views
  • 2 replies
  • 3 kudos

Error when writing dataframe to s3 location using PySpark

I get an error when writing a dataframe to an s3 location: Found invalid character(s) among " ,;{}()\n\t=" in the column names of your... I have gone through all the columns and none of them have any special characters. Any idea how to fix this?

Latest Reply
Emilie
New Contributor II
  • 3 kudos

I got this error when I was running a query given to me, and the author didn't have aliases on aggregates. Something like:
sum(dollars_spent)
needed an alias:
sum(dollars_spent) as sum_dollars_spent

1 More Replies
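The reply explains the trap: an unaliased aggregate produces a column literally named `sum(dollars_spent)`, and the parentheses are among the characters the Parquet/Delta writer rejects. Beyond adding aliases, a defensive sketch that renames offending columns before writing; the replacement scheme (underscores, collapsed runs) is an assumption, adjust to taste:

```python
# Sketch: strip the characters " ,;{}()\n\t=" that the writer rejects
# from column names, collapsing runs of replacements into one mark.
INVALID_CHARS = ' ,;{}()\n\t='

def sanitize_column(name, replacement="_"):
    """Replace each invalid character and tidy up the result."""
    cleaned = "".join(replacement if ch in INVALID_CHARS else ch for ch in name)
    while replacement * 2 in cleaned:
        cleaned = cleaned.replace(replacement * 2, replacement)
    return cleaned.strip(replacement)

print(sanitize_column("sum(dollars_spent)"))  # sum_dollars_spent

# With PySpark (on a cluster):
# df = df.toDF(*[sanitize_column(c) for c in df.columns])
```

This is handy when the query text isn't yours to edit; otherwise an explicit `as` alias on each aggregate is the cleaner fix.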
Reza
by New Contributor III
  • 3002 Views
  • 2 replies
  • 1 kudos

Can we order the widgets in Databricks?

I am trying to order the way that widgets are shown in Databricks, but I cannot. For example, I have two text widgets (start date and end date). Databricks shows "end_date" before "start_date" on top, as the default order is alphabetical. Obviously, ...

Latest Reply
Prabakar
Databricks Employee
  • 1 kudos

Hi @Reza Rajabi​, this is a known issue and we have a feature request to fix it. I do not have an ETA on when this feature will be available. So for now, to avoid the widgets appearing in alphabetical order, you need to use a prefix like 1, 2, 3... or A, B...

1 More Replies
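The prefix workaround from the reply can be automated: since Databricks renders widgets alphabetically by name, numbering the names forces declaration order. A small sketch with example widget names; the helper only builds the prefixed names, and the commented-out `dbutils.widgets.text` calls are what you would run in a notebook:

```python
# Sketch: number widget names so alphabetical display order matches the
# order you declared them in.
def prefixed_widget_names(names):
    """Prefix each name with a zero-padded index (01_, 02_, ...)."""
    return [f"{i + 1:02d}_{name}" for i, name in enumerate(names)]

ordered = prefixed_widget_names(["start_date", "end_date"])
print(ordered)  # ['01_start_date', '02_end_date']

# In a Databricks notebook:
# for name in ordered:
#     dbutils.widgets.text(name, "")
```

Zero-padding matters: with plain `1_`, `10_` would sort before `2_` once you pass nine widgets.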
blakedwb
by New Contributor III
  • 7120 Views
  • 2 replies
  • 1 kudos

Resolved! How to Incorporate Historical Data in Delta Live Pipeline?

Now that delta live pipeline is GA we are looking to convert our existing processes to leverage it. One thing that remains unclear is how to populate new delta live tables with historical data? Currently we are looking to use CDC by leveraging create...

Latest Reply
blakedwb
New Contributor III
  • 1 kudos

@Kaniz Fatma​ Hello, sorry for the delayed response. The guide does not answer how to incorporate existing delta tables that contain historical data into a delta live pipeline. We ended up changing the source data to pull from the existing bronze t...

1 More Replies