Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

jayallenmn
by New Contributor III
  • 3015 Views
  • 4 replies
  • 3 kudos

Resolved! Couple of Delta Lake questions

Hey guys, we're considering Delta Lake as the storage for our project and have a couple of questions. The first one is what's the pricing for Delta Lake - we can't seem to find a page that says x amount costs y. The second question is more technical - if we...

Latest Reply
-werners-
Esteemed Contributor III
  • 3 kudos

Delta Lake itself is free; it is a file format. But you will have to pay for storage and compute, of course. If you want to use Databricks with Delta Lake, it will not be free unless you use the Community Edition. Depending on what you are planning to...

3 More Replies
Daps022
by New Contributor
  • 2420 Views
  • 3 replies
  • 1 kudos
Latest Reply
Rheiman
Contributor II
  • 1 kudos

Try looking into the Structured Streaming API. There you will learn how to join streams and static data, how to set triggers for the streams, mini-batching, and other things that are important to the reliability of your application. Structured Str...

2 More Replies
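To make the Structured Streaming suggestion above concrete, here is a minimal sketch of a stream-static join with a processing-time trigger. The Kafka broker, topic, schema, and Delta paths are illustrative assumptions, not details from this thread.

```
# Minimal sketch: stream-static join with a processing-time trigger (assumed paths/topic)
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.getOrCreate()

# Static reference data (assumed Delta path)
customers = spark.read.format("delta").load("/mnt/dim/customers")

order_schema = StructType([
    StructField("order_id", StringType()),
    StructField("customer_id", StringType()),
    StructField("amount", DoubleType()),
])

# Streaming source (assumed Kafka broker and topic)
orders = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "orders")
    .load()
    .select(from_json(col("value").cast("string"), order_schema).alias("o"))
    .select("o.*")
)

# Stream-static join: each micro-batch of the stream is joined against the static table
enriched = orders.join(customers, "customer_id", "left")

# Micro-batching controlled by a processing-time trigger
query = (
    enriched.writeStream
    .format("delta")
    .option("checkpointLocation", "/mnt/checkpoints/orders_enriched")
    .trigger(processingTime="1 minute")
    .start("/mnt/gold/orders_enriched")
)
```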
Yagao
by New Contributor
  • 5161 Views
  • 5 replies
  • 2 kudos

How to do Python within a SQL query in Databricks?

Can anyone show me one use case of how to do Python within a SQL query?

Latest Reply
tomasz
Databricks Employee
  • 2 kudos

To run Python within a SQL query you have to first define a Python function and then register it as a UDF. Once that is done you are able to call that UDF within a SQL query. Please take a look at this documentation here: https://docs.databricks.com/s...

4 More Replies
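As a minimal sketch of the approach described in the reply above (define a Python function, register it as a UDF, call it from SQL): the function name clean_name and the sample input are illustrative assumptions.

```
# Sketch: define a Python function, register it as a UDF, call it from SQL
from pyspark.sql import SparkSession
from pyspark.sql.types import StringType

spark = SparkSession.builder.getOrCreate()

def clean_name(s):
    # Plain Python logic we want to be callable from SQL
    return s.strip().title() if s is not None else None

# Register the function under the SQL name 'clean_name'
spark.udf.register("clean_name", clean_name, StringType())

# It can now be called like any built-in SQL function
spark.sql("SELECT clean_name(' alice SMITH ') AS cleaned").show()
```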
Komal7
by New Contributor
  • 1169 Views
  • 2 replies
  • 0 kudos
Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Hi @Komal Gyanani, AQE was a major improvement added in Spark 3.0. It has been available since Databricks Runtime 7.3 LTS (Spark 3.0): https://docs.databricks.com/release-notes/runtime/releases.html, and here are the docs on AQE: https://docs.databricks.com/spark/late...

1 More Replies
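For reference, a short sketch of checking and enabling AQE from a notebook on DBR 7.3 LTS or later (on recent runtimes it is already on by default); this assumes the `spark` session that Databricks notebooks provide.

```
# Sketch: confirm or enable Adaptive Query Execution on Spark 3.x / DBR 7.3 LTS+
# (assumes the `spark` session predefined in Databricks notebooks)
spark.conf.set("spark.sql.adaptive.enabled", "true")

# Related standard Spark 3 settings, shown for illustration
spark.conf.set("spark.sql.adaptive.coalescePartitions.enabled", "true")
spark.conf.set("spark.sql.adaptive.skewJoin.enabled", "true")

print(spark.conf.get("spark.sql.adaptive.enabled"))
```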
Marcosan
by New Contributor II
  • 1607 Views
  • 2 replies
  • 3 kudos

What’s the best way to pass dependency versions dynamically to a cluster

I am using init scripts and would like to be able to control the version of a component that we release internally and frequently. We are now manually updating a dbfs requirement.txt file but I think that this problem may have been encountered befor...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 3 kudos

You can programmatically create cluster templates in JSON files and include config JSON files with the libraries needed. Cluster deployment in that scenario needs to be controlled via the API: https://docs.databricks.com/dev-tools/api/latest/clusters.html

1 More Replies
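A hedged sketch of that pattern: a JSON cluster template with the internal component version injected at deploy time, submitted through the Clusters API 2.0. The workspace URL, token, node type, runtime version, and init-script path are illustrative assumptions.

```
# Sketch: create a cluster from a JSON template via the Clusters API 2.0,
# passing the internally released component version to the init script.
import requests

DATABRICKS_HOST = "https://<your-workspace>.cloud.databricks.com"  # assumption
TOKEN = "<personal-access-token>"                                   # assumption

component_version = "1.4.2"  # e.g. read from your release pipeline

cluster_spec = {
    "cluster_name": "etl-cluster",
    "spark_version": "11.3.x-scala2.12",
    "node_type_id": "i3.xlarge",
    "num_workers": 2,
    # init script location is an assumption; adjust to your setup
    "init_scripts": [{"dbfs": {"destination": "dbfs:/init/install_deps.sh"}}],
    "spark_env_vars": {
        # the init script can read this to pin the internal component version
        "INTERNAL_COMPONENT_VERSION": component_version,
    },
}

resp = requests.post(
    f"{DATABRICKS_HOST}/api/2.0/clusters/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=cluster_spec,
)
resp.raise_for_status()
print(resp.json()["cluster_id"])
```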
avinash_goje
by New Contributor II
  • 2962 Views
  • 2 replies
  • 2 kudos

How to send metrics from GCP Databricks to Grafana Cloud through Prometheus?

While connecting Databricks and Grafana, I have gone through the following approach. Install the Grafana Agent in Databricks clusters from the Databricks console --> not working, since the system is not booted with systemd as the init system. Since Spark 3 has Pro...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 2 kudos

There is a repo with a Prometheus gateway: https://gist.github.com/Lowess/3a71792d2d09e38bf8f524644bbf8349. In the community, we usually use DataDog, as both play nicely together: https://docs.datadoghq.com/integrations/databricks/?tabs=driveronly

1 More Replies
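One hedged sketch (not necessarily what the linked gist does): pushing a custom driver-side metric to a Prometheus Pushgateway with the prometheus_client package, which Grafana Cloud can then pick up. The gateway address, job name, and metric are assumptions.

```
# Sketch: push a custom job metric from the driver to a Prometheus Pushgateway
# (requires: %pip install prometheus_client)
from prometheus_client import CollectorRegistry, Gauge, push_to_gateway

registry = CollectorRegistry()
rows_processed = Gauge(
    "job_rows_processed",
    "Rows processed by the nightly job",
    registry=registry,
)

# ... after the Spark job finishes ...
rows_processed.set(123456)

# Assumes a Pushgateway reachable from the cluster at this address
push_to_gateway("pushgateway.internal:9091", job="databricks_nightly_job", registry=registry)
```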
MMMM
by New Contributor III
  • 1347 Views
  • 1 reply
  • 0 kudos

missing notebook from workshop

Hi, I was going through this session https://tinyurl.com/databrickshcare but on the slides there is a link to a notebook which is broken. Can you guys fix and share the link so I could try these notebooks? This is mentioned in the slides as the notebook link: https:...

thushar
by Contributor
  • 5510 Views
  • 3 replies
  • 2 kudos

Can we use a variable to mention the path in the %run command

To compile the Python scripts in Azure notebooks, we are using the magic command %run. The first parameter for this command is the notebook path; is it possible to put that path in a variable (we have to construct this path dynamically during the ...

Latest Reply
User16752242622
Valued Contributor
  • 2 kudos

@Thushar R, I don't think it is possible to pass the notebook path in a variable and run it with %run. I believe you can make use of notebook workflows. Notebook workflows are a complement to %run: https://docs.databricks.com/notebooks/notebook-workfl...

2 More Replies
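A small sketch of the notebook-workflows alternative mentioned in the reply above: unlike %run, dbutils.notebook.run accepts a path built at runtime. The path, timeout, and parameters are illustrative assumptions; note the target runs in its own context rather than being imported into the caller's.

```
# Assumes a Databricks notebook, where dbutils is predefined
env = "dev"  # e.g. derived at runtime
notebook_path = f"/Repos/project/pipelines/{env}/load_customers"  # hypothetical path

# Runs the target notebook and returns whatever it passes to dbutils.notebook.exit()
result = dbutils.notebook.run(notebook_path, 600, {"run_date": "2022-06-01"})
print(result)
```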
Saurav
by New Contributor III
  • 5706 Views
  • 4 replies
  • 7 kudos

spark cluster monitoring and visibility

Hey. I'm working on a project where I'd like to be able to view and play around with the spark cluster metrics. I'd like to know what the utilization % and max values are for metrics like CPU, memory and network. I've tried using some open source sol...

Latest Reply
Saurav
New Contributor III
  • 7 kudos

Hey @Kaniz Fatma, I appreciate the suggestions and will be looking into them. I haven't gotten to them yet, so I didn't want to say whether they worked for me or not. Since I'm looking to avoid solutions like DataDog, I'll be checking out the Prometh...

3 More Replies
irfanaziz
by Contributor II
  • 2227 Views
  • 2 replies
  • 3 kudos

How to use a string column with numeric and alphabetic values as a partition?

So I have two partition columns defined for this Delta table. One, year ('GJHAR'), contains year values, and the other is a string column ('BUKS') with around 124 unique values. However, there is one problem with the second partition column ('BUKS'): the values ...

Latest Reply
-werners-
Esteemed Contributor III
  • 3 kudos

@nafri A, so to make sure I understand correctly: if you partition the table with only numeric data in BUKS, new incoming data cannot be added if it contains a string, but the other way around it does work? Could it be that Spark has inferred the co...

1 More Replies
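Following the hypothesis in the reply above (that Spark inferred BUKS as numeric), a minimal sketch that forces the column to a string type before writing the partitioned Delta table; the source and target paths are illustrative assumptions, while GJHAR and BUKS come from the thread.

```
# Sketch: cast the partition column to string explicitly so Spark cannot infer it
# as numeric from the first values it sees.
from pyspark.sql.functions import col

df = spark.read.parquet("/mnt/raw/source_data")  # assumed source

df_fixed = df.withColumn("BUKS", col("BUKS").cast("string"))

(
    df_fixed.write
    .format("delta")
    .mode("append")
    .partitionBy("GJHAR", "BUKS")
    .save("/mnt/delta/target_table")  # assumed target path
)
```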
marta_cdc
by New Contributor
  • 3222 Views
  • 2 replies
  • 0 kudos

Automate launching a SQL script from code

Do you know how to automate launching a SQL script from code? Currently I do it by selecting and running it manually.

Latest Reply
-werners-
Esteemed Contributor III
  • 0 kudos

@Marta Vicente Sánchez, what tool are you using here? And are we talking about Databricks SQL?

1 More Replies
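The thread has not yet established which tool is being used, but if the goal is to run a .sql file from a Databricks notebook or job rather than selecting statements by hand, one possible sketch is to read the file and execute each statement with spark.sql(). The file path and the naive ';' split are simplifying assumptions.

```
# Sketch: run a .sql script from a notebook/job (hypothetical DBFS path)
with open("/dbfs/scripts/nightly_load.sql") as f:
    sql_text = f.read()

# Naive split on ';' -- assumes no semicolons inside string literals or comments
for statement in (s.strip() for s in sql_text.split(";")):
    if statement:
        spark.sql(statement)
```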
