Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

129876
by New Contributor III
  • 3068 Views
  • 6 replies
  • 10 kudos

Schedule job runs with different parameters?

Is it possible to schedule different runs for a job with parameters? I have a notebook that generates data based on the supplied parameter, but I would like to schedule runs instead of manually starting them. I assume that this would be possible using the...

Latest Reply
Debayan
Esteemed Contributor III

You can pass parameters to your task. Each task type has different requirements for formatting and passing the parameters: https://docs.databricks.com/workflows/jobs/jobs.html#create-a-job. The REST API can also pass parameters for jobs. Tokens replace pa...
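For scheduling runs with different parameters, one option is to create several schedules for the same job, or to trigger runs externally through the Jobs REST API. A minimal sketch of the latter, assuming a placeholder workspace URL, token, and job ID:

    import requests

    HOST = "https://<your-workspace>.cloud.databricks.com"  # placeholder workspace URL
    TOKEN = "<personal-access-token>"                       # placeholder token

    # Trigger a run now, overriding the notebook's widget parameters
    resp = requests.post(
        f"{HOST}/api/2.1/jobs/run-now",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={
            "job_id": 123,  # placeholder job ID
            "notebook_params": {"run_date": "2023-01-01"},
        },
    )
    resp.raise_for_status()
    print(resp.json()["run_id"])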

5 More Replies
Sandy21
by New Contributor III
  • 1155 Views
  • 1 replies
  • 2 kudos

Resolved! Cluster Configuration Best Practices

I have a cluster with a configuration of 400 GB RAM and 160 cores. Which of the following would be the ideal configuration to use in case of one or more VM failures? Cluster A: Total RAM 400 GB, Total Cores 160, Total VMs: 1, 400 GB/Exec & 160 c...

Latest Reply
karthik_p
Esteemed Contributor

@Santhosh Raj can you please confirm whether the cluster sizes you mention relate to the driver and worker nodes, and how much you want to allocate to each? Once we are sure about the type of driver and worker we would like to pick, we need to enable au...

lzha174
by Contributor
  • 2791 Views
  • 3 replies
  • 3 kudos

Resolved! ipywidgets stopped displaying today

Everything was working yesterday, but today it stopped working as below. The example from the DB website does not work either, with the same error. The page source says ... This is affecting my work ~~~ a bit annoying. Are DB people going to look into this ...

Latest Reply
lzha174
Contributor

Today it's back to work! I got a pop-up window saying ... this should be the reason it was broken.

2 More Replies
ossinova
by Contributor II
  • 1100 Views
  • 2 replies
  • 3 kudos

Resolved! Shortcut for changing cell language (adding magic command)

Is there an option or shortcut (or is it in the pipeline) to quickly change / insert a cell in a specific language in Databricks? Triggering B + P would for instance add a new cell below with %python. Triggering M would for instance change that cell to ...

Latest Reply
Kaniz
Community Manager

Hi @Oscar Dyremyhr, It would mean a lot if you could select the "Best Answer" to help others find the correct answer faster. This makes that answer appear right after the question, so it's easier to find within a thread. It also helps us mark the q...

1 More Replies
Sajid1
by Contributor
  • 20473 Views
  • 4 replies
  • 5 kudos

Resolved! Parse syntax error, can anyone guide me what is going wrong here

Select case WHEN {{ Month }} = 0 then add_months(current_date(), -13) else WHEN {{ Month }} > month(add_months(current_date(), -1)) then add_months(to_date(concat(year(current_date())-1, '-', {{ Month }}, '-', 1)), -13) else add_months(to_date(conc...
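One likely cause of the parse error is the else placed before the second WHEN: CASE branches chain as WHEN ... THEN ..., with a single ELSE only at the end. A minimal sketch of that shape in a notebook, with a stand-in for the {{ Month }} parameter and a placeholder final branch (the original is truncated):

    # `spark` is the session predefined in Databricks notebooks
    month = 3  # stand-in for the {{ Month }} dashboard parameter
    spark.sql(f"""
        SELECT CASE
                 WHEN {month} = 0
                   THEN add_months(current_date(), -13)
                 WHEN {month} > month(add_months(current_date(), -1))
                   THEN add_months(to_date(concat(year(current_date()) - 1, '-', {month}, '-', 1)), -13)
                 ELSE current_date()  -- placeholder: the original ELSE branch is truncated
               END AS start_month
    """).show()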

Latest Reply
Kaniz
Community Manager

Hi @Sajid Thavalengal Rahiman, It would mean a lot if you could select the "Best Answer" to help others find the correct answer faster. This makes that answer appear right after the question, so it's easier to find within a thread. It also helps us ...

3 More Replies
Constantino
by New Contributor III
  • 1134 Views
  • 2 replies
  • 2 kudos

CMK for managed services automatic rotation

The docs for the CMK for workspace storage state: "After you add a customer-managed key for storage, you cannot later rotate the key by setting a different key ARN for the workspace." However, AWS provides automatic CMK master key rotation, which rotat...

Latest Reply
Debayan
Esteemed Contributor III

Hi @Constantino Schillebeeckx, You can update/rotate the CMK at a later time (on a running workspace). Please refer to: https://docs.databricks.com/security/keys/customer-managed-keys-managed-services-aws.html?_ga=2.214562071.1895504292.1667411694-6435253...

1 More Replies
BorislavBlagoev
by Valued Contributor III
  • 22256 Views
  • 36 replies
  • 15 kudos
Latest Reply
bhuvahh
New Contributor II

I think plain Python code will run with Databricks Connect (if it is a Python program you are writing), and Spark SQL can be done with spark.sql(...).
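A minimal sketch of that split, assuming a configured Databricks Connect (v13+) setup, where plain Python runs locally and spark.sql(...) executes on the cluster:

    from databricks.connect import DatabricksSession

    spark = DatabricksSession.builder.getOrCreate()

    total = sum(range(10))                            # plain Python, evaluated locally
    df = spark.sql("SELECT current_date() AS today")  # Spark SQL, executed on the cluster
    df.show()
    print(total)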

35 More Replies
jgsp
by New Contributor II
  • 1249 Views
  • 2 replies
  • 1 kudos

Can't import st_constructors module after installing Apache Sedona

Hi there, I've recently installed Apache Sedona on my cluster, according to the detailed instructions here. My Databricks Runtime version is 9.1 LTS (includes Apache Spark 3.1.2, Scala 2.12). The installation included the apache-sedona library from PyP...

Latest Reply
jgsp
New Contributor II

Thank you @Debayan Mukherjee for the prompt reply. I've followed the instructions carefully, but now every time I try to run a cell in my notebook I get a "Cancelled" message. It clearly didn't work. Any advice? Your help is much appreciated.

1 More Replies
HQJaTu
by New Contributor III
  • 1358 Views
  • 2 replies
  • 2 kudos

Custom container doesn't launch systemd

Quite soon after moving from VMs to containers, I started crafting my own images. That way notebooks have all the necessary libraries already there, with no need to do any pip-installing in the notebook. As requirements get more complex, now I'm at ...

Latest Reply
Debayan
Esteemed Contributor III

Hi @Jari Turkia, Please check if this helps: https://developers.redhat.com/blog/2019/04/24/how-to-run-systemd-in-a-container#other_cool_features_about_podman_and_systemd. Also, you can run Ubuntu/Red Hat Linux OS inside containers, which will have sys...

1 More Replies
Sandy21
by New Contributor III
  • 5514 Views
  • 2 replies
  • 5 kudos

Schema Evolution Issue in Streaming

When there is a schema change while reading and writing to a stream, will the schema changes be automatically handled by Spark, or do we need to include the option mergeSchema=True? E.g.: df.writeStream.option("mergeSchema", "true").format("delta").out...

Latest Reply
Hubert-Dudek
Esteemed Contributor III

mergeSchema doesn't support all operations. In some cases .option("overwriteSchema", "true") is needed. mergeSchema doesn't support:
  • Dropping a column
  • Changing an existing column's data type (in place)
  • Renaming column names that differ only by case (e.g...
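A minimal sketch of the overwriteSchema fallback for changes that mergeSchema can't express; the DataFrame df and the table name are placeholders:

    # Rewrite the table data and schema in one batch overwrite
    (df.write
       .format("delta")
       .mode("overwrite")
       .option("overwriteSchema", "true")
       .saveAsTable("my_table"))  # placeholder table name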

1 More Replies
ncouture
by Contributor
  • 967 Views
  • 1 replies
  • 0 kudos

Resolved! How do you run a query as the owner but use a parameter as a viewer

I have a query that is hitting a table I have access to. Granting access to everyone is not an option. I am using this query in a SQL dashboard. One of the WHERE clause conditions uses a parameter populated by another query. I want this parameter qu...

Latest Reply
ncouture
Contributor

It is not possible to do what I want. It somewhat seems like a security flaw, but whatever.

cmilligan
by Contributor II
  • 1922 Views
  • 3 replies
  • 2 kudos

Resolved! Orchestrate run of a folder

I need to run the contents of a folder, which can change over time. Is there a way to set up a notebook that can orchestrate running all notebooks in a folder? My thought was that if I could retrieve a list of the notebooks, I could create a loop to ru...

Latest Reply
Hubert-Dudek
Esteemed Contributor III

List all notebooks by making an API call, and then run them by using dbutils.notebook.run:

    import requests
    ctx = dbutils.notebook.entry_point.getDbutils().notebook().getContext()
    host_name = ctx.tags().get("browserHostName").get()
    host_token = ctx.apiToke...
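A hedged completion of that idea, listing a placeholder folder via the Workspace API and then running each notebook sequentially:

    import requests

    # Derive the workspace URL and API token from the notebook context
    ctx = dbutils.notebook.entry_point.getDbutils().notebook().getContext()
    host = "https://" + ctx.tags().get("browserHostName").get()
    token = ctx.apiToken().get()

    # List the folder's contents via the Workspace API
    resp = requests.get(
        f"{host}/api/2.0/workspace/list",
        headers={"Authorization": f"Bearer {token}"},
        params={"path": "/Shared/my_folder"},  # placeholder folder path
    )
    resp.raise_for_status()

    # Run every notebook found, one at a time
    for obj in resp.json().get("objects", []):
        if obj["object_type"] == "NOTEBOOK":
            dbutils.notebook.run(obj["path"], 3600)  # timeout in seconds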

2 More Replies
al_joe
by Contributor
  • 2890 Views
  • 6 replies
  • 5 kudos

Resolved! How do I clone a repo in Community Edition?

The e-learning videos on DBacademy say we should click on "Repos" and "Add Repo". I cannot find this in my Community Edition UI. I am a little frustrated that there are so many different versions of the UI, and many videos show UI options that we cannot ...

Latest Reply
Psybelo
New Contributor II

Hello, just import the .dbc file directly into your user workspace, as explained by Databricks here: https://www.databricks.training/step-by-step/importing-courseware-from-github/ It is the simplest way.

5 More Replies
Gopi0403
by New Contributor III
  • 2602 Views
  • 7 replies
  • 0 kudos

Issue creating a new workspace using Quickstart: Rollback failed error from AWS

I cannot create a new workspace in Databricks using Quickstart. When I am creating the workspace, I get the Rollback failed error from AWS even though I have given all the required information. Kindly he...

Latest Reply
Prabakar
Esteemed Contributor III

Hi @Gopichandran N, could you please add more information on the issue that you are facing? Could you please also add a screenshot of the error?

6 More Replies
-werners-
by Esteemed Contributor III
  • 1947 Views
  • 2 replies
  • 17 kudos

Autoloader: how to avoid overlap in files

I'm thinking of using autoloader to process files being put on our data lake. Let's say, for example, that every 15 minutes a parquet file is written. These files, however, contain overlapping data. Now, every 2 hours I want to process the new data (autoloader) and...

Latest Reply
Hubert-Dudek
Esteemed Contributor III

What about foreachBatch and then MERGE? Alternatively, run another process that will clean updates using the window function, as you said.
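A minimal sketch of the foreachBatch + MERGE approach: deduplicate each micro-batch, then upsert it into the target Delta table. The key column, paths, and table name are placeholders:

    from delta.tables import DeltaTable

    def upsert_batch(batch_df, batch_id):
        # Drop overlapping rows within the batch, then MERGE on the key
        deduped = batch_df.dropDuplicates(["id"])  # placeholder key column
        target = DeltaTable.forName(spark, "target_table")  # placeholder table
        (target.alias("t")
               .merge(deduped.alias("s"), "t.id = s.id")
               .whenMatchedUpdateAll()
               .whenNotMatchedInsertAll()
               .execute())

    (spark.readStream.format("cloudFiles")
          .option("cloudFiles.format", "parquet")
          .load("/mnt/landing")  # placeholder input path
          .writeStream
          .foreachBatch(upsert_batch)
          .option("checkpointLocation", "/mnt/checkpoints/demo")  # placeholder
          .start())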

1 More Replies