Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

RK_AV
by New Contributor III
  • 6195 Views
  • 5 replies
  • 8 kudos

Autoloader cluster

I want to set up Auto Loader to process files from Azure Data Lake (Blob) automatically whenever new files arrive. Does Auto Loader require the cluster to be running all the time for this to work?

Latest Reply
asif5494
New Contributor III
  • 8 kudos

@Kaniz Fatma, if my cluster is not active and I have uploaded 50 files to the storage location, where will Auto Loader list these 50 files? Will it use a checkpoint location? If yes, how can I set the checkpoint location in Cloud ...

  • 8 kudos
4 More Replies
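
A minimal sketch of how the checkpoint question above is commonly handled, assuming a PySpark session on a runtime that supports the `availableNow` trigger. The function name and all paths and table names are hypothetical. Auto Loader keeps its record of already-ingested files under the stream's checkpoint location, so files uploaded while the cluster is off are picked up on the next run, and a scheduled job with a run-once style trigger avoids keeping the cluster on permanently.

```python
def run_autoloader_once(spark, source_path, checkpoint_path, target_table,
                        file_format="json"):
    """Ingest all files not yet processed, then stop (hypothetical helper).

    `spark` is an existing SparkSession; nothing here starts a cluster.
    """
    stream = (
        spark.readStream.format("cloudFiles")            # Auto Loader source
        .option("cloudFiles.format", file_format)        # format of incoming files
        .option("cloudFiles.schemaLocation", checkpoint_path)  # inferred schema is stored here
        .load(source_path)
    )
    return (
        stream.writeStream
        .option("checkpointLocation", checkpoint_path)   # tracks which files were ingested
        .trigger(availableNow=True)                      # process the backlog, then stop
        .toTable(target_table)
    )
```

Called from a scheduled job, each run would process only the files that arrived since the previous run, because the checkpoint path is the same every time.
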
133994
by New Contributor III
  • 3327 Views
  • 6 replies
  • 4 kudos

Resolved! Databricks Certified Data Engineer Associate Certificate not received

Hello, I passed the Databricks Certified Data Engineer Associate exam on 30 October 2022, but I still haven't received my certificate/badge. Could you please help me obtain it? Regards, Ali

Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hi @ali.ganbarov, we are really sorry for the delays. The certification has been issued, but due to a lag in the system it is taking time. Our team is working on it. Please visit the credible site once. Thanks and Regards

  • 4 kudos
5 More Replies
129876
by New Contributor III
  • 6916 Views
  • 4 replies
  • 7 kudos

Schedule job runs with different parameters?

Is it possible to schedule different runs of a job with different parameters? I have a notebook that generates data based on the supplied parameter, but I would like to schedule runs instead of starting them manually. I assume that this would be possible using the...

Latest Reply
Debayan
Databricks Employee
  • 7 kudos

You can pass parameters for your task. Each task type has different requirements for formatting and passing the parameters. https://docs.databricks.com/workflows/jobs/jobs.html#create-a-job The REST API can also pass parameters for jobs. Tokens replace pa...

  • 7 kudos
3 More Replies
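
One way to read the reply above is to define one scheduled job per parameter set. The sketch below only builds request bodies in the shape used by the Jobs API 2.1 `jobs/create` endpoint and does not send anything. The helper name, job names, notebook path, and parameter values are hypothetical, and cluster settings are left out, so a real request would need them added.

```python
import json


def build_scheduled_job(name, notebook_path, parameters, cron, timezone="UTC"):
    """Return a jobs/create request body for one notebook run on a schedule."""
    return {
        "name": name,
        "tasks": [
            {
                "task_key": "generate_data",
                "notebook_task": {
                    "notebook_path": notebook_path,
                    # Exposed to the notebook as widget values.
                    "base_parameters": parameters,
                },
                # Cluster configuration intentionally omitted in this sketch.
            }
        ],
        "schedule": {
            "quartz_cron_expression": cron,  # Quartz syntax: sec min hour dom month dow
            "timezone_id": timezone,
        },
    }


# Two schedules for the same notebook, each with its own parameter value.
payloads = [
    build_scheduled_job("generate-eu", "/Repos/demo/generate_data",
                        {"region": "eu"}, "0 0 6 * * ?"),
    build_scheduled_job("generate-us", "/Repos/demo/generate_data",
                        {"region": "us"}, "0 0 18 * * ?"),
]
request_bodies = [json.dumps(p) for p in payloads]
```
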
Sandy21
by New Contributor III
  • 3353 Views
  • 1 replies
  • 2 kudos

Resolved! Cluster Configuration Best Practices

I have a cluster with a configuration of 400 GB RAM and 160 cores. Which of the following would be the ideal configuration to use in case of one or more VM failures? Cluster A: Total RAM 400 GB, Total Cores 160, Total VMs: 1, 400 GB/Exec & 160 c...

Latest Reply
karthik_p
Esteemed Contributor
  • 2 kudos

@Santhosh Raj, can you please confirm whether the cluster sizes you listed refer to the driver and worker nodes? How much do you want to allocate to the driver and to the workers? Once we are sure about the type of driver and worker we would like to pick, we need to enable au...

  • 2 kudos
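
The trade-off behind the question above can be worked out with simple arithmetic. This is a rough sketch that assumes the 400 GB and 160 cores are split evenly across VMs. Only the single-VM layout (Cluster A) comes from the post; the 4-VM and 8-VM layouts are hypothetical, since the other options are truncated.

```python
def capacity_after_failures(total_ram_gb, total_cores, vm_count, failed_vms):
    """Return (ram_gb, cores) still available after `failed_vms` VMs are lost.

    Assumes resources are divided evenly across `vm_count` VMs.
    """
    if not 0 <= failed_vms <= vm_count:
        raise ValueError("failed_vms must be between 0 and vm_count")
    surviving = vm_count - failed_vms
    return (total_ram_gb * surviving / vm_count,
            total_cores * surviving / vm_count)


# Same total capacity, different VM counts, one VM lost in each case.
for vm_count in (1, 4, 8):
    ram, cores = capacity_after_failures(400, 160, vm_count, failed_vms=1)
    print(f"{vm_count} VM(s): {ram:.0f} GB RAM and {cores:.0f} cores remain")
```

Under these assumptions a single large VM loses everything on one failure, while spreading the same capacity over more VMs shrinks the loss per failure. Real sizing also depends on the driver/worker split raised in the reply.
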
lzha174
by Contributor
  • 8152 Views
  • 3 replies
  • 3 kudos

Resolved! ipywidgets stopped displaying today

Everything was working yesterday, but today it stopped working, as shown below. The example from the DB website fails with the same error. The page source says (screenshot not preserved). This is affecting my work, which is a bit annoying. Are the DB people going to look into this ...

Latest Reply
lzha174
Contributor
  • 3 kudos

Today it's working again! I got a pop-up window (screenshot not preserved), which should be the reason it was broken.

  • 3 kudos
2 More Replies
Dinu2
by New Contributor III
  • 3430 Views
  • 0 replies
  • 3 kudos

Cassandra connection from ADB

Hi, could anyone please help with the steps for connecting to Cassandra from Azure Databricks? I have followed the steps in https://learn.microsoft.com/en-us/azure/databricks/_static/notebooks/azure/cassandra-azure.html but I am getting the error below....

Sandy21
by New Contributor III
  • 1480 Views
  • 0 replies
  • 0 kudos

Databricks Jobs

Suppose User A has deployed the job to prod, User B has scheduled the job through an external orchestration tool, and User C has received owner privileges from User A. Whose email ID would be displayed while the Databricks job is running?

Constantino
by New Contributor III
  • 3103 Views
  • 2 replies
  • 2 kudos

CMK for managed services automatic rotation

The docs for the CMK for workspace storage state: After you add a customer-managed key for storage, you cannot later rotate the key by setting a different key ARN for the workspace. However, AWS provides automatic CMK master key rotation, which rotat...

Latest Reply
Debayan
Databricks Employee
  • 2 kudos

Hi @Constantino Schillebeeckx, you can update/rotate the CMK at a later time (on a running workspace). Please refer to: https://docs.databricks.com/security/keys/customer-managed-keys-managed-services-aws.html?_ga=2.214562071.1895504292.1667411694-6435253...

  • 2 kudos
1 More Replies
BorislavBlagoev
by Valued Contributor III
  • 38473 Views
  • 33 replies
  • 14 kudos
Latest Reply
bhuvahh
New Contributor II
  • 14 kudos

I think plain Python code will run with Databricks Connect (if it is a Python program you are writing), and Spark SQL can be run with spark.sql(...).

  • 14 kudos
32 More Replies
jgsp
by New Contributor II
  • 3153 Views
  • 2 replies
  • 1 kudos

Can't import st_constructors module after installing Apache Sedona

Hi there, I've recently installed Apache Sedona on my cluster, according to the detailed instructions here. My Databricks runtime version is 9.1 LTS (includes Apache Spark 3.1.2, Scala 2.12). The installation included the apache-sedona library from PyP...

Latest Reply
jgsp
New Contributor II
  • 1 kudos

Thank you @Debayan Mukherjee for the prompt reply. I've followed the instructions carefully, but now every time I try to run a cell in my notebook I get a "Cancelled" message. It clearly didn't work. Any advice? Your help is much appreciated.

  • 1 kudos
1 More Replies
Sandy21
by New Contributor III
  • 13396 Views
  • 2 replies
  • 6 kudos

Schema Evolution Issue in Streaming

When there is a schema change while reading from and writing to a stream, will the schema changes be handled automatically by Spark, or do we need to include the option mergeSchema=true? E.g.: df.writeStream.option("mergeSchema", "true").format("delta").out...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 6 kudos

mergeSchema doesn't support all operations. In some cases .option("overwriteSchema", "true") is needed. mergeSchema doesn't support:
  • Dropping a column
  • Changing an existing column's data type (in place)
  • Renaming column names that differ only by case (e.g...

  • 6 kudos
1 More Replies
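
The limits listed in the reply above can be expressed as a small pre-flight check. This is a plain-Python sketch, not a Delta Lake API. The function name and the dict-based schema representation are hypothetical, it covers only the three cases named in the reply, and it treats any type difference as unsupported, which is a simplification.

```python
def changes_beyond_merge_schema(old_schema, new_schema):
    """List changes that, per the reply above, mergeSchema alone would not cover.

    Schemas are dicts of column name -> type string. An empty result means
    the change only adds columns.
    """
    new_by_lower = {name.lower(): name for name in new_schema}
    reasons = []
    for name, dtype in old_schema.items():
        if name in new_schema:
            if new_schema[name] != dtype:
                reasons.append(f"type change: {name}")
        elif name.lower() in new_by_lower:
            # Same column name up to letter case.
            reasons.append(f"case-only rename: {name} -> {new_by_lower[name.lower()]}")
        else:
            reasons.append(f"dropped column: {name}")
    return reasons


old = {"id": "long", "name": "string"}
added = changes_beyond_merge_schema(old, {"id": "long", "name": "string", "email": "string"})
dropped = changes_beyond_merge_schema(old, {"id": "long"})
```

Here `added` is empty, so the new `email` column is the additive kind of change that mergeSchema targets, while `dropped` reports the missing `name` column.
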
Sajid1
by Contributor
  • 40386 Views
  • 3 replies
  • 5 kudos

Resolved! Parse Syntax error ,can anyone guide me what is going wrong here

Select case WHEN {{ Month }} = 0 then add_months(current_date(),-13 ) else WHEN {{ Month }}> month(add_months(current_date(),-1)) then add_months(to_date(concat(year(current_date())-1,'-',{{Month}},'-',1)),-13) else add_months(to_date(conc...

Latest Reply
Debayan
Databricks Employee
  • 5 kudos

Hi @Sajid Thavalengal Rahiman, have you followed the recommendation given above? Also, could you please paste the whole error together with the code?

  • 5 kudos
2 More Replies
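
One thing visible in the query snippet above is an `else` followed directly by another `WHEN`. In standard SQL, a CASE expression lists all of its WHEN branches first and ends with a single ELSE, so this may be the cause of the parse error. The sketch below shows that shape using SQLite from the Python standard library as a stand-in for Spark SQL. The branch labels, the function name, and the `:month` and `:last_month` parameters are hypothetical placeholders for the `{{ Month }}` parameter and the date expressions in the post.

```python
import sqlite3

# All WHEN branches come first; ELSE appears once, right before END.
QUERY = """
SELECT CASE
         WHEN :month = 0 THEN 'rolling window'
         WHEN :month > :last_month THEN 'previous year'
         ELSE 'current year'
       END AS branch
"""


def pick_branch(month, last_month):
    """Evaluate the CASE expression for one pair of parameter values."""
    conn = sqlite3.connect(":memory:")
    try:
        row = conn.execute(QUERY, {"month": month, "last_month": last_month}).fetchone()
        return row[0]
    finally:
        conn.close()
```
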
