Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

RK_AV
by New Contributor III
  • 4765 Views
  • 5 replies
  • 8 kudos

Autoloader cluster

I want to set up Auto Loader to process files from Azure Data Lake (Blob) automatically whenever new files arrive. For this to work, does Auto Loader require the cluster to be on all the time?

Latest Reply
asif5494
New Contributor III
  • 8 kudos

@Kaniz Fatma, if my cluster is not active and I have uploaded 50 files to the storage location, where will Auto Loader list these 50 files? Will it use a checkpoint location? If yes, how can I set the checkpoint location in Cloud ...

4 More Replies
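
One way to avoid keeping a cluster on all the time, sketched under assumptions: run Auto Loader from a scheduled job with an available-now trigger, so each run processes the files that arrived since the previous run and then stops. The checkpoint location is where the stream records which files it has already ingested. The JSON file format, the paths, the table name, and both function names below are hypothetical placeholders, and the sketch assumes a runtime whose PySpark supports `trigger(availableNow=True)`.

```python
def autoloader_read_options(schema_location, file_format="json"):
    """Options for the Auto Loader ("cloudFiles") streaming source."""
    return {
        "cloudFiles.format": file_format,
        "cloudFiles.schemaLocation": schema_location,
    }


def ingest_new_files(spark, source_path, checkpoint_path, target_table):
    """Ingest files that arrived since the last run, then stop the stream."""
    stream = (
        spark.readStream.format("cloudFiles")
        .options(**autoloader_read_options(checkpoint_path + "/schema"))
        .load(source_path)
    )
    return (
        stream.writeStream
        # The checkpoint tracks which files were already processed.
        .option("checkpointLocation", checkpoint_path)
        # Process the current backlog and terminate instead of running forever.
        .trigger(availableNow=True)
        .toTable(target_table)
    )
```

Called from a scheduled job as, for example, `ingest_new_files(spark, "<source path>", "<checkpoint path>", "<table name>")`, the cluster only needs to be up for the duration of each run.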
133994
by New Contributor III
  • 2577 Views
  • 6 replies
  • 4 kudos

Resolved! Databricks Certified Data Engineer Associate Certificate not received

Hello, I passed the Databricks Certified Data Engineer Associate exam on 30 October 2022, but I still haven't received my certificate/badge. Could you please help me obtain it? Regards, Ali

Latest Reply
Anonymous
Not applicable
  • 4 kudos

Hi @ali.ganbarov, we are really sorry for the delays. The certification has been issued, but due to a lag in the system it is taking time. Our team is working on it. Please visit the credible site once. Thanks and regards

5 More Replies
129876
by New Contributor III
  • 5475 Views
  • 4 replies
  • 7 kudos

Schedule job runs with different parameters?

Is it possible to schedule different runs of a job with different parameters? I have a notebook that generates data based on the supplied parameter, but I would like to schedule runs instead of starting them manually. I assume that this would be possible using the...

Latest Reply
Debayan
Databricks Employee
  • 7 kudos

You can pass parameters for your task. Each task type has different requirements for formatting and passing the parameters (see https://docs.databricks.com/workflows/jobs/jobs.html#create-a-job). The REST API can also pass parameters for jobs. Tokens replace pa...

3 More Replies
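
The reply notes that the REST API can pass parameters to jobs. A minimal sketch of what that could look like, assuming the Jobs API 2.1 `run-now` endpoint and a notebook task: an external scheduler builds one request per parameter set. The workspace URL, token, job ID, parameter name, and the helper `build_run_now_request` are hypothetical, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_run_now_request(host, token, job_id, notebook_params):
    """Build (but do not send) a Jobs API run-now request."""
    payload = {"job_id": job_id, "notebook_params": notebook_params}
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.1/jobs/run-now",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical workspace URL, token, job ID, and parameter name.
req = build_run_now_request(
    "https://example.cloud.databricks.com",
    "<token>",
    123,
    {"run_date": "2022-11-01"},
)
# urllib.request.urlopen(req) would trigger the run with these parameters.
```

Each scheduled invocation can pass a different `notebook_params` dictionary, which the notebook reads through its widgets.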
Sandy21
by New Contributor III
  • 2279 Views
  • 1 reply
  • 2 kudos

Resolved! Cluster Configuration Best Practices

I have a cluster with a configuration of 400 GB RAM and 160 cores. Which of the following would be the ideal configuration in case of one or more VM failures? Cluster A: Total RAM 400 GB, Total Cores 160, Total VMs: 1, 400 GB/Exec & 160 c...

Latest Reply
karthik_p
Esteemed Contributor
  • 2 kudos

@Santhosh Raj, can you please confirm whether the cluster sizes you are considering relate to the driver and worker nodes? How much do you want to allocate to the driver and the workers? Once we are sure about the type of driver and worker we would like to pick, we need to enable au...
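
As a rough illustration of the failure trade-off in the question, the arithmetic below shows how much of the 400 GB / 160 cores survives when one VM is lost, assuming the total is split evenly across the VMs. Only the single-VM layout (Cluster A) appears in the post; the 4-VM and 8-VM layouts and the function name are hypothetical, and the sketch ignores the driver node.

```python
def capacity_after_failures(total_ram_gb, total_cores, num_vms, failed_vms=1):
    """Remaining (RAM in GB, cores) when `failed_vms` equally sized VMs are lost."""
    if not 0 <= failed_vms <= num_vms:
        raise ValueError("failed_vms must be between 0 and num_vms")
    surviving = num_vms - failed_vms
    return (
        total_ram_gb * surviving / num_vms,
        total_cores * surviving / num_vms,
    )


# 1 VM is Cluster A from the post; 4 and 8 VMs are hypothetical layouts.
for num_vms in (1, 4, 8):
    ram, cores = capacity_after_failures(400, 160, num_vms)
    print(f"{num_vms} VM(s): {ram:.0f} GB RAM, {cores:.0f} cores left after one failure")
```

With a single VM, one failure removes all capacity; with more, smaller VMs each failure removes a smaller share, at the cost of more data movement between nodes.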

lzha174
by Contributor
  • 5375 Views
  • 3 replies
  • 3 kudos

Resolved! ipywidgets stopped displaying today

Everything was working yesterday, but today it stopped working as below: The example from the DB website does not work either, with the same error. The page source says  This is affecting my work, which is a bit annoying. Are the DB people going to look into this ...

Latest Reply
lzha174
Contributor
  • 3 kudos

Today it's back to working! I got a pop-up window saying  This should be the reason it was broken.

2 More Replies
Dinu2
by New Contributor III
  • 2767 Views
  • 0 replies
  • 3 kudos

Cassandra connection from ADB

Hi, could anyone please help me with the steps for connecting to Cassandra from Azure Databricks? I have followed the steps in https://learn.microsoft.com/en-us/azure/databricks/_static/notebooks/azure/cassandra-azure.html But I am getting the error below....

Sandy21
by New Contributor III
  • 1238 Views
  • 0 replies
  • 0 kudos

Databricks Jobs

Suppose User A has deployed the job to prod, User B has scheduled the job through an external orchestration tool, and User C has received owner privileges from User A. Whose email ID would be displayed while running the Databricks job?

Constantino
by New Contributor III
  • 2118 Views
  • 2 replies
  • 2 kudos

CMK for managed services automatic rotation

The docs for the CMK for workspace storage state: After you add a customer-managed key for storage, you cannot later rotate the key by setting a different key ARN for the workspace. However, AWS provides automatic CMK master key rotation, which rotat...

Latest Reply
Debayan
Databricks Employee
  • 2 kudos

Hi @Constantino Schillebeeckx, you can update/rotate the CMK at a later time (on a running workspace). Please refer to: https://docs.databricks.com/security/keys/customer-managed-keys-managed-services-aws.html?_ga=2.214562071.1895504292.1667411694-6435253...

1 More Replies
BorislavBlagoev
by Valued Contributor III
  • 30699 Views
  • 33 replies
  • 14 kudos
Latest Reply
bhuvahh
New Contributor II
  • 14 kudos

I think plain Python code will run with Databricks Connect (if it is a Python program you are writing), and Spark SQL can be done with spark.sql(...).

32 More Replies
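
To illustrate the reply's point about `spark.sql(...)`, here is a small sketch of plain Python code that runs a Spark SQL query through whatever session it is given. The function name and table name are hypothetical, and the sketch assumes a Databricks Connect setup in which `SparkSession.builder.getOrCreate()` returns a session backed by the remote cluster.

```python
def count_rows(spark, table_name):
    """Run a Spark SQL query through the given session and return the row count."""
    row = spark.sql(f"SELECT COUNT(*) AS n FROM {table_name}").first()
    return row["n"]


# Usage with a configured Databricks Connect environment (not executed here):
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.getOrCreate()
#   print(count_rows(spark, "<table name>"))
```

The rest of the program stays ordinary Python; only the calls on `spark` are sent to the cluster.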
jgsp
by New Contributor II
  • 2257 Views
  • 2 replies
  • 1 kudos

Can't import st_constructors module after installing Apache Sedona

Hi there, I've recently installed Apache Sedona on my cluster, according to the detailed instructions here. My Databricks runtime version is 9.1 LTS (includes Apache Spark 3.1.2, Scala 2.12). The installation included the apache-sedona library from PyP...

Latest Reply
jgsp
New Contributor II
  • 1 kudos

Thank you @Debayan Mukherjee for the prompt reply. I've followed the instructions carefully, but now every time I try to run a cell in my notebook I get a "Cancelled" message. It clearly didn't work. Any advice? Your help is much appreciated.

1 More Replies
HQJaTu
by New Contributor III
  • 2165 Views
  • 2 replies
  • 2 kudos

Custom container doesn't launch systemd

Quite soon after moving from VMs to containers, I started crafting my own images. That way notebooks have all the necessary libraries already there, with no need to do any pip-installing in the notebook. As requirements get more complex, now I'm at ...

Latest Reply
Debayan
Databricks Employee
  • 2 kudos

Hi @Jari Turkia, please check if this helps: https://developers.redhat.com/blog/2019/04/24/how-to-run-systemd-in-a-container#other_cool_features_about_podman_and_systemd Also, you can run an Ubuntu/Red Hat Linux OS inside containers, which will have sys...

1 More Replies
Sandy21
by New Contributor III
  • 12030 Views
  • 2 replies
  • 6 kudos

Schema Evolution Issue in Streaming

When there is a schema change while reading and writing to a stream, will the schema changes be automatically handled by Spark, or do we need to include the option(mergeSchema=True)? E.g.: df.writeStream .option("mergeSchema", "true") .format("delta") .out...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 6 kudos

mergeSchema doesn't support all operations. In some cases .option("overwriteSchema", "true") is needed. mergeSchema doesn't support:
  • Dropping a column
  • Changing an existing column's data type (in place)
  • Renaming column names that differ only by case (e.g...

1 More Replies
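
The limits listed in the reply can be turned into a quick pre-check. The sketch below compares two schemas, represented as plain `{column_name: type_name}` dictionaries purely for illustration, and reports whether the change only adds columns. The function name is hypothetical, and treating every data type change as unsupported is a simplification that follows the reply's list rather than Delta Lake's exact rules.

```python
def merge_schema_is_enough(old_schema, new_schema):
    """Return True if the change from old_schema to new_schema only adds columns."""
    for name, dtype in old_schema.items():
        if name not in new_schema:
            # Column was dropped, or renamed (including a case-only rename).
            return False
        if new_schema[name] != dtype:
            # In-place data type change.
            return False
    # Added columns must not clash with an existing name by case only.
    old_lower = {name.lower() for name in old_schema}
    added = [name for name in new_schema if name not in old_schema]
    return all(name.lower() not in old_lower for name in added)


print(merge_schema_is_enough({"id": "long"}, {"id": "long", "name": "string"}))
print(merge_schema_is_enough({"id": "long", "x": "int"}, {"id": "long"}))
```

When the check returns False, the change falls into one of the cases the reply describes, where `mergeSchema` alone is not expected to handle it.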
