Data Engineering

Forum Posts

by Prabakar • Esteemed Contributor III

10-25-2021 9:03:24 AM

4160 Views
2 replies
7 kudos

Resolved! Library installation fails with mirror sync issue

While trying to install the ffmpeg package using an init script on a Databricks cluster, it fails with the error below.

Init script:

#!/bin/bash
set -e
sudo apt-get update
sudo apt-get -y install ffmpeg

Error message:

E: Failed to fetch http://security.ubuntu...

Latest Reply

Prabakar
Esteemed Contributor III

10-25-2021 9:08:38 AM

7 kudos

Cause: The VMs are pointing to an old cached mirror that is not up to date, so the package download fails.

Workaround: Use the init script below to install the "ffmpeg" package. To revert to the original lis...

by Sunny • New Contributor III

06-08-2022 10:55:55 AM

4701 Views
8 replies
4 kudos

Resolved! Retrieve job id and run id from scala

I need to retrieve the job id and run id of the job from a JAR file in Scala. When I try to compile the code below in IntelliJ, the error below is shown.

import com.databricks.dbutils_v1.DBUtilsHolder.dbutils

object MainSNL {
  @throws(classOf[Exception])
  de...

Latest Reply

Mohit_m
Valued Contributor II

07-05-2022 3:08:22 AM

4 kudos

Maybe it's worth going through the Task Parameter variables section of the doc below:
https://docs.databricks.com/data-engineering/jobs/jobs.html#task-parameter-variables
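
A minimal sketch of how task parameter variables could be used here, assuming the job passes {{job_id}} and {{run_id}} to the JAR as ordinary command-line arguments. The flag names --job-id and --run-id and the ids 123 and 456 are made up for illustration. The question is about Scala, where the same values would arrive in main(args: Array[String]); Python is used here only to show the idea.

```python
import argparse

# Hypothetical parameter list for the JAR task. "{{job_id}}" and "{{run_id}}"
# are task parameter variables that Databricks substitutes when the run starts.
JAR_TASK_PARAMETERS = ["--job-id", "{{job_id}}", "--run-id", "{{run_id}}"]


def parse_run_context(argv):
    """Parse the job id and run id from the arguments the job passes in."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--job-id", required=True)
    parser.add_argument("--run-id", required=True)
    args = parser.parse_args(argv)
    return {"job_id": args.job_id, "run_id": args.run_id}


# Simulate the substituted values a run might receive (made-up ids).
print(parse_run_context(["--job-id", "123", "--run-id", "456"]))
```

With this approach the program does not need dbutils at compile time, since the ids are plain strings by the time it starts.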

by Mohit_m • Valued Contributor II

07-05-2022 3:03:27 AM

2969 Views
1 replies
2 kudos

Resolved! Databricks jobs create API throws unexpected error

Databricks jobs create API throws an unexpected error.

Error response:

{"error_code": "INVALID_PARAMETER_VALUE", "message": "Cluster validation error: Missing required field: settings.cluster_spec.new_cluster.size"}

Any idea on this?

Latest Reply

Mohit_m
Valued Contributor II

07-05-2022 3:04:20 AM

2 kudos

Could you please specify num_workers in the JSON body and try the API again? Another recommendation is to configure what you want in the UI and then press the “JSON” button, which shows the corresponding JSON that you can use for the API.
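
A minimal sketch of a request body that sets num_workers, assuming a single-task job with a new cluster. The job name, runtime version, node type, and notebook path are placeholders, the exact layout depends on the Jobs API version you call, and treating autoscale as an alternative to num_workers is my assumption. The local check only catches the missing size before the request is sent; it does not call the API.

```python
import json


def missing_cluster_size(new_cluster):
    """Return True if the spec sets neither num_workers nor autoscale."""
    return "num_workers" not in new_cluster and "autoscale" not in new_cluster


# Placeholder values throughout; replace them with ones valid in your workspace.
payload = {
    "name": "example-job",
    "new_cluster": {
        "spark_version": "10.4.x-scala2.12",
        "node_type_id": "i3.xlarge",
        "num_workers": 2,
    },
    "notebook_task": {"notebook_path": "/Users/someone@example.com/example"},
}

if missing_cluster_size(payload["new_cluster"]):
    raise ValueError("new_cluster needs num_workers (or autoscale)")

# This is the JSON body that would be sent to the jobs create endpoint.
print(json.dumps(payload, indent=2))
```

Comparing this body with what the “JSON” button in the UI produces is a quick way to spot any other missing field.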

by lav • New Contributor III

07-04-2022 10:24:57 PM

576 Views
1 replies
1 kudos

Correlated Column Exception in Spark SQL

Hi Johan, were you able to resolve the correlated column exception issue? I have been stuck on this for the past week. If you can guide me, that would be a lot of help. Thanks.

Latest Reply

Johan_Van_Noten
New Contributor III

07-05-2022 12:24:11 AM

1 kudos

Seems to be a duplicate of your comment on https://community.databricks.com/s/question/0D53f00001XCuCACA1/correlated-column-exception-in-sql-udf-when-using-udf-parameters. I guess you did that to be able to put other tags?

by darshan • New Contributor III

06-27-2022 6:29:42 AM

10120 Views
14 replies
12 kudos

Resolved! Is there a way to run notebooks concurrently in same session?

I tried using dbutils.notebook.run(notebook.path, notebook.timeout, notebook.parameters), but it takes 20 seconds to start a new session. %run uses the same session, but I cannot figure out how to use it to run notebooks concurrently.
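
One common pattern for this, sketched under assumptions: submit several dbutils.notebook.run calls from a thread pool so their startup times overlap. This does not make the notebooks share one session; each call still starts its own run. The helper name run_notebooks_concurrently, the stand-in fake_run, and the /demo paths are made up so the sketch runs outside Databricks.

```python
from concurrent.futures import ThreadPoolExecutor


def run_notebooks_concurrently(run_notebook, notebooks, max_workers=4):
    """Run each (path, timeout_seconds, parameters) tuple in its own thread."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(run_notebook, path, timeout, params)
            for path, timeout, params in notebooks
        ]
        # Results come back in submission order; result() re-raises any failure.
        return [future.result() for future in futures]


def fake_run(path, timeout_seconds, parameters):
    """Stand-in for dbutils.notebook.run so the sketch runs anywhere."""
    return f"{path} finished"


results = run_notebooks_concurrently(
    fake_run,
    [("/demo/a", 60, {}), ("/demo/b", 60, {"day": "2022-06-27"})],
)
print(results)
```

Inside a Databricks notebook, dbutils.notebook.run could be passed in place of fake_run.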

Latest Reply

rudesingh56
New Contributor II

07-04-2022 6:49:31 AM

12 kudos

I’ve been struggling with opening multiple browser sessions to open more than one notebook at a time.

by abd • Contributor

06-30-2022 7:32:17 PM

665 Views
1 replies
2 kudos

Resolved! Why use Databricks over other tools?

What is special about Databricks? What does Databricks provide that no other tool in the market provides? How can I convince someone else to use Databricks and not some other tool?

Latest Reply

Kaniz
Community Manager

07-04-2022 3:51:24 AM

2 kudos

Hi @Abdullah Durrani, please read these articles:
https://www.serveradminz.com/blog/databricks-an-advanced-analytics-solution/#:~:text=Databricks%20offers%20a%20highly%20secure,and%20share%20them%20across%20teams.
https://www.bluegranite.com/blog/3-...

by Data_and_analyt • New Contributor II

06-28-2022 7:20:26 PM

481 Views
1 replies
2 kudos

How to build a great community?

Latest Reply

Kaniz
Community Manager

07-04-2022 2:57:37 AM

2 kudos

Hi @Maricela Castillo, Can you elaborate more on your question?

by zLiu • New Contributor II

06-29-2022 3:22:38 PM

481 Views
1 replies
1 kudos

Project Lightspeed

It’s just a breeze for all the streaming users. What’s the best venue to learn more about it? Is there a Jira ticket that tracks all the progress? I also wonder which Spark version it will come with.

Latest Reply

Kaniz
Community Manager

07-04-2022 2:55:50 AM

1 kudos

Hi @Zhengyi Liu, Please have a look at this article to know all about Project Lightspeed.

by TheOptimizer • Contributor

06-01-2022 11:10:35 AM

6302 Views
6 replies
8 kudos

Resolved! How to create delta table with identity column.

I'm sure this is probably some oversight on my part, but I don't see it. I'm trying to create a Delta table with an identity column. I've tried every combination of the syntax I can think of.

%sql
create or replace table IDS.picklist (
  picklist_id...

Latest Reply

lucas_marchand
New Contributor III

06-30-2022 11:03:28 AM

8 kudos

I was also having this same error. My cluster was running Databricks Runtime Version 9.1, so I changed it to 11.0 and it worked.
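
For reference, a minimal sketch of what an identity column definition can look like once the cluster is on a runtime that supports it. The table name and picklist_id come from the question. The picklist_name column and the identity_table_ddl helper are hypothetical, since the original statement is truncated.

```python
def identity_table_ddl(table, id_column, other_columns):
    """Build a CREATE TABLE statement whose first column is an identity column."""
    columns = [f"{id_column} BIGINT GENERATED ALWAYS AS IDENTITY"]
    columns += [f"{name} {sql_type}" for name, sql_type in other_columns]
    body = ",\n  ".join(columns)
    return f"CREATE OR REPLACE TABLE {table} (\n  {body}\n) USING DELTA"


# picklist_name is a hypothetical column; the original statement is truncated.
ddl = identity_table_ddl("IDS.picklist", "picklist_id", [("picklist_name", "STRING")])
print(ddl)
# In a Databricks notebook this could then be run with: spark.sql(ddl)
```

If the statement still fails with a syntax error, checking the cluster's runtime version first, as suggested above, is probably the quickest test.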

by harrisriaz • New Contributor

06-28-2022 10:26:42 PM

1831 Views
4 replies
6 kudos

Resolved! What are the key data engineering problems that Databricks solves?

What problems does Databricks address from a typical data engineering perspective, and how does it compare with other cloud DE tools?

Latest Reply

Kaniz
Community Manager

07-01-2022 12:52:21 AM

6 kudos

Hi @Harris Riaz, I appreciate your attempt to choose the best answer for us. I'm glad you got your query resolved. @rheiman, thank you for giving an excellent answer.

by spartakos • New Contributor

06-30-2022 8:29:42 AM

474 Views
0 replies
0 kudos

Big data ingest into Delta Lake

I have a feature table in BQ that I want to ingest into Delta Lake. This feature table in BQ has 100TB of data. This table can be partitioned by DATE. What best practices and approaches can I take to ingest this 100TB? In particular, what can I do to ...
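
One possible way to use the DATE partitioning, sketched under assumptions: split the full date range into fixed-size batches and ingest one batch at a time, so a failed batch can be retried without redoing the whole load. The helper date_batches, the batch size of 4 days, and the example dates are all made up for illustration. The actual read from BQ and write to Delta are left out.

```python
from datetime import date, timedelta


def date_batches(start, end, days_per_batch):
    """Yield inclusive (first_day, last_day) ranges covering start..end."""
    current = start
    while current <= end:
        last = min(current + timedelta(days=days_per_batch - 1), end)
        yield current, last
        current = last + timedelta(days=1)


# Each range could drive one filtered read and one append to the Delta table.
for first, last in date_batches(date(2022, 1, 1), date(2022, 1, 10), 4):
    print(first, last)
```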

by Niha1 • New Contributor III

06-27-2022 1:54:19 PM

686 Views
3 replies
3 kudos

Resolved! Can you please share the link for the lab exercise for the ongoing summit?

Latest Reply

Kaniz
Community Manager

06-30-2022 7:23:49 AM

3 kudos

Hi @Niharika Modi, all materials covered in the course are available on-demand under the Resource tab within the system.
My Agenda > select training > Check-in > Resources

by Reabouri • New Contributor

06-28-2022 6:23:42 PM

639 Views
2 replies
1 kudos

Resolved! How to ensure data security in delta lake?

Latest Reply

Kaniz
Community Manager

06-30-2022 7:04:41 AM

1 kudos

Hi @Robin Sabouri, we haven’t heard from you since the last response from @Ralph David Lagos, and I was checking back to see if his suggestions helped you. Otherwise, if you have any solution, please share it with the community as it can be helpful t...

by acenteno • New Contributor II

06-28-2022 4:00:54 PM

423 Views
1 replies
1 kudos

Can I use Spark to stream data from Kafka?

Latest Reply

Kaniz
Community Manager

06-30-2022 7:00:28 AM

1 kudos

Hi @Anthony Centeno, please go through this article, which explains your use case: Spark Streaming + Kafka Integration Guide (Kafka broker version 0.10.0 or higher)
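
As a rough illustration, here is a minimal PySpark sketch that reads a Kafka topic as a stream. Note that it uses the Structured Streaming "kafka" source rather than the DStream API the guide above covers. The broker address, topic name, and helper names are hypothetical, and read_kafka_stream expects an existing SparkSession with the Kafka connector available, as I believe is the case on Databricks clusters.

```python
def kafka_source_options(bootstrap_servers, topic, starting_offsets="latest"):
    """Options for Spark's built-in "kafka" streaming source."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        "startingOffsets": starting_offsets,
    }


def read_kafka_stream(spark, bootstrap_servers, topic):
    """Return a streaming DataFrame with Kafka keys and values cast to strings."""
    raw = (
        spark.readStream.format("kafka")
        .options(**kafka_source_options(bootstrap_servers, topic))
        .load()
    )
    return raw.selectExpr("CAST(key AS STRING) AS key", "CAST(value AS STRING) AS value")


# Hypothetical broker address and topic name, for illustration only.
print(kafka_source_options("broker-1:9092", "events"))
```

In a notebook, read_kafka_stream(spark, "broker-1:9092", "events") would return a DataFrame that can be passed to writeStream.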

by Bomberone • New Contributor II

06-28-2022 2:35:38 PM

631 Views
2 replies
2 kudos

Resolved! Autoloader checkpoint issue

Hello guys, is anyone experiencing problems with Auto Loader checkpoints on Azure?

Latest Reply

Kaniz
Community Manager

06-30-2022 6:34:43 AM

2 kudos

Hi @Francesco Merangolo, we haven’t heard from you since the last response from @Hubert Dudek, and I was checking back to see if his suggestions helped you. Otherwise, if you have any solution, please share it with the community as it can be helpful to...
