Data Engineering

Forum Posts

Prabakar
by Esteemed Contributor III
  • 4160 Views
  • 2 replies
  • 7 kudos

Resolved! Library installation fails with mirror sync issue

While trying to install the ffmpeg package using an init script on a Databricks cluster, it fails with the below error.
Init script:
#! /bin/bash
set -e
sudo apt-get update
sudo apt-get -y install ffmpeg
Error message:
E: Failed to fetch http://security.ubuntu...

Latest Reply
Prabakar
Esteemed Contributor III
  • 7 kudos

Cause: The VMs are pointing to a cached old mirror that is not up to date, so downloading the package fails.
Workaround: Use the below init script to install the package "ffmpeg". To revert to the original lis...

1 More Replies
Sunny
by New Contributor III
  • 4701 Views
  • 8 replies
  • 4 kudos

Resolved! Retrieve job id and run id from scala

I need to retrieve the job id and run id of the job from a jar file in Scala. When I try to compile the below code in IntelliJ, the below error is shown.
import com.databricks.dbutils_v1.DBUtilsHolder.dbutils
object MainSNL {
  @throws(classOf[Exception]) de...

Latest Reply
Mohit_m
Valued Contributor II
  • 4 kudos

Maybe it's worth going through the Task Parameter variables section of the below doc:
https://docs.databricks.com/data-engineering/jobs/jobs.html#task-parameter-variables
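To illustrate the suggestion above: task parameter variables such as {{job_id}} and {{run_id}} are substituted when the run starts, so the program can read them as ordinary arguments. Below is a minimal sketch in Python (the original question is about a Scala jar, where the same idea applies to the arguments of main). The flag names --job-id and --run-id and the function name are hypothetical, chosen only for this example.

```python
import argparse


def parse_run_context(argv):
    """Read the job id and run id that were passed in as task parameters.

    Assumes the task parameters were configured as (hypothetical flag names):
        ["--job-id", "{{job_id}}", "--run-id", "{{run_id}}"]
    so that the real values are already substituted when the program starts.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--job-id", required=True)
    parser.add_argument("--run-id", required=True)
    args = parser.parse_args(argv)
    # Values arrive as strings; convert them if numeric ids are needed.
    return {"job_id": args.job_id, "run_id": args.run_id}
```

In a real entry point this would be called as parse_run_context(sys.argv[1:]).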

7 More Replies
Mohit_m
by Valued Contributor II
  • 2969 Views
  • 1 replies
  • 2 kudos

Resolved! Databricks jobs create API throws unexpected error

Databricks jobs create API throws unexpected error.
Error response:
{"error_code": "INVALID_PARAMETER_VALUE","message": "Cluster validation error: Missing required field: settings.cluster_spec.new_cluster.size"}
Any idea on this?

Latest Reply
Mohit_m
Valued Contributor II
  • 2 kudos

Could you please specify num_workers in the JSON body and try the API again? Also, another recommendation is to configure what you want in the UI and then press the "JSON" button, which should show the corresponding JSON that you can use for the API.
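As a rough sketch of the first suggestion, the snippet below builds a request body that includes num_workers under new_cluster. It assumes a Jobs API 2.0-style single-task body; the helper names, job name, notebook path, Spark version, and node type are all placeholders made up for illustration, and the "JSON" button in the UI remains the more reliable way to get the exact body for a given workspace.

```python
import json


def build_create_job_payload(name, notebook_path, spark_version,
                             node_type_id, num_workers):
    """Build a jobs-create request body (assumed Jobs API 2.0 layout)."""
    return {
        "name": name,
        "new_cluster": {
            "spark_version": spark_version,
            "node_type_id": node_type_id,
            # Cluster size; leaving this out is what the error above points at.
            "num_workers": num_workers,
        },
        "notebook_task": {"notebook_path": notebook_path},
    }


def has_cluster_size(payload):
    """Check that new_cluster states a size (fixed workers or autoscale)."""
    cluster = payload.get("new_cluster", {})
    return "num_workers" in cluster or "autoscale" in cluster


# All values below are placeholders, not taken from the original post.
payload = build_create_job_payload(
    name="example-job",
    notebook_path="/Users/someone@example.com/example-notebook",
    spark_version="11.0.x-scala2.12",
    node_type_id="i3.xlarge",
    num_workers=2,
)
print(json.dumps(payload, indent=2))
```

Running has_cluster_size(payload) before sending the request is a cheap local check for the missing size field.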

lav
by New Contributor III
  • 576 Views
  • 1 replies
  • 1 kudos

Correlated Column Exception in Spark SQL

Hi Johan,
Were you able to resolve the correlated column exception issue? I have been stuck on this for the past week. If you can guide me, that will be a lot of help.
Thanks.

Latest Reply
Johan_Van_Noten
New Contributor III
  • 1 kudos

Seems to be a duplicate of your comment on https://community.databricks.com/s/question/0D53f00001XCuCACA1/correlated-column-exception-in-sql-udf-when-using-udf-parameters. I guess you did that to be able to put other tags?

darshan
by New Contributor III
  • 10120 Views
  • 14 replies
  • 12 kudos

Resolved! Is there a way to run notebooks concurrently in same session?

Tried using
dbutils.notebook.run(notebook.path, notebook.timeout, notebook.parameters)
but it takes 20 seconds to start a new session. %run uses the same session, but I cannot figure out how to use it to run notebooks concurrently.
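One common way to get concurrency with dbutils.notebook.run is to submit the calls from a thread pool. The sketch below is an illustration, not the accepted answer from this thread: the helper name and the notebook paths in the usage comment are hypothetical, and the runner is passed in as a parameter so the function can be tried outside Databricks. Each call still starts its own notebook run, so this overlaps the startup time across notebooks rather than sharing one session.

```python
from concurrent.futures import ThreadPoolExecutor


def run_notebooks_concurrently(run_notebook, notebooks, max_workers=4):
    """Run several notebooks in parallel threads.

    run_notebook: callable taking (path, timeout_seconds, arguments);
                  inside a Databricks notebook this would be dbutils.notebook.run.
    notebooks:    list of (path, timeout_seconds, arguments) tuples.
    Returns the results in the same order as the input list.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(run_notebook, path, timeout, params)
            for path, timeout, params in notebooks
        ]
        # result() re-raises any exception from the corresponding run.
        return [future.result() for future in futures]


# Usage inside a Databricks notebook (paths are hypothetical):
# results = run_notebooks_concurrently(
#     dbutils.notebook.run,
#     [("./child_a", 600, {"date": "2022-01-01"}),
#      ("./child_b", 600, {"date": "2022-01-01"})],
# )
```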

Latest Reply
rudesingh56
New Contributor II
  • 12 kudos

I’ve been struggling with opening multiple browser sessions to open more than one notebook at a time.

13 More Replies
abd
by Contributor
  • 665 Views
  • 1 replies
  • 2 kudos

Resolved! Why use databricks over other tools ?

What is special about Databricks? What does Databricks provide that no other tool in the market provides? How can I convince someone else to use Databricks and not some other tool?

Latest Reply
Kaniz
Community Manager
  • 2 kudos

Hi @Abdullah Durrani, Please read these articles:
https://www.serveradminz.com/blog/databricks-an-advanced-analytics-solution/#:~:text=Databricks%20offers%20a%20highly%20secure,and%20share%20them%20across%20teams.
https://www.bluegranite.com/blog/3-...

zLiu
by New Contributor II
  • 481 Views
  • 1 replies
  • 1 kudos

Project lightspeed

It’s just a breeze for all the streaming users. What’s the best venue to learn more about it? Is there a Jira ticket that tracks all the progress? I also wonder which Spark version it will come with.

Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @Zhengyi Liu​, Please have a look at this article to know all about Project Lightspeed.

TheOptimizer
by Contributor
  • 6302 Views
  • 6 replies
  • 8 kudos

Resolved! How to create delta table with identity column.

I'm sure this is probably some oversight on my part, but I don't see it. I'm trying to create a delta table with an identity column. I've tried every combination of the syntax I can think of.
%sql
create or replace table IDS.picklist (
  picklist_id...

Latest Reply
lucas_marchand
New Contributor III
  • 8 kudos

I was also having this same error and my cluster was running Databricks Runtime Version 9.1 so I changed it to 11.0 and it worked.

5 More Replies
harrisriaz
by New Contributor
  • 1831 Views
  • 4 replies
  • 6 kudos

Resolved! what are the key Data engineering problems that databricks solve?

What problems does Databricks address from a typical data engineering perspective, compared with other cloud DE tools?

Latest Reply
Kaniz
Community Manager
  • 6 kudos

Hi @Harris Riaz, I appreciate your attempt to choose the best answer for us. I'm glad you got your query resolved. @rheiman, Thank you for giving an excellent answer.

3 More Replies
spartakos
by New Contributor
  • 474 Views
  • 0 replies
  • 0 kudos

Big data ingest into Delta Lake

I have a feature table in BQ that I want to ingest into Delta Lake. This feature table in BQ has 100TB of data. This table can be partitioned by DATE. What best practices and approaches can I take to ingest this 100TB? In particular, what can I do to ...

Niha1
by New Contributor III
  • 686 Views
  • 3 replies
  • 3 kudos
Latest Reply
Kaniz
Community Manager
  • 3 kudos

Hi @Niharika Modi, All materials covered in the course are available on-demand under the Resource tab within the system:
My Agenda > select training > Check-in > Resources

2 More Replies
Reabouri
by New Contributor
  • 639 Views
  • 2 replies
  • 1 kudos
Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @Robin Sabouri, We haven’t heard from you on the last response from @Ralph David Lagos, and I was checking back to see if his suggestions helped you. Otherwise, if you have any solution, please share it with the community as it can be helpful t...

1 More Replies
acenteno
by New Contributor II
  • 423 Views
  • 1 replies
  • 1 kudos

Can I use Spark to stream data from Kafka?

Can I use Spark to stream data from Kafka?

Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @Anthony Centeno, Please go through this article, which explains your use case:
Spark Streaming + Kafka Integration Guide (Kafka broker version 0.10.0 or higher)
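For a concrete picture, here is a minimal sketch of reading a Kafka topic with the Structured Streaming Kafka source in PySpark. Note that this uses the Structured Streaming API, which is an assumption on my part; the guide named above may cover the older DStream API instead. The helper names, the broker address, and the topic name are placeholders, and the spark argument is expected to be an existing SparkSession with the Kafka connector available.

```python
def kafka_source_options(bootstrap_servers, topic, starting_offsets="latest"):
    """Collect the options for Spark's Kafka source."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        "startingOffsets": starting_offsets,
    }


def read_kafka_stream(spark, options):
    """Return a streaming DataFrame with key and value decoded as strings.

    spark is an existing SparkSession (assumed to be provided by the caller).
    """
    return (
        spark.readStream
        .format("kafka")
        .options(**options)
        .load()
        # Kafka delivers key and value as binary, so cast them for readability.
        .selectExpr("CAST(key AS STRING) AS key", "CAST(value AS STRING) AS value")
    )


# Usage with placeholder broker and topic names:
# opts = kafka_source_options("broker-1:9092", "events")
# stream_df = read_kafka_stream(spark, opts)
```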

Bomberone
by New Contributor II
  • 631 Views
  • 2 replies
  • 2 kudos

Resolved! Autoloader checkpoint issue

Hello guys, is anyone experiencing problems with Autoloader checkpoints on Azure?

Latest Reply
Kaniz
Community Manager
  • 2 kudos

Hi @Francesco Merangolo, We haven’t heard from you on the last response from @Hubert Dudek, and I was checking back to see if his suggestions helped you. Otherwise, if you have any solution, please share it with the community as it can be helpful to...

1 More Replies