Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

LearningDatabri
by Contributor II
  • 9361 Views
  • 7 replies
  • 2 kudos

Resolved! Unable to read file from S3

I tried to read a file from S3, but am facing the error below:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 53.0 failed 4 times, most recent failure: Lost task 0.3 in stage 53.0 (TID 82, xx.xx.xx.xx, executor 0): com...

Latest Reply
Sivaprasad1
Databricks Employee
  • 2 kudos

Which DBR version are you using? Could you please test it with a different DBR version, probably DBR 9.x?

6 More Replies
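A common cause of stage failures like the one above is missing S3 credentials or permissions on the cluster. As a hedged sketch (the bucket, path, and key values are hypothetical placeholders, and an instance profile is generally preferred over embedding keys), the Spark conf entries for S3A access look like:

```
spark.hadoop.fs.s3a.access.key <access-key>
spark.hadoop.fs.s3a.secret.key <secret-key>
```

With credentials in place, a read such as `spark.read.csv("s3a://my-bucket/path/file.csv")` should succeed; if the truncated stack trace above ends in an S3 403/AccessDenied, permissions are the first thing to check.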
THIAM_HUATTAN
by Valued Contributor
  • 3532 Views
  • 4 replies
  • 6 kudos

Resolved! Why is this not able to go through?

https://textdoc.co/index.php/UFEQdwxWn60LtOVf
Error: https://textdoc.co/index.php/3JisnHKGkvLIaAOF

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 6 kudos

It would be best if you used Databricks ML runtime (in cluster settings), not the standard one.

3 More Replies
THIAM_HUATTAN
by Valued Contributor
  • 2620 Views
  • 2 replies
  • 0 kudos

Resolved! Save data from Spark DataFrames to TFRecords

https://docs.microsoft.com/en-us/azure/databricks/_static/notebooks/deep-learning/tfrecords-save-load.html
I could not run Cell #2:
java.lang.ClassNotFoundException: --------------------------------------------------------------------------- Py4JJ...

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Hi @THIAM HUAT TAN, which DBR version are you using? Are you using the ML runtime?

1 More Replies
User16826994223
by Databricks Employee
  • 2999 Views
  • 1 reply
  • 1 kudos

Unity Catalog will allow you to bring your own HMS

Anyone know more about how Unity Catalog will allow you to bring your own HMS (e.g. Glue)? Will this be treated as a separate 'catalog', which you can access but on which you can't use the other features of Unity Catalog, e.g. ABAC etc.? Any reading on this top...

Latest Reply
zpappa
Databricks Employee
  • 1 kudos

@Kunal Gaurav yes, it is treated as a synthetic catalog. You can query it by using the convention "hive_metastore" as the catalog name, i.e. SELECT * FROM hive_metastore.schema_name.table_name. This will work for internal HMS, external HMS, and Glue. Yo...

PP1
by New Contributor II
  • 3781 Views
  • 2 replies
  • 2 kudos
Latest Reply
zpappa
Databricks Employee
  • 2 kudos

@Prashanth P We offer a fully featured REST API with Unity Catalog that provides the ability to CRUD objects such as catalogs/schemas/tables/ACLs/lineage etc. Companies like Collibra/Alation/MS Purview etc. use these in middleware integrations to integ...

1 More Replies
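As a concrete illustration of the REST API mentioned in the reply above, here is a hedged sketch of listing catalogs with only the Python standard library. The workspace host and token are hypothetical placeholders, and the `/api/2.1/unity-catalog/catalogs` endpoint is an assumption based on public docs, not taken from this thread; the request is built but not sent.

```python
import urllib.request

# Hypothetical workspace host and personal access token (placeholders).
HOST = "https://my-workspace.cloud.databricks.com"
TOKEN = "dapi-example-token"

def list_catalogs_request(host: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the Unity Catalog catalogs list."""
    url = f"{host}/api/2.1/unity-catalog/catalogs"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})

req = list_catalogs_request(HOST, TOKEN)
print(req.full_url)
# Sending it (urllib.request.urlopen(req)) against a real workspace would
# return a JSON body containing a "catalogs" array.
```

The same pattern applies to schemas, tables, and lineage endpoints; middleware integrations like the ones named in the reply are essentially clients of these routes.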
OliverLewis
by New Contributor
  • 3183 Views
  • 2 replies
  • 1 kudos

Parallelize spark jobs on the same cluster?

What's the best way to parallelize multiple Spark jobs on the same cluster during a backfill?

Latest Reply
ron_defreitas
Contributor
  • 1 kudos

In the past I used direct multi-threaded orchestration inside of driver applications, but that was prior to Databricks supporting multi-task jobs. If you create a job through the workflows tab, you can set up multiple notebook, Python, or JAR tasks t...

1 More Replies
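For ad-hoc backfills, the multi-threaded pattern mentioned in the reply is usually a thread pool around dbutils.notebook.run. A hedged Python sketch follows, where run_partition is a pure stand-in for a call like dbutils.notebook.run("/Repos/etl/backfill", 3600, {"date": d}) (that notebook path and parameter are hypothetical) so the sketch runs anywhere:

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for dbutils.notebook.run(...); replaced with a pure function here
# so the pattern can be shown outside a Databricks driver.
def run_partition(date: str) -> str:
    return f"done:{date}"

dates = ["2022-01-01", "2022-01-02", "2022-01-03"]

# Spark jobs submitted from separate driver threads run concurrently on the
# same cluster, subject to the scheduler and available executor slots.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(run_partition, dates))

print(results)  # one result per backfilled date, in input order
```

pool.map preserves input order, so each result lines up with its date; sizing max_workers to what the cluster can actually run in parallel avoids queueing everything behind the scheduler.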
mortenhaga
by Contributor
  • 5229 Views
  • 4 replies
  • 4 kudos

Resolved! SQL Serverless Endpoint failing to start with Instance Profile

Hi all, super stoked about the PP of SQL Serverless, but it does seem that the instance profile I'm using doesn't have the required trust relationship for it to work with the Serverless endpoint, although in "classic" mode it works fine. Does Serverless re...

Latest Reply
Anonymous
Not applicable
  • 4 kudos

Thank you for sharing your valuable solution; it works properly.

3 More Replies
Prabakar
by Databricks Employee
  • 9494 Views
  • 2 replies
  • 7 kudos

Resolved! Library installation fails with mirror sync issue

While trying to install the ffmpeg package using an init script on a Databricks cluster, it fails with the error below.
Init script:
#!/bin/bash
set -e
sudo apt-get update
sudo apt-get -y install ffmpeg
Error message: E: Failed to fetch http://security.ubuntu...

Latest Reply
Prabakar
Databricks Employee
  • 7 kudos

Cause: the VMs are pointing to a cached old mirror that is not up to date, so the package download fails. Workaround: use the below init script to install the "ffmpeg" package. To revert to the original lis...

1 More Replies
Sunny
by New Contributor III
  • 9565 Views
  • 7 replies
  • 4 kudos

Resolved! Retrieve job id and run id from scala

I need to retrieve the job id and run id of the job from a JAR file in Scala. When I try to compile the code below in IntelliJ, the error below is shown.
import com.databricks.dbutils_v1.DBUtilsHolder.dbutils
object MainSNL {
  @throws(classOf[Exception]) de...

Latest Reply
Mohit_m
Databricks Employee
  • 4 kudos

Maybe it's worth going through the Task parameter variables section of the doc below:
https://docs.databricks.com/data-engineering/jobs/jobs.html#task-parameter-variables

6 More Replies
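The task parameter variables doc linked above lets the job pass its own ids into the task at run time, avoiding dbutils entirely in the JAR. A hedged sketch of the relevant fragment of a jobs API payload follows: `{{job_id}}` and `{{run_id}}` are the documented substitution variables, while the main class name is taken from the thread's own code (MainSNL); the values arrive as ordinary program arguments.

```json
{
  "spark_jar_task": {
    "main_class_name": "MainSNL",
    "parameters": ["{{job_id}}", "{{run_id}}"]
  }
}
```

At run time the variables are replaced with the actual ids, so the Scala main method can read them from its args array without importing DBUtils.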
Mohit_m
by Databricks Employee
  • 5922 Views
  • 1 reply
  • 2 kudos

Resolved! Databricks jobs create API throws unexpected error

The Databricks jobs create API throws an unexpected error. Error response:
{"error_code": "INVALID_PARAMETER_VALUE", "message": "Cluster validation error: Missing required field: settings.cluster_spec.new_cluster.size"}
Any idea on this?

Latest Reply
Mohit_m
Databricks Employee
  • 2 kudos

Could you please specify num_workers in the JSON body and try the API again? Also, another recommendation: configure what you want in the UI, then press the "JSON" button, which should show the corresponding JSON that you can use for the API.

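Following the reply's suggestion, the missing field is the cluster size. A hedged sketch of a minimal new_cluster block for the jobs create body follows; num_workers is the field the error is asking for, while the Spark version and node type are placeholder values:

```json
{
  "new_cluster": {
    "spark_version": "9.1.x-scala2.12",
    "node_type_id": "i3.xlarge",
    "num_workers": 2
  }
}
```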
lav
by New Contributor III
  • 1716 Views
  • 1 reply
  • 1 kudos

Correlated Column Exception in Spark SQL

Hi Johan, were you able to resolve the correlated column exception issue? I have been stuck on this for the past week. If you can guide me, that would be a lot of help. Thanks.

Latest Reply
Johan_Van_Noten
New Contributor III
  • 1 kudos

Seems to be a duplicate of your comment on https://community.databricks.com/s/question/0D53f00001XCuCACA1/correlated-column-exception-in-sql-udf-when-using-udf-parameters. I guess you did that to be able to put other tags?

darshan
by New Contributor III
  • 26307 Views
  • 13 replies
  • 12 kudos

Resolved! Is there a way to run notebooks concurrently in same session?

Tried using dbutils.notebook.run(notebook.path, notebook.timeout, notebook.parameters), but it takes 20 seconds to start a new session. %run uses the same session, but I cannot figure out how to use it to run notebooks concurrently.

Latest Reply
rudesingh56
New Contributor II
  • 12 kudos

I’ve been struggling with opening multiple browser sessions to open more than one notebook at a time.

12 More Replies
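An alternative to sharing one session is a multi-task job: tasks with no depends_on between them start in parallel, and each gets its own context without the per-run session startup in the question. A hedged sketch of the tasks fragment of a job definition follows; the task keys, notebook paths, and job cluster key are hypothetical placeholders:

```json
{
  "tasks": [
    {
      "task_key": "load_a",
      "notebook_task": {"notebook_path": "/Repos/etl/load_a"},
      "job_cluster_key": "shared"
    },
    {
      "task_key": "load_b",
      "notebook_task": {"notebook_path": "/Repos/etl/load_b"},
      "job_cluster_key": "shared"
    }
  ]
}
```

Because neither task declares depends_on, both run concurrently on the shared job cluster.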
