Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

rammy
by Contributor III
  • 10353 Views
  • 6 replies
  • 5 kudos

How can I read the job ID, run ID, and parameters in a Python cell?

I have tried the following ways to get job parameters, but none of them work. runId='{{run_id}}' jobId='{{job_id}}' filepath='{{filepath}}' print(runId," ",jobId," ",filepath) r1=dbutils.widgets.get('{{run_id}}') f1=dbutils.widgets.get('{{file...

Latest Reply
Siete
New Contributor
  • 5 kudos

You should use {{job.id}} and {{job.run_id}} instead of the variants with an underscore; this works for me. (A short sketch follows below.)

5 More Replies
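Building on Siete's reply, here is a minimal sketch of reading those values in a Python cell. It assumes the notebook task defines parameters named run_id, job_id, and filepath, with the first two set to the dynamic value references {{job.run_id}} and {{job.id}}; the references belong in the job configuration, not inside dbutils.widgets.get.

```python
# Minimal sketch, assuming the notebook task defines parameters named
# "run_id", "job_id", and "filepath" whose values are {{job.run_id}},
# {{job.id}}, and a literal path respectively.
run_id = dbutils.widgets.get("run_id")      # substituted from {{job.run_id}}
job_id = dbutils.widgets.get("job_id")      # substituted from {{job.id}}
filepath = dbutils.widgets.get("filepath")  # plain string parameter

print(run_id, job_id, filepath)
```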
ar45
by New Contributor II
  • 182 Views
  • 2 replies
  • 0 kudos

Resolved! DELTA_TXN_LOG_FAILED_INTEGRITY

Hi, I am trying to use a MERGE statement for a query and the error comes as shown below. I am able to run DESCRIBE HISTORY on the table but am not able to perform any operations like VACUUM, RESTORE, OPTIMIZE, or even MERGE. Tried dropping the external delta tabl...

Latest Reply
mani_22
Databricks Employee
  • 0 kudos

Hi @ar45 , I am not sure what caused the corruption, but to resolve the issue, you can try removing the transaction log for the corrupt version 4574 (.json file for version 4574 under the _delta_log folder). If there are .crc files for version 4574, ...

1 More Reply
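For anyone following mani_22's suggestion, the sketch below shows one cautious way to locate the files for the corrupt commit before removing anything. The table path is a placeholder for the external table's location, and the removal itself is left commented out; back up the _delta_log folder first.

```python
# Sketch only: locate the transaction-log files for the corrupt version.
delta_log_path = "abfss://container@account.dfs.core.windows.net/tables/my_table/_delta_log"  # placeholder
corrupt_version = 4574

# Delta commit files are named with a 20-digit, zero-padded version number.
prefix = f"{corrupt_version:020d}."
suspect_files = [f for f in dbutils.fs.ls(delta_log_path) if f.name.startswith(prefix)]

for f in suspect_files:
    print(f.path, f.size)

# Only after backing up the _delta_log folder, remove the corrupt commit's files:
# for f in suspect_files:
#     dbutils.fs.rm(f.path)
```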
utkarshamone
by New Contributor II
  • 543 Views
  • 4 replies
  • 0 kudos

Internal errors when running SQL queries

We are running Databricks on GCP with a classic SQL warehouse. It's on the current version (v2025.15). We have a pipeline that runs dbt on top of the SQL warehouse. Since the 9th of May, our queries have been failing intermittently with internal errors f...

(Attachments: Screenshot 2025-05-15 at 4.51.49 pm.png, Screenshot 2025-05-15 at 5.23.57 pm.png, Screenshot 2025-05-15 at 5.24.12 pm.png)
Latest Reply
Isi
Contributor III
  • 0 kudos

Hi @utkarshamone, we faced a similar issue and I wanted to share our findings, which might help clarify what's going on. We're using a Classic SQL Warehouse, size L (v2025.15), and executing a dbt pipeline on top of it. Our dbt jobs started to fail with...

3 More Replies
ncouture
by Contributor
  • 6303 Views
  • 4 replies
  • 1 kudos

Resolved! How to install a JAR library via a global init script?

I have a JAR I want to be installed as a library on all clusters. I have tried both wget /databricks/jars/ some_repo and cp /dbfs/FileStore/jars/name_of_jar.jar /databricks/jars/. Clusters start up, but the JAR is not installed as a library. I am aware th...

Latest Reply
EliCunningham
New Contributor
  • 1 kudos

Ensure your init script installs the JAR correctly on cluster startup (a sketch of one approach is included below).

3 More Replies
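To make that concrete, here is one possible sketch (the script location and JAR path are assumptions): write an init script that copies the JAR from DBFS into /databricks/jars/ so it lands on the cluster classpath at startup, then register the script as a global or cluster-scoped init script. Note that a JAR added this way is on the classpath but will not appear in the cluster's Libraries tab, which matches the behaviour described in the question.

```python
# Sketch only: create an init script that copies a JAR onto the classpath.
# The script location and JAR path are placeholders for illustration.
init_script = """#!/bin/bash
set -e
cp /dbfs/FileStore/jars/name_of_jar.jar /databricks/jars/
"""

dbutils.fs.put(
    "dbfs:/databricks/init-scripts/install-my-jar.sh",  # hypothetical location
    init_script,
    overwrite=True,
)
# Register this path as a global init script (admin settings) or as a
# cluster-scoped init script so it runs on every cluster start.
```

If the JAR must show up as an installed library rather than just sit on the classpath, installing it through the Libraries API or a cluster policy is the alternative route.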
unnamedchunk
by New Contributor
  • 700 Views
  • 1 reply
  • 0 kudos

JVM Heap Leak When Iterating Over Large Number of Tables Using DESCRIBE DETAIL

Problem: I'm trying to generate a consolidated metadata table for all tables within a Databricks database (I do not have admin privileges). The process works fine for the first few thousand tables, but as it progresses, the driver node eventually cras...

(Attachment: spark_ui.png)
Latest Reply
cgrant
Databricks Employee
  • 0 kudos

It's best to iterate over information_schema's TABLES table instead of listing the tables yourself (see the sketch below).

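A short sketch of that approach, assuming Unity Catalog; the catalog and schema names are placeholders. Reading information_schema keeps the metadata scan as a single query instead of thousands of DESCRIBE DETAIL calls accumulating state on the driver.

```python
# Sketch: collect table metadata from information_schema in one query
# instead of looping DESCRIBE DETAIL. Catalog/schema names are placeholders.
tables_df = spark.sql("""
    SELECT table_catalog, table_schema, table_name, table_type,
           created, last_altered
    FROM my_catalog.information_schema.tables
    WHERE table_schema = 'my_schema'
""")

tables_df.write.mode("overwrite").saveAsTable("my_catalog.my_schema.table_metadata")
```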
TamD
by Contributor
  • 328 Views
  • 5 replies
  • 1 kudos

Cannot apply liquid clustering via DLT pipeline

I want to use liquid clustering on a materialised view created via a DLT pipeline; however, there doesn't appear to be a valid way to do this. Via table properties: @dlt.table( name="<table name>", comment="<table description>", table_propert...

Latest Reply
TamD
Contributor
  • 1 kudos

Thanks @aayrm5. I want to use CLUSTER BY AUTO, because the data will get queried and aggregated in several different ways by different business users. I did try your code above anyway, specifying the columns to cluster by. The pipeline ran without er...

4 More Replies
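For reference, newer DLT releases accept a cluster_by argument on @dlt.table; whether it is available depends on the pipeline channel and runtime, so treat the snippet below as a sketch with placeholder names rather than a guaranteed API. CLUSTER BY AUTO, which TamD is after, additionally depends on automatic liquid clustering being supported in that runtime.

```python
import dlt
from pyspark.sql import functions as F

# Sketch only: cluster_by on @dlt.table may require a recent DLT channel.
# Table name, comment, and column names are placeholders.
@dlt.table(
    name="sales_summary",
    comment="Materialised view with liquid clustering",
    cluster_by=["customer_id", "order_date"],
)
def sales_summary():
    return (
        spark.read.table("raw.sales")
        .withColumn("load_ts", F.current_timestamp())
    )
```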
ChandraR
by New Contributor
  • 84 Views
  • 1 reply
  • 0 kudos

Data Engineering Associate - 13+ Years of SAP SD/OTC Experience

Hi Databricks, this is Chandra. I am adapting to the world of data with the help of Databricks. I need your help and advice to successfully adopt the Databricks Engineer profile approach. I have enrolled myself in the Learning platform, I need yo...

Latest Reply
Advika
Databricks Employee
  • 0 kudos

Hello @ChandraR! Happy to help you get started on your Databricks journey! To begin, it's important to get familiar with the Databricks ecosystem, including key components like the Lakehouse architecture, Delta Lake, Apache Spark, and Unity Catalog. ...

Divya_Bhadauria
by New Contributor II
  • 10742 Views
  • 5 replies
  • 2 kudos

Unable to run python script from git repo in Databricks job

I'm getting a "cannot read Python file" error when running this job, which is configured to run a Python script from a Git repo. Run result unavailable: run failed with error message Cannot read the python file /Repos/.internal/7c39d645692_commits/ff669d089cd8f93e9...

Latest Reply
SakthiGanesh
New Contributor II
  • 2 kudos

Hi @Divya_Bhadauria, I'm facing the same internal commit issue on my end. I didn't give any internal path in the Databricks workflow; I set the source to Azure DevOps Services with a branch name. But when I ran the workflow, it gives the below error a...

4 More Replies
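For comparison, this is roughly what a Jobs API payload for a Git-sourced Python script looks like (the repo URL, branch, file path, cluster ID, token, and workspace host are all placeholders, and Azure DevOps is assumed as the provider). At run time Databricks checks the commit out under an internal /Repos/.internal/..._commits/ path, which is the path that shows up in the error message.

```python
import requests  # plain REST call for illustration; the databricks-sdk also works

job_payload = {
    "name": "run-script-from-git",
    "git_source": {
        "git_url": "https://dev.azure.com/my-org/my-project/_git/my-repo",  # placeholder
        "git_provider": "azureDevOpsServices",
        "git_branch": "main",
    },
    "tasks": [
        {
            "task_key": "main",
            "existing_cluster_id": "<cluster-id>",  # placeholder
            "spark_python_task": {
                "python_file": "jobs/etl.py",  # path relative to the repo root
                "source": "GIT",
            },
        }
    ],
}

resp = requests.post(
    "https://<workspace-host>/api/2.1/jobs/create",  # placeholder host
    headers={"Authorization": "Bearer <token>"},     # placeholder token
    json=job_payload,
)
print(resp.status_code, resp.json())
```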
AmanSehgal
by Honored Contributor III
  • 94 Views
  • 1 reply
  • 0 kudos

Column Name Case sensitivity in DLT pipeline

I've a DLT pipeline that processes messages from Event Grid. The schema of the message has two columns that differ only in case: "employee_id" and "employee_ID". I tried setting spark.sql.caseSensitive to true in my DLT notebook as well as in the DLT configurati...

Latest Reply
Renu_
Contributor
  • 0 kudos

Hi @AmanSehgal, DLT treats column names as case-insensitive, even if spark.sql.caseSensitive is set to true. That's why employee_id and employee_ID are seen as duplicates and cause the error. To fix this, you'll need to rename one of the columns so yo...

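One way to do the rename Renu_ describes without ever referencing the ambiguous names (which themselves collide under case-insensitive resolution) is to rename columns positionally; the source table below is a placeholder.

```python
import dlt

@dlt.table(name="employee_events")
def employee_events():
    df = spark.readStream.table("raw.event_grid_messages")  # placeholder source

    # Rename positionally so the colliding names are never looked up by name:
    # the second case-insensitive duplicate gets a numeric suffix, e.g.
    # employee_id stays and employee_ID becomes employee_ID_2.
    seen, new_cols = {}, []
    for c in df.columns:
        key = c.lower()
        seen[key] = seen.get(key, 0) + 1
        new_cols.append(c if seen[key] == 1 else f"{c}_{seen[key]}")
    return df.toDF(*new_cols)
```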
sunday-okey
by New Contributor
  • 72 Views
  • 1 reply
  • 0 kudos

Introduction to Spark Lab

Hello, I got an error while accessing the Introduction to Spark Lab. Please see the error message below and resolve.", line 155, in do response = retryable(self._perform)(method, File "/voc/scripts/python/venv/lib/python3.10/site-packages/databricks/...

Latest Reply
Advika
Databricks Employee
  • 0 kudos

Hello @sunday-okey! Apologies for the inconvenience. The issue has been resolved. Please try restarting the lab; it should be working as expected now.

Einsatz
by New Contributor II
  • 108 Views
  • 0 replies
  • 0 kudos

Dataframe getting updated after creating temporary view

I'm observing different behavior between Databricks Runtime versions when working with DataFrames and temporary views, and would appreciate any clarification. In both environments, I performed the following steps in a notebook (each connected to its o...

carlos_tasayco
by New Contributor III
  • 317 Views
  • 4 replies
  • 0 kudos

path-based access to a table with row filters or column masks is not supported

I have a Delta table to which I am applying masking on some columns; however, every time I want to refresh the table (overwrite), I cannot and I receive this error. If I do what the Assistant recommends (removing the .option("path", DeltaZones)), it worked b...

(Attachments: carlos_tasayco_0-1745443128070.png, carlos_tasayco_1-1745443214501.png)
Latest Reply
BigRoux
Databricks Employee
  • 0 kudos

Are you using Unity Catalog?

3 More Replies
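For context, the Assistant's suggestion amounts to writing by table name instead of by path, which is what Unity Catalog expects once row filters or column masks are attached; a sketch, with placeholder three-level names:

```python
# Sketch: overwrite a masked/row-filtered table through its catalog name.
# Path-based writes are rejected once row filters or column masks are applied.
df = spark.read.table("main.reporting.delta_zones_staging")  # placeholder source

(
    df.write.format("delta")
      .mode("overwrite")
      # no .option("path", ...): address the table by name instead
      .saveAsTable("main.reporting.delta_zones")  # placeholder target table
)
```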
cool_cool_cool
by New Contributor II
  • 1144 Views
  • 2 replies
  • 0 kudos

Databricks Workflow is stuck on the first task and doesnt do anyworkload

Heya, I have a workflow in Databricks with 2 tasks. They are configured to run on the same job cluster, and the second task depends on the first. I have a weird behavior that happened twice now: the job takes a long time (it usually finishes within 30...

Latest Reply
Sri_M
New Contributor
  • 0 kudos

@cool_cool_cool I am facing the same issue as well. Is this issue resolved for you? If yes, can you please let me know what action you have taken?

1 More Reply
Upendra_Dwivedi
by New Contributor III
  • 287 Views
  • 3 replies
  • 0 kudos

How to enable Databricks Apps User Authorization?

Hi All, I am working on implementing user authorization in my Databricks app, but to enable user auth it is asking: "A workspace admin must enable this feature to be able to request additional scopes. The user's API downscoped access token is incl...

Latest Reply
SP_6721
New Contributor III
  • 0 kudos

Hi @Upendra_Dwivedi, to enable this feature, you'll need to go to Apps in your workspace and turn on the On-Behalf-Of User Authorization option. After that, when you're creating or editing your app, make sure to select the necessary user API scopes, t...

2 More Replies
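Once the admin enables it and the app declares its scopes, Databricks Apps forwards the signed-in user's downscoped token to the app in the x-forwarded-access-token request header. A minimal sketch follows; Flask, the route, and the environment variable lookup are assumptions for illustration.

```python
import os
from flask import Flask, request
from databricks.sdk import WorkspaceClient

app = Flask(__name__)

@app.route("/whoami")
def whoami():
    # Databricks Apps forwards the user's downscoped token in this header.
    user_token = request.headers.get("x-forwarded-access-token")
    if not user_token:
        return "On-Behalf-Of user authorization is not enabled for this app", 403

    # Call workspace APIs as the user, limited to the scopes the app requested.
    # DATABRICKS_HOST is assumed to be set in the app's runtime environment.
    w = WorkspaceClient(host=os.environ["DATABRICKS_HOST"], token=user_token, auth_type="pat")
    return f"Acting on behalf of: {w.current_user.me().user_name}"
```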
