Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

pokus
by New Contributor III
  • 9217 Views
  • 3 replies
  • 2 kudos

Resolved! Use DeltaLog class in Databricks cluster

I need to use the DeltaLog class in the code to get the AddFiles dataset. I have to keep the implemented code in a repo and run it in a Databricks cluster. Some docs say to use the org.apache.spark.sql.delta.DeltaLog class, but it seems Databricks gets rid of ...

Latest Reply
NandiniN
Databricks Employee
  • 2 kudos

Hi @pokus, you don't need to access it via reflection. You can access DeltaLog with spark._jvm: Unity Catalog and Delta Lake tables expose their metadata and transaction log via the JVM backend. Using spark._jvm, you can interact with DeltaLog. Thanks!
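
A minimal sketch of that spark._jvm approach, assuming the Databricks class path given later on this board (com.databricks.sql.transaction.tahoe.DeltaLog) and a hypothetical table path; verify the exact API on your runtime:

    # Minimal sketch: DBR class path and table location are assumptions, not confirmed API.
    delta_log = spark._jvm.com.databricks.sql.transaction.tahoe.DeltaLog.forTable(
        spark._jsparkSession,        # hand the JVM-side SparkSession to the Scala API
        "/path/to/delta/table",      # hypothetical table location
    )
    snapshot = delta_log.update()    # refresh and return the latest Snapshot
    add_files = snapshot.allFiles()  # JVM-side Dataset[AddFile]
    print(add_files.count())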

2 More Replies
Nasd_
by New Contributor II
  • 688 Views
  • 3 replies
  • 0 kudos

Resolved! Accessing DeltaLog and OptimisticTransaction from PySpark

Hi community, I'm exploring ways to perform low-level, programmatic operations on Delta tables directly from a PySpark environment. The standard delta.tables.DeltaTable Python API is excellent for high-level DML, but it seems to abstract away the core ...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

To access the Databricks pre-installed package, use spark._jvm.com.databricks.sql.transaction.tahoe.DeltaLog; org.apache.spark.sql.delta.DeltaLog is the OSS jar's class name.
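
A quick probe sketch to confirm which of the two class names (both taken from this thread) your cluster actually ships:

    # Ask the JVM class loader for each DeltaLog class name mentioned in this thread.
    for cls in ("com.databricks.sql.transaction.tahoe.DeltaLog",   # DBR pre-installed
                "org.apache.spark.sql.delta.DeltaLog"):            # OSS Delta jar
        try:
            spark._jvm.java.lang.Class.forName(cls)
            print("available:", cls)
        except Exception:
            print("missing:", cls)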

2 More Replies
Nasd_
by New Contributor II
  • 1772 Views
  • 1 reply
  • 0 kudos

Unable to load org.apache.spark.sql.delta classes from the JVM in PySpark

Hello, I'm working on Databricks with a cluster running Runtime 16.4, which includes Spark 3.5.2 and Scala 2.12. For a specific need, I want to implement my own custom way of writing to Delta tables by manually managing Delta transactions from PySpark....

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

Hi @Nasd_, I believe you are trying to use OSS jars on DBR (inferred from the class package org.apache.spark.sql.delta.DeltaLog). The error ModuleNotFoundError: No module named 'delta.exceptions.captured'; 'delta.exceptions' is not a package can be...
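
A small check in the spirit of that diagnosis, assuming the failure comes from a pip-installed OSS delta-spark wheel shadowing the runtime's built-in delta module:

    # Sketch: detect an OSS delta-spark wheel that may shadow DBR's built-in module.
    import importlib.metadata as md

    try:
        print("pip delta-spark version:", md.version("delta-spark"))  # present -> likely conflict
    except md.PackageNotFoundError:
        print("no OSS delta-spark wheel installed")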

LeoRickli
by New Contributor II
  • 727 Views
  • 2 replies
  • 0 kudos

Databricks Asset Bundles deploy fails but works in the GUI with the same parameters

I'm running into an issue when running databricks bundle deploy with job clusters. When I run databricks bundle deploy on a new workspace or after destroying previous resources, the deployment fails with the error: Error: cannot update job: At l...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

Hello @LeoRickli, are you setting apply_policy_default_values? https://docs.databricks.com/en/administration-guide/clusters/policies.html#:~:text=Default%20values%20don't%20automatically,not%20needed%20for%20fixed%20policies. After you update a polic...
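
A hedged databricks.yml fragment showing where apply_policy_default_values sits in a job cluster spec (job name and policy reference are placeholders):

    resources:
      jobs:
        my_job:                                   # placeholder job name
          job_clusters:
            - job_cluster_key: main
              new_cluster:
                policy_id: ${var.policy_id}       # placeholder policy reference
                apply_policy_default_values: true # fields the policy fixes or defaults can then be omitted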

1 More Replies
shanisolomonron
by New Contributor II
  • 155 Views
  • 4 replies
  • 1 kudos

Table ID not preserved using CREATE OR REPLACE TABLE

The "When to replace a table" documentation states that using CREATE OR REPLACE TABLE should preserve the table's identity: "Table contents are replaced, but the table identity is maintained." However, in my recent test the table ID changed after running t...

Latest Reply
shanisolomonron
New Contributor II
  • 1 kudos

Waiting for a Databricks employee to clarify this item.
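
In the meantime, a repro sketch for checking the id yourself; catalog, schema, and table names are hypothetical, and DESCRIBE DETAIL is used because it reports the Delta table id:

    # Sketch: compare the table id before and after CREATE OR REPLACE.
    before = spark.sql("DESCRIBE DETAIL main.default.t").select("id").first()[0]
    spark.sql("CREATE OR REPLACE TABLE main.default.t AS SELECT 1 AS x")
    after = spark.sql("DESCRIBE DETAIL main.default.t").select("id").first()[0]
    print(before, after, "preserved" if before == after else "changed")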

3 More Replies
benesq
by New Contributor
  • 1464 Views
  • 1 reply
  • 0 kudos

JDBC driver uses Unsafe API, which will be completely deprecated in a future release of Java

Using JDBC driver (2.7.3) in OpenJDK 24 gives the following warning: WARNING: A terminally deprecated method in sun.misc.Unsafe has been called WARNING: sun.misc.Unsafe::arrayBaseOffset has been called by com.databricks.client.jdbc42.internal.apache.a...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

Hey @benesq, JDBC driver 2.7.4 (https://www.databricks.com/spark/jdbc-drivers-download) should be used with Java Runtime Environment (JRE) 8.0, 11.0, or 21.0. As mentioned in the installation doc, "Each machine where you use the Databricks JDBC Dri...

ClintHall
by New Contributor
  • 35 Views
  • 1 reply
  • 0 kudos

Error filtering by datetime on a Lakehouse Federated SQL Server table

In Unity Catalog, I have a connection to a SQL Server database. When I try to filter by a datetime column using a datetime with fractional seconds, Databricks gives me this error: Job aborted due to stage failure: com.microsoft.sqlserver.jdbc.SQLServe...

Latest Reply
Isi
Honored Contributor II
  • 0 kudos

Hello @ClintHall, I believe you're running into a mismatch between how Spark/Databricks generates the literal and what SQL Server's datetime can store. In SQL Server, the datetime type only supports milliseconds (3 decimal places; see the docs). When you pass a P...
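
A sketch of the workaround that diagnosis implies, truncating the filter literal to millisecond precision; the federated table and column names are hypothetical:

    # Sketch: SQL Server's datetime keeps at most 3 fractional digits, so build the
    # boundary literal with milliseconds only.
    from pyspark.sql import functions as F

    df = spark.table("sqlserver_cat.dbo.events")               # hypothetical federated table
    cutoff = F.to_timestamp(F.lit("2024-05-01 12:34:56.123"))  # .123, not .1234567
    df.filter(F.col("event_ts") >= cutoff).show()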

Hari_P
by New Contributor II
  • 28 Views
  • 1 reply
  • 1 kudos

Sharing Databricks Notebook Functionality Without Revealing Source Code

Hi all, I have a unique scenario in Databricks and would appreciate your insights. I've developed functionality in Databricks notebooks, and I'd like to share this with other developers within the same workspace. My goal is to allow colleagues to impor...

Latest Reply
Isi
Honored Contributor II
  • 1 kudos

Hey @Hari_P, I believe this doesn't exist today as a built-in feature. I reviewed the Databricks notebook permission model (docs link) and with the minimum level ("CAN READ"), users already have access to view the notebook's source. The simplest and m...

AlbertWang
by Valued Contributor
  • 3805 Views
  • 7 replies
  • 3 kudos

Resolved! Azure Databricks Unity Catalog - cannot access managed volume in notebook

We have set up Azure Databricks with Unity Catalog (Metastore). Used Managed Identity (Databricks Access Connector) for the connection from workspace(s) to ADLS Gen2. The ADLS Gen2 storage account has Storage Blob Data Contributor and Storage Queue Data Contrib...

Latest Reply
fifata
New Contributor II
  • 3 kudos

@AlbertWang @VAMSaha22 Since you want private connectivity, I assume you have a VNet and a private endpoint associated with the Gen2 account. That PE needs to have a sub-resource of type dfs when the storage account is Gen2/hierarchical namespace. You might want to...

6 More Replies
Mildred
by New Contributor
  • 1786 Views
  • 1 reply
  • 0 kudos

Parameter "expand_tasks" on List job runs request seams not to be working (databricsk api)

I'm setting it as True, but it doesn't return the cluster_instance info. Here is the function I'm using: def get_job_runs(job_id): """ Fetches job runs for a specific job from Databricks Jobs API. """ headers = { "Authorization...

Latest Reply
Krishna_S
Databricks Employee
  • 0 kudos

Hi @Mildred, the way you passed the data for the expand_tasks parameter is wrong: data = { "job_id": job_id, "expand_tasks": "true" }. It should not be passed as a Python boolean value, but as the string "true" or "false". Once you do that, it will...
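
A corrected request sketch based on that fix; the host, token, and job id are placeholders:

    # Sketch: pass expand_tasks as the string "true", not the Python boolean True.
    import requests

    job_id = 123  # placeholder
    resp = requests.get(
        "https://<workspace-host>/api/2.1/jobs/runs/list",  # placeholder host
        headers={"Authorization": "Bearer <token>"},        # placeholder token
        params={"job_id": job_id, "expand_tasks": "true"},  # string, per the reply
    )
    for run in resp.json().get("runs", []):
        for task in run.get("tasks", []):
            print(task.get("cluster_instance"))             # populated once tasks are expanded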

jorperort
by Contributor
  • 1828 Views
  • 3 replies
  • 0 kudos

Executing Bash Scripts or Binaries Directly in Databricks Jobs on Single Node Cluster

Hi, is it possible to directly execute a Bash script or a binary executable from the operating system of a Databricks job compute node using a single node cluster? I'm using Databricks Asset Bundles for job initialization and execution. When the job s...

Latest Reply
Louis_Frolio
Databricks Employee
  • 0 kudos

Hello @jorperort, I did some research internally and have some tips/suggestions for you to consider: based on the research and available documentation, it is not possible to directly execute a Bash script or binary executable from the operating sy...
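
One common workaround sketch (not necessarily what the rest of that answer recommends): wrap the script in a Python task and shell out; the script path is hypothetical:

    # Sketch: invoke a shell script deployed with the bundle from a Python task.
    import subprocess

    result = subprocess.run(
        ["bash", "scripts/run_me.sh"],  # hypothetical path inside the bundle root
        capture_output=True, text=True, check=True,
    )
    print(result.stdout)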

2 More Replies
Pratikmsbsvm
by Contributor
  • 1556 Views
  • 1 reply
  • 0 kudos

How to Read and Write Data Between 2 Separate Instances of Databricks

How to read and write data between 2 separate instances of Databricks. I want to have bi-directional data read and write between Databricks A and Databricks B. Both are not in the same instance. Please help.

Latest Reply
Louis_Frolio
Databricks Employee
  • 0 kudos

Hello @Pratikmsbsvm, I want to better understand what you mean by "instance". Do you mean two separate workspaces within the same ADB account, or do you mean two different ADB accounts? Please clarify so I can provide guidance. Regards, Louis.

data-grassroots
by New Contributor III
  • 87 Views
  • 3 replies
  • 0 kudos

ExcelWriter and local files

I have a couple of things going on here. First, to explain what I'm doing: I'm passing an array of objects into a function, each containing a dataframe per item. I want to write those dataframes to an Excel workbook - one dataframe per worksheet. That part ...

Latest Reply
Louis_Frolio
Databricks Employee
  • 0 kudos

Hey @data-grassroots, I did some digging with our internal docs and have some suggestions/tips to help you further diagnose the issue: you're following the recommended Databricks approach for editing Excel files: copying the template to a local pa...
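
A sketch of that local-path pattern, one worksheet per dataframe; the volume path and sample frames are hypothetical, and pandas with openpyxl is assumed:

    # Sketch: write the workbook to local disk first, then copy it to a UC volume.
    import shutil
    import pandas as pd

    frames = {"sales": pd.DataFrame({"x": [1]})}  # placeholder: sheet name -> dataframe
    local_path = "/tmp/report.xlsx"
    with pd.ExcelWriter(local_path, engine="openpyxl") as writer:
        for sheet, df in frames.items():
            df.to_excel(writer, sheet_name=sheet, index=False)
    shutil.copy(local_path, "/Volumes/main/default/reports/report.xlsx")  # hypothetical volume path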

2 More Replies
ralphchan
by New Contributor II
  • 3315 Views
  • 4 replies
  • 0 kudos

Connect Oracle Fusion (ERP / HCM) to Databricks

Any suggestions for connecting Oracle Fusion (ERP/HCM) to Databricks? I have explored a few options, including the use of Oracle Integration Cloud, but it requires a lot of customization.

Latest Reply
nayan_wylde
Honored Contributor III
  • 0 kudos

I used the Fivetran Oracle Fusion connector in the past. It is a fully managed ELT connector that extracts data from Oracle Fusion and loads it into Databricks.

3 More Replies
cpollock
by New Contributor III
  • 70 Views
  • 2 replies
  • 0 kudos

Resolved! Getting NO_TABLES_IN_PIPELINE error in Lakeflow Declarative Pipelines

Yesterday (10/1), starting around 12 PM EST, we started getting the following error in our Lakeflow Declarative Pipelines (LDP) process. We get this in environments where none of our code has changed. I found some info on the serverless compute abou...

Latest Reply
saurabh18cs
Honored Contributor II
  • 0 kudos

Hi @cpollock, check the "Event log" and "Pipeline logs" in the Databricks UI for any clues. Also, can you please share a screenshot pasted into the window? Attachments aren't really working, only scanning.
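
A sketch for pulling errors out of the pipeline event log; the pipeline id is a placeholder:

    # Sketch: surface recent ERROR events via the event_log table-valued function.
    spark.sql("""
        SELECT timestamp, event_type, message
        FROM event_log('<pipeline-id>')   -- placeholder pipeline id
        WHERE level = 'ERROR'
        ORDER BY timestamp DESC
    """).show(truncate=False)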

1 More Replies
