Hello,
I'm working on Databricks with a cluster running Runtime 16.4, which includes Spark 3.5.2 and Scala 2.12.
For a specific need, I want to implement my own custom way of writing to Delta tables by manually managing Delta transactions from PySpark. To do this, I want to access the Delta Lake transactional engine via the JVM embedded in the Spark session, specifically by using the class:
org.apache.spark.sql.delta.DeltaLog
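For context, this is the kind of lookup I am attempting from a notebook, using the py4j gateway exposed on the Spark session (nothing here is Databricks-specific):

```python
# Reaching for the Delta transaction log class through the JVM gateway
DeltaLog = spark._jvm.org.apache.spark.sql.delta.DeltaLog
```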
Issue
When I try to use classes from the package org.apache.spark.sql.delta directly from PySpark (through spark._jvm), the classes are not found if I don't have the Delta Core package installed explicitly on the cluster.
When I install the Delta Core Python package to gain access, I encounter the following Python import error:
ModuleNotFoundError: No module named 'delta.exceptions.captured'; 'delta.exceptions' is not a package
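For reference, this is roughly how I reproduce it; I am assuming here that the "Delta Core" Python package means delta-spark from PyPI, and the exact import that surfaces the error may differ:

```python
# Installed at notebook scope, roughly:
#   %pip install delta-spark

# Importing the Python bindings then fails with the error above, e.g.:
from delta.tables import DeltaTable
```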
Without the Delta Core package installed, accessing DeltaLog simply returns a generic JavaPackage object that is unusable.
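Concretely, checking the type of the handle confirms that py4j never binds the class and just falls back to a package placeholder:

```python
# The class name cannot be resolved on the JVM side, so py4j returns a
# generic JavaPackage rather than a callable JavaClass.
print(type(spark._jvm.org.apache.spark.sql.delta.DeltaLog))
# <class 'py4j.java_gateway.JavaPackage'>
```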
What I want to do
Access the Delta transaction log API (DeltaLog) from PySpark via the JVM.
Be able to start transactions and commit manually to implement custom write behavior (see the sketch after this list).
Work within the Databricks Runtime 16.4 environment without conflicts or missing dependencies.
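To make the intent concrete, here is a rough sketch of the flow I am after, written against the OSS Delta API as I understand it. The table path is a placeholder, and I realize the exact signatures and the Scala/Python conversions (e.g. building a Seq of AddFile actions) may not work as written through py4j:

```python
# Desired custom commit flow via the JVM (sketch, not working code)
jvm = spark._jvm
delta_log = jvm.org.apache.spark.sql.delta.DeltaLog.forTable(
    spark._jsparkSession, "/mnt/data/my_table"  # placeholder table path
)

txn = delta_log.startTransaction()  # OptimisticTransaction on the current snapshot
# ... write the data files myself, describe them as AddFile actions ...
# txn.commit(<Seq[Action] of AddFile entries>, <DeltaOperations.Write(...)>)
```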
Questions
How can I correctly access and use org.apache.spark.sql.delta.DeltaLog from PySpark on Databricks Runtime 16.4?
Is there a supported way to manually manage Delta transactions through the JVM in this environment?
What is the correct setup or package dependency to avoid the ModuleNotFoundError when installing the Delta Core Python package?
Are there any alternatives or recommended patterns to achieve manual Delta commits programmatically on Databricks?