Databricks Community

jomt · ‎08-09-2023

I have a set of database files (.db) that I need to read into my Python notebook in Databricks. This was fairly simple until July, when an update to the SQLite JDBC library was introduced.

Up until now I have read the files in question with this (modified) code:

`df = spark.read.format("jdbc").options(url='<url>',
dbtable='<tablename>',
driver="org.sqlite.JDBC").load()`

However, since the update the data being read in is completely wrong. For example, numeric columns that should hold only non-negative numbers suddenly contain negative values that are nowhere near the real values in the files.

Is there a better way to read the .db files with the upgraded SQLite JDBC 3.42.0.0?
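
Before changing the Spark read, it may help to confirm what the file really contains. The following is only a sketch, assuming the .db file is reachable from the driver node as a local path. It uses Python's built-in `sqlite3` module instead of JDBC to fetch the true minimum and maximum of a column, which can then be compared with what Spark returns. The helper name `column_range` is made up for this illustration.

```python
import sqlite3
from contextlib import closing


def column_range(db_path, table, column):
    """Return (min, max) of one column, read directly from the SQLite file."""
    # Table and column names cannot be bound as SQL parameters,
    # so they are quoted as identifiers (embedded quotes doubled).
    quoted_table = '"' + table.replace('"', '""') + '"'
    quoted_column = '"' + column.replace('"', '""') + '"'
    query = f"SELECT MIN({quoted_column}), MAX({quoted_column}) FROM {quoted_table}"
    # closing() makes sure the connection is released after the query.
    with closing(sqlite3.connect(db_path)) as conn:
        return conn.execute(query).fetchone()
```

Calling `column_range('<path>', '<tablename>', '<col1>')` and comparing the result with `df.agg({'<col1>': 'min'})` and `df.agg({'<col1>': 'max'})` should show whether the values are being changed during the JDBC read.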

jomt · ‎08-10-2023

When the numbers in the table are very large (millions and billions) or very small (e.g. 1e-15), SQLite JDBC may struggle to import the correct values. To work around this, use the customSchema option to declare the affected columns as DECIMAL with a high precision (or with a large scale, i.e. many digits after the decimal point, when the numbers are very small).

`df = spark.read.format("jdbc").options(url='<url>',
dbtable='<tablename>',
driver="org.sqlite.JDBC",
customSchema="<col1> DECIMAL(38, 0), <col2> DECIMAL(38, 0), <col3> DECIMAL(38, 0)"
).load()`
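
With many columns, writing the customSchema string by hand gets tedious. Below is a minimal sketch of how it could be generated. The helpers `build_custom_schema` and `read_sqlite_table` are hypothetical names, not part of Spark, and the `spark` argument is assumed to be the session that a Databricks notebook already provides. Note that DECIMAL(38, 0) keeps no digits after the decimal point, so a column holding values such as 1e-15 needs a scale of at least 15, for example DECIMAL(38, 18).

```python
def build_custom_schema(columns, precision=38, scale=0):
    """Build a customSchema string mapping each column to DECIMAL(precision, scale)."""
    # Spark's DECIMAL supports at most 38 digits of precision,
    # and the scale cannot exceed the precision.
    if not 0 <= scale <= precision <= 38:
        raise ValueError("expected 0 <= scale <= precision <= 38")
    return ", ".join(f"{col} DECIMAL({precision}, {scale})" for col in columns)


def read_sqlite_table(spark, url, table, custom_schema):
    """Read one table through the SQLite JDBC driver with an explicit schema."""
    return (
        spark.read.format("jdbc")
        .options(
            url=url,
            dbtable=table,
            driver="org.sqlite.JDBC",
            customSchema=custom_schema,
        )
        .load()
    )
```

For example, `read_sqlite_table(spark, '<url>', '<tablename>', build_custom_schema(['<col1>', '<col2>']))` would be equivalent to the read above for two columns. Columns that need different scales can be built separately and joined with `", "`.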


How do you properly read database-files (.db) with Spark in Python after the JDBC update?
