Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Leladams
by New Contributor III
  • 14299 Views
  • 10 replies
  • 2 kudos

What is the best way to read an MS Access .accdb database into Databricks from a mounted drive?

I am currently trying to read in .accdb files from a mounted drive. Based on my research it looks like I would have to use a package like JayDeBeApi with UCanAccess drivers or pyodbc with MS Access drivers. Will this work? Thanks for any help.
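For reference, a minimal sketch of the JayDeBeApi + UCanAccess route described above; the mount paths, JAR location, and table name are all hypothetical, and UCanAccess's dependency JARs must also be available on the driver:

import jaydebeapi
import pandas as pd

# Open the Access database over JDBC via UCanAccess (paths are hypothetical).
conn = jaydebeapi.connect(
    "net.ucanaccess.jdbc.UcanaccessDriver",
    "jdbc:ucanaccess:///dbfs/mnt/shared/sales.accdb",
    [],
    "/dbfs/mnt/jars/ucanaccess-5.0.1.jar",
)
try:
    cur = conn.cursor()
    cur.execute("SELECT * FROM Orders")  # hypothetical table
    rows = cur.fetchall()
    cols = [d[0] for d in cur.description]
finally:
    conn.close()

# Hand the result to Spark for further processing.
sdf = spark.createDataFrame(pd.DataFrame(rows, columns=cols))
display(sdf)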

Latest Reply
Anonymous
Not applicable

Hi @Leland Adams, hope you are doing well. Thank you for posting your question and giving us additional information. Do you think you were able to solve the query? We'd love to hear from you.

9 More Replies
DBU100725
by New Contributor II
  • 396 Views
  • 2 replies
  • 1 kudos

URGENT: Delta writes to S3 fail after workspace migrated to Premium

Delta writes to S3 fail after workspace migrated to Premium (401 “Credential was not sent or unsupported type”).
Summary: After our Databricks workspace migrated from Standard to Premium, all Delta writes to S3 started failing with: com.databricks.s3commi...

Latest Reply
dkushari
Databricks Employee

Hi @DBU100725 - Are you using a No isolation shared cluster? Can you check if this was turned ON in your account?  
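To help isolate whether the failure follows the cluster type, a minimal Delta write smoke test can be run on each cluster; the bucket path below is hypothetical:

# Attempt a tiny Delta write to S3; run once on the No Isolation Shared
# cluster and once on a single-user cluster to compare behavior.
df = spark.range(10)
(
    df.write.format("delta")
    .mode("overwrite")
    .save("s3://my-bucket/tmp/delta_smoke_test")  # hypothetical path
)
print("Delta write succeeded")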

1 More Replies
Shefali
by New Contributor
  • 537 Views
  • 1 reply
  • 1 kudos

Lakebridge conversion tool: Incorrect Databricks SQL script generated

Hi Team, I was able to successfully install and use the Lakebridge code conversion tool to convert my SQL Server script into a Databricks SQL script. However, the generated script contains several syntax errors. Could you please let me know if I might...

Latest Reply
AbhaySingh
Databricks Employee

Hi there! Known Lakebridge issues are listed here: https://github.com/databrickslabs/lakebridge/issues
Does any of this apply to your use case?
1. Variable scope errors in WHERE clauses or subqueries
2. DELETE/UPDATE FROM statements incorrectly converted ...
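As an illustration of item 2, a SQL Server style DELETE ... FROM ... JOIN typically has to be rewritten by hand; the table and column names here are hypothetical:

# SQL Server form (not valid Databricks SQL):
#   DELETE t FROM target t JOIN stage s ON t.id = s.id
# A hand-written Databricks SQL equivalent:
spark.sql("""
    DELETE FROM target
    WHERE id IN (SELECT id FROM stage)
""")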

Davila
by New Contributor II
  • 1749 Views
  • 1 reply
  • 1 kudos

Resolved! Issue with Root Folder Configuration in Databricks Asset Bundles for DLT Pipelines

I'm currently working with Databricks Asset Bundles to deploy my DLT pipelines, but I’ve encountered an issue I can't resolve. The problem is that I’m unable to configure the root folder within the Asset Bundle in a way that lets me define a custom pa...

Latest Reply
Louis_Frolio
Databricks Employee

Hey @Davila, I did some digging and have come up with some things you can think about as you work through your issue. Here’s a clear way to think about what you’re seeing and how to proceed. What’s going on: that “Root folder” field in the DLT UI is in...

lauraxyz
by Contributor
  • 2185 Views
  • 6 replies
  • 0 kudos

Notebook in path workspace/repos/.internal/**_commits/** was unable to be accessed

I have a workflow job (source is git) to access a notebook and execute it. From the job, it failed with error: Py4JJavaError: An error occurred while calling o466.run. : com.databricks.WorkflowException: com.databricks.NotebookExecutionException: FAI...

Latest Reply
lauraxyz
Contributor

Just some clarification: the caller notebook can be found with no issues, no matter whether the task's source is GIT or WORKSPACE. However, the callee notebook, which is called by the caller notebook with dbutils.notebook.run(), cannot be found if the call...
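A sketch of the calling pattern being described, with hypothetical notebook paths; when a job task runs from Git source, a path relative to the caller resolves inside the ephemeral repo checkout, while an absolute workspace path points at the regular workspace tree:

# Relative path: resolved against the caller's location, so it works for
# both GIT and WORKSPACE task sources (child notebook name is hypothetical).
result = dbutils.notebook.run("./child_notebook", 600)

# Absolute path: points at the workspace tree, not the Git checkout under
# /Workspace/Repos/.internal/..., so it can fail when the job source is Git.
result = dbutils.notebook.run("/Workspace/Users/me@example.com/child_notebook", 600)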

5 More Replies
JordanYaker
by Contributor
  • 2445 Views
  • 2 replies
  • 0 kudos

Integration options for Databricks Jobs and DataDog?

I know that there is already the Databricks (technically Spark) integration for DataDog. Unfortunately, that integration only covers the cluster execution itself and that means only Cluster Metrics and Spark Jobs and Tasks. I'm looking for somethin...

Latest Reply
greg-0935
New Contributor II

Personally, I'm using their Data Jobs Monitoring product, https://docs.datadoghq.com/data_jobs/databricks/, which works great and gives the right insights both for my high-level job execution stats and for deeper Spark metrics.

1 More Replies
Dhruv-22
by Contributor II
  • 365 Views
  • 2 replies
  • 1 kudos

Resolved! Can't mergeSchema handle int and bigint?

I have a table which has a column of data type 'bigint'. While overwriting it with new data, given that I do full loads, I used 'mergeSchema' to handle schema changes. The new data's datatype was int. I thought mergeSchema could easily handle that, but...

Latest Reply
Chiran-Gajula
New Contributor III

Hi Dhruv, Delta won't automatically upcast unless you explicitly handle it. Cast the column Lob_Pk to LongType (which maps to BIGINT in SQL/Delta). Try the snippet below:
from pyspark.sql.functions import col
from pyspark.sql.types import LongType
crm_retail_...
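A completed version of that snippet, assuming a hypothetical source DataFrame and target table name:

from pyspark.sql.functions import col
from pyspark.sql.types import LongType

# Upcast the int column so it matches the table's existing BIGINT column,
# then overwrite. DataFrame and table names are hypothetical.
crm_retail_df = crm_retail_df.withColumn("Lob_Pk", col("Lob_Pk").cast(LongType()))
(
    crm_retail_df.write.format("delta")
    .mode("overwrite")
    .option("mergeSchema", "true")
    .saveAsTable("catalog.schema.crm_retail")
)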

1 More Replies
Marthinus
by New Contributor III
  • 296 Views
  • 4 replies
  • 2 kudos

Resolved! [Databricks Asset Bundles] Bug: driver_node_type_id not updated

Working with Databricks Asset Bundles (using the new Python-based definition), if you have a job_cluster defined using driver_node_type_id, and then update it to no longer have it defined, but only node_type_id, the driver node type never gets update...

Latest Reply
Chiran-Gajula
New Contributor III

There is no built-in way in Databricks Asset Bundles or Terraform to automatically inherit the value of driver_node_type_id from node_type_id: "You must set both explicitly in your configuration." You can always see your updated detail resource from the ...

3 More Replies
Dhruv-22
by Contributor II
  • 1894 Views
  • 2 replies
  • 0 kudos

Resolved! Understanding least common type in databricks

I was reading the data type rules and found out about the least common type. I have a doubt: what is the least common type of STRING and INT? The referred link gives the following example, saying the least common type is BIGINT:
-- The least common type between...

Latest Reply
Dhruv-22
Contributor II

The question is solved here - link

1 More Replies
Dhruv-22
by Contributor II
  • 343 Views
  • 4 replies
  • 4 kudos

Resolved! Least Common Type is different in Serverless and All Purpose Cluster.

The following statement gives different outputs on different compute. In Databricks 15.4 LTS:
%sql
SELECT typeof(coalesce(5, '6'));
-- Output
string
In Serverless, environment version 4:
%sql
SELECT typeof(coalesce(5, '6'));
-- Output
bigint
There are other cas...

Latest Reply
MuthuLakshmi
Databricks Employee

@Dhruv-22 Regarding your 1st question, I'm not sure. You can refer to https://docs.databricks.com/aws/en/sql/language-manual/parameters/ansi_mode#system-default to understand what happens when ANSI mode is disabled.
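The difference is easy to reproduce from a notebook on each compute type; this just reruns the expression from the post:

# Run on both an all-purpose cluster and serverless: the result type of the
# coalesce depends on how ANSI mode / implicit casting is configured.
spark.sql("SELECT typeof(coalesce(5, '6')) AS result_type").show()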

3 More Replies
anusha98
by New Contributor II
  • 257 Views
  • 2 replies
  • 3 kudos

Resolved! Regarding: How to use row_number() in DLT pipelines

We have two streaming tables, customer_info and customer_info_history, and we joined them using a full join to create a temp table in PySpark. Now we want to eliminate the duplicated records from this temp table. Tried using row_number() but facing bel...

Latest Reply
K_Anudeep
Databricks Employee

Hello @anusha98 , You’re hitting a real limitation of Structured Streaming: non-time window functions (like row_number() over (...)) aren’t allowed on streaming DataFrames. You need to use agg().max() to get the latest value per key:
@dlt.table(name="temp_...
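A rough sketch of that aggregation approach; the table, key, and timestamp column names are hypothetical, and a watermark is added since streaming aggregations generally require one:

import dlt
from pyspark.sql import functions as F

@dlt.table(name="temp_dedup")  # hypothetical table name
def temp_dedup():
    # Keep one row per key by aggregating instead of row_number(),
    # which is not supported on streaming DataFrames.
    return (
        spark.readStream.table("temp_joined")  # hypothetical upstream table
        .withWatermark("updated_at", "1 hour")
        .groupBy("customer_id")
        .agg(F.max("updated_at").alias("latest_updated_at"))
    )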

1 More Replies
AmarKap
by New Contributor
  • 153 Views
  • 1 reply
  • 1 kudos

Lakeflow Pipelines: spark.readStream fails to read accented file

Trying to read an accented file (French characters), but the spark.readStream function is not working and special characters turn into something strange (e.g. �):
spark.readStream
    .format("cloudfiles")
    .option("cloudFiles....

Latest Reply
K_Anudeep
Databricks Employee

Hello @AmarKap , When Spark decodes CP1252 bytes as UTF-8/ISO-8859-1, you’ll see the replacement char like �. Can you read the file as:
df = (spark.readStream.format("cloudFiles")
      .option("cloudFiles.format", "text")
      .option("encoding", "windows-1252")...
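A completed version of that suggestion, with hypothetical source and schema-location paths:

# Read CP1252-encoded text files with Auto Loader, decoding as windows-1252
# instead of UTF-8 so French accented characters survive. Paths are hypothetical.
df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "text")
    .option("encoding", "windows-1252")
    .option("cloudFiles.schemaLocation", "/Volumes/main/default/chk/schema")
    .load("/Volumes/main/default/raw/french_files/")
)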

EndreM
by New Contributor III
  • 2252 Views
  • 1 reply
  • 1 kudos

Replay stream to migrate to liquid cluster

The documentation is sparse about how to migrate a partitioned table to liquid clustering, as the ALTER TABLE suggested in the documentation doesn't work when it's a partitioned table. The comments on this forum suggest replaying the stream, and this is wha...

Latest Reply
Louis_Frolio
Databricks Employee

Greetings @EndreM , I did some digging internally and I have come up with some helpful tips/tricks to help guide you through this issue: Based on your situation, you're encountering several common challenges when migrating a partitioned table to liqu...
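One common pattern, since ALTER TABLE ... CLUSTER BY is rejected on partitioned tables, is to rebuild the data into a new liquid-clustered table; the table and clustering column names below are hypothetical:

# Rebuild the partitioned table as a new liquid-clustered table, then
# swap names once validated. Names and clustering column are hypothetical.
spark.sql("""
    CREATE OR REPLACE TABLE catalog.schema.events_lc
    CLUSTER BY (event_date)
    AS SELECT * FROM catalog.schema.events
""")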

soumiknow
by Contributor II
  • 2377 Views
  • 1 reply
  • 1 kudos

Unable to create databricks group and add permission via terraform

I have the following Terraform code to create a Databricks group and add permission to a workflow:
resource "databricks_group" "dbx_group" {
  display_name = "ENV_MONITORING_TEAM"
}
resource "databricks_permissions" "workflow_permission" {
  job_id ...

Labels: Data Engineering, databricks groups, Terraform
Latest Reply
Louis_Frolio
Databricks Employee

Greetings @soumiknow , I did some digging internally and found something that may help with this Terraform authentication issue. ...

smoortema
by Contributor
  • 324 Views
  • 2 replies
  • 2 kudos

Resolved! How to make a FOR loop, dynamic SQL, and variables work together

I am working on a testing notebook where the table that is tested can be given as a widget. I wanted to write it in SQL. The notebook does the following steps in a loop that should run 10 times:
1. Store the starting version of a delta table in a var...

Latest Reply
smoortema
Contributor

Thank you! I realised that the example I gave was bad. However, what I was missing is that I did not know how to set a variable in SQL scripting. Including the SET command within the SQL string does not work; you have to use the EXECUTE IMMEDIATE ......
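A small sketch of that pattern from a notebook, with a hypothetical widget and table name; the session variable is populated via EXECUTE IMMEDIATE ... INTO rather than a SET inside the SQL string:

# Declare a SQL session variable, then fill it from dynamic SQL.
spark.sql("DECLARE OR REPLACE VARIABLE row_cnt BIGINT")
tbl = dbutils.widgets.get("table_name")  # widget from the original post
spark.sql(
    "EXECUTE IMMEDIATE 'SELECT count(*) FROM IDENTIFIER(:t)' "
    f"INTO row_cnt USING '{tbl}' AS t"
)
spark.sql("SELECT row_cnt").show()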

1 More Replies
