Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

j_unspeakable
by New Contributor III
  • 1228 Views
  • 3 replies
  • 3 kudos

Resolved! Permission Denied when Creating External Tables Using Workspace Default Credential

I’m building out schemas, volumes, and external Delta tables in Unity Catalog via Terraform. The schemas and volumes are created successfully, but all external tables are failing. The error message from Terraform doesn't highlight what the issue is bu...

Latest Reply
artopihlaja
New Contributor
  • 3 kudos

Feature or bug, I ran into the same thing: I couldn't create tables with the default credential. To test, I assigned the default credential and a custom credential the same access rights to the storage container that is the target of the external locatio...

2 More Replies
MarcoRezende
by New Contributor III
  • 121 Views
  • 1 reply
  • 1 kudos

Resolved! AttributeError: module 'numpy' has no attribute 'typing'

We started experiencing failures in several Databricks jobs without any changes on our side. The error occurs during Python job execution and seems related to package dependencies. The job error: Run failed with error message Cannot read the python fil...

Latest Reply
MarcoRezende
New Contributor III
  • 1 kudos

The problem was the numexpr library version (2.14.0); I needed to pin it to 2.13.1.
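
For reference, a minimal sketch of that pin in a notebook cell, assuming a notebook-scoped install fits the job setup:

# Pin numexpr below the version that triggered the numpy.typing error.
%pip install numexpr==2.13.1
# Restart Python so the current session picks up the pinned version.
dbutils.library.restartPython()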

sta_gas
by New Contributor
  • 153 Views
  • 2 replies
  • 1 kudos

Resolved! Data profiling monitoring with foreign catalog

Hi team, I’m currently working with Azure Databricks and have created a foreign catalog for my source database in Azure SQL. I can successfully run SELECT statements from Databricks against the Azure SQL database. However, I would like to set up data profil...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @sta_gas, since data quality monitoring is in beta, I'm quite sure they don't support foreign tables as of now (but they forgot to mention it in the docs). The more important question is whether they ever will be supported. For me, data quality monitoring appl...

1 More Replies
adrianhernandez
by New Contributor III
  • 186 Views
  • 2 replies
  • 1 kudos

Wheel permissions issue

I get a: org.apache.spark.SparkSecurityException: [INSUFFICIENT_PERMISSIONS] Insufficient privileges: User does not have permission MODIFY,SELECT on any file. SQLSTATE: 42501 at com.databricks.sql.acl.Unauthorized.throwInsufficientPermissionsError(P...

Latest Reply
NandiniN
Databricks Employee
  • 1 kudos

Hi @adrianhernandez, the permissions error indicates you need the privileges for "any file". To resolve this, can you try adding the corresponding permissions and see if it works: %sql GRANT SELECT ON ANY FILE TO `username` %sql GRANT MO...
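
A minimal sketch of those grants, matching the MODIFY,SELECT privileges named in the error message; the backticked principal is a placeholder for the actual user or group:

# Grant the "any file" privileges the error message asks for.
spark.sql("GRANT SELECT ON ANY FILE TO `some_user@example.com`")  # placeholder principal
spark.sql("GRANT MODIFY ON ANY FILE TO `some_user@example.com`")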

1 More Replies
Hritik_Moon
by New Contributor II
  • 191 Views
  • 5 replies
  • 8 kudos

Stop Cache in free edition

Hello, I am using Databricks Free Edition. Is there a way to turn off IO caching? I am trying to learn optimization and can't see any difference in query run time with caching enabled.

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 8 kudos

Hi @Hritik_Moon, I guess you cannot. To disable the disk cache you need the ability to run the following command: spark.conf.set("spark.databricks.io.cache.enabled", "[true | false]") But serverless compute does not support setting most Spark properties fo...
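
For completeness, the command from the reply as it would run on classic compute, where setting this property is supported:

# Disable the Databricks IO (disk) cache for the current session.
spark.conf.set("spark.databricks.io.cache.enabled", "false")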

4 More Replies
jorperort
by Contributor
  • 2071 Views
  • 4 replies
  • 2 kudos

Resolved! Executing Bash Scripts or Binaries Directly in Databricks Jobs on Single Node Cluster

Hi, is it possible to directly execute a Bash script or a binary executable from the operating system of a Databricks job compute node using a single-node cluster? I’m using Databricks Asset Bundles for job initialization and execution. When the job s...

Latest Reply
Louis_Frolio
Databricks Employee
  • 2 kudos

Hello @jorperort, I did some research internally and have some tips/suggestions for you to consider. Based on the research and available documentation, it is not possible to directly execute a Bash script or binary executable from the operating sy...
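
One commonly suggested alternative, sketched here under assumptions not stated in the thread (a Python task or notebook, with the script at a hypothetical /dbfs/ path):

import subprocess

# Invoke the script through bash from the driver's OS; the path is a placeholder.
result = subprocess.run(
    ["bash", "/dbfs/scripts/job.sh"],
    capture_output=True, text=True, check=False,
)
print(result.stdout)
print(result.stderr)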

3 More Replies
Vsleg
by Contributor
  • 3175 Views
  • 5 replies
  • 0 kudos

Enabling enableChangeDataFeed on Streaming Table created in DLT

Hello, can I enable Change Data Feed on Streaming Tables? How should I do this? I couldn't find this in the existing documentation https://learn.microsoft.com/en-us/azure/databricks/delta/delta-change-data-feed .

Latest Reply
saurabh18cs
Honored Contributor II
  • 0 kudos

Hi @Vsleg, I think you cannot enable CDF like this for streaming tables; it is not natively supported for DLT streaming tables. Please have a look here: Propagating Deletes: Managing Data Removal using D... - Databricks Community - 90978
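
For contrast, a minimal sketch of how CDF is enabled on a regular Delta table (not a DLT streaming table); the table name is a placeholder:

# Enable Change Data Feed on an existing, non-streaming Delta table.
spark.sql("""
    ALTER TABLE catalog.schema.my_table
    SET TBLPROPERTIES (delta.enableChangeDataFeed = true)
""")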

4 More Replies
Chris_N
by New Contributor
  • 201 Views
  • 3 replies
  • 1 kudos

Unable to configure clustering on DLT tables

Hi Team, I have a DLT pipeline with the `cluster_by` property configured for all my tables. The code looks something like: @dlt.table(name="flows", cluster_by=["from"]) def flows(): <LOGIC> It was all working fine, and in a couple of days the queries w...

Latest Reply
NandiniN
Databricks Employee
  • 1 kudos

Hi @Chris_N, you have mentioned: "I couldn't find any cluster properties configured." If they existed and were changed, you can use the Delta history command to check whether someone changed the clustering information. It is possible there were ch...
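
A minimal sketch of that history check; the table name is a placeholder, and clustering changes would surface in the operationParameters column:

# Inspect the table's Delta history for clustering changes.
display(spark.sql("DESCRIBE HISTORY catalog.schema.flows"))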

2 More Replies
SuMiT1
by New Contributor III
  • 208 Views
  • 2 replies
  • 0 kudos

Unable to Create Secret Scope in Databricks – “Fetch request failed due to expired user session”

I’m trying to create an Azure Key Vault-backed secret scope in Databricks, but when I click Create, I get this error: "Fetch request failed due to expired user session". I’ve already verified my login and permissions. I also tried refreshing and re-signing i...

Latest Reply
saurabh18cs
Honored Contributor II
  • 0 kudos

Hi @SuMiT1, are you using an IaC tool like Terraform, or do you want to try it manually using your own identity?

1 More Replies
georgemichael40
by New Contributor III
  • 121 Views
  • 1 reply
  • 1 kudos

Best approach for writing/updating delta tables from python?

Hi, we are migrating a local Dash app to the Databricks infrastructure (using Databricks Apps and our Delta lake). The local app does the following (among others): takes Excel files from the end user; reads them in memory and transforms them into a pandas dataframe; ...

Latest Reply
saurabh18cs
Honored Contributor II
  • 1 kudos

Hi @georgemichael40, my suggestion would be to try using MERGE INTO for the Delta tables, which works with the connector, rather than using delete/insert statements. This will also keep your code in SQL as you wanted. Your tables are not large, so this should be sufficient...
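
A hedged sketch of that MERGE INTO approach; the table and column names (target_table, updates, id) are placeholders, not from the thread:

# Upsert the staged rows into the Delta table in one atomic statement.
spark.sql("""
    MERGE INTO catalog.schema.target_table AS t
    USING updates AS s
    ON t.id = s.id
    WHEN MATCHED THEN UPDATE SET *
    WHEN NOT MATCHED THEN INSERT *
""")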

Hritik_Moon
by New Contributor II
  • 76 Views
  • 1 reply
  • 2 kudos

Reading snappy.parquet

I stored a dataframe as Delta in the catalog. It created multiple folders with snappy.parquet files. Is there a way to read these snappy.parquet files? They read with pandas, but with Spark I get an "incompatible format" error.

Latest Reply
Khaja_Zaffer
Contributor III
  • 2 kudos

Hello, good day @Hritik_Moon. That "incompatible format" error is expected when you read the files as Parquet, because of the presence of the _delta_log created with the Delta format (which follows ACID principles); it raises an AnalysisException. The recommended approach would be to read in Delta...
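
A minimal sketch of that recommended read, addressing the table through the catalog rather than its snappy.parquet files; the table name is a placeholder:

# Read the managed Delta table directly instead of its underlying Parquet files.
df = spark.read.table("catalog.schema.my_table")
df.display()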

AlanDanque
by New Contributor
  • 1281 Views
  • 2 replies
  • 0 kudos

Salesforce Bulk API 2.0 not getting all rows from large table

Has anyone run into an incomplete data extraction issue with the Salesforce Bulk API 2.0, where very large source object tables with more than 260k rows (should be approx. 13M) result in only approx. 250k rows being extracted per attempt?

Latest Reply
ManojkMohan
Honored Contributor
  • 0 kudos

@AlanDanque, I am working on a similar use case and will share screenshots shortly. But to reach the root cause, can you share the details below? Checks at Salesforce: Which header was used? Was Sforce-Enable-PKChunking: chunkSize=250000 explicitly included...
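
A hedged sketch of attaching that PK-chunking header when creating a bulk job; the instance URL, API version, and session token are placeholders (Salesforce documents this header for Bulk API 1.0 jobs, while Bulk API 2.0 chunks automatically):

import requests

# Create a bulk query job with PK chunking enabled; all values are placeholders.
resp = requests.post(
    "https://myinstance.my.salesforce.com/services/async/58.0/job",
    headers={
        "X-SFDC-Session": "<session-token>",
        "Sforce-Enable-PKChunking": "chunkSize=250000",
        "Content-Type": "application/json",
    },
    json={"operation": "query", "object": "Account", "contentType": "JSON"},
)
print(resp.status_code, resp.text)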

1 More Replies
Nidhig
by Contributor
  • 125 Views
  • 1 reply
  • 1 kudos

Conversational Agent App integration with genie in Databricks

Hi, I have recently explored the conversational agent app from the Marketplace and its integration with a Genie Space. The connection setup went well, but I found a sync issue between the app and the Genie Space. Even after multiple deployments I couldn't see...

Latest Reply
HariSankar
Contributor III
  • 1 kudos

Hi @Nidhig, this isn’t expected behavior; it usually happens when the app's service principal lacks permissions to access the SQL warehouse, Genie Space, or underlying Unity Catalog tables. Try these fixes: --> SQL Warehouse: Go to Compute -> SQL Warehou...

Abrarali8708
by New Contributor II
  • 428 Views
  • 4 replies
  • 4 kudos

Resolved! Node type not available in Central India (Student Subscription)

Hi Community, I have deployed an Azure Databricks workspace in the Central India region using a student subscription. While trying to create a compute resource, I encountered an error stating that the selected node type is not available in Central Ind...

Latest Reply
ManojkMohan
Honored Contributor
  • 4 kudos

@Abrarali8708, as discussed, can you try managing the Azure Policy definition: Locate the policy definition ID /providers/Microsoft.Authorization/policyDefinitions/b86dabb9-b578-4d7b-b842-3b45e95769a1. Modify the parameter listOfAllowedLocations to inclu...

3 More Replies
Michał
by New Contributor III
  • 776 Views
  • 5 replies
  • 3 kudos

how to process a streaming lakeflow declarative pipeline in batches

Hi, I've got a problem and I have run out of ideas as to what else I can try. Maybe you can help? I've got a Delta table with hundreds of millions of records on which I have to perform relatively expensive operations. I'd like to be able to process some...

Latest Reply
mmayorga
Databricks Employee
  • 3 kudos

Hi @Michał, one detail/feature to consider when working with Declarative Pipelines is that they manage and auto-tune configuration aspects, including rate limiting (maxBytesPerTrigger or maxFilesPerTrigger). Perhaps that's why you could not see this...
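
For comparison, a minimal sketch of those rate-limiting options on a plain Structured Streaming read, where they are set manually rather than auto-tuned; the table name is a placeholder:

# Cap how much data each micro-batch pulls from the Delta source.
df = (
    spark.readStream
    .format("delta")
    .option("maxFilesPerTrigger", 100)
    .table("catalog.schema.big_table")
)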

4 More Replies
