Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

j_unspeakable
by New Contributor III
  • 1228 Views
  • 3 replies
  • 3 kudos

Resolved! Permission Denied when Creating External Tables Using Workspace Default Credential

I’m building out schemas, volumes, and external Delta tables in Unity Catalog via Terraform. The schemas and volumes are created successfully, but all external tables are failing. The error message from Terraform doesn't highlight what the issue is bu...

Latest Reply
artopihlaja
New Contributor
  • 3 kudos

Feature or bug, I ran into the same thing: I couldn't create tables with the default credential. To test, I assigned the default credential and a custom credential the same access rights to the storage container that is the target of the external locatio...

2 More Replies
MarcoRezende
by New Contributor III
  • 121 Views
  • 1 reply
  • 1 kudos

Resolved! AttributeError: module 'numpy' has no attribute 'typing'

We started experiencing failures in several Databricks jobs without any changes on our side. The error occurs during Python job execution and seems related to package dependencies. The job error: Run failed with error message Cannot read the python fil...

Latest Reply
MarcoRezende
New Contributor III
  • 1 kudos

The problem was the numexpr library version (2.14.0); I needed to pin it to 2.13.1.
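
For reference, a minimal sketch of that pin in a notebook cell, assuming a notebook-scoped install fits the job setup:

# Pin numexpr below the version that triggered the numpy.typing error.
%pip install numexpr==2.13.1
# Restart Python so the current session picks up the pinned version.
dbutils.library.restartPython()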

sta_gas
by New Contributor
  • 153 Views
  • 2 replies
  • 1 kudos

Resolved! Data profiling monitoring with foreign catalog

Hi team, I’m currently working with Azure Databricks and have created a foreign catalog for my source database in Azure SQL. I can successfully run SELECT statements from Databricks against the Azure SQL database. However, I would like to set up data profil...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @sta_gas, since data quality monitoring is in beta, I'm quite sure they don't support foreign tables as of now (but they forgot to mention it in the docs). The more important question is whether they ever will be supported. For me, data quality monitoring appl...

1 More Replies
adrianhernandez
by New Contributor III
  • 186 Views
  • 2 replies
  • 1 kudos

Wheel permissions issue

I get a: org.apache.spark.SparkSecurityException: [INSUFFICIENT_PERMISSIONS] Insufficient privileges: User does not have permission MODIFY,SELECT on any file. SQLSTATE: 42501 at com.databricks.sql.acl.Unauthorized.throwInsufficientPermissionsError(P...

Latest Reply
NandiniN
Databricks Employee
  • 1 kudos

Hi @adrianhernandez, the permissions error indicates you need the privileges for "any file". To resolve this, can you try adding the corresponding permissions and see if it works: %sql GRANT SELECT ON ANY FILE TO `username` %sql GRANT MO...
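
A minimal sketch of those grants, matching the MODIFY,SELECT privileges named in the error message; the backticked principal is a placeholder for the actual user or group:

# Grant the "any file" privileges the error message asks for.
spark.sql("GRANT SELECT ON ANY FILE TO `some_user@example.com`")  # placeholder principal
spark.sql("GRANT MODIFY ON ANY FILE TO `some_user@example.com`")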

1 More Replies
Hritik_Moon
by New Contributor II
  • 191 Views
  • 5 replies
  • 8 kudos

Stop Cache in free edition

Hello, I am using Databricks Free Edition. Is there a way to turn off IO caching? I am trying to learn optimization and can't see any difference in query run time with caching enabled.

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 8 kudos

Hi @Hritik_Moon, I guess you cannot. To disable the disk cache you need the ability to run the following command: spark.conf.set("spark.databricks.io.cache.enabled", "[true | false]") But serverless compute does not support setting most Spark properties fo...
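
For completeness, the command from the reply as it would run on classic compute, where setting this property is supported:

# Disable the Databricks IO (disk) cache for the current session.
spark.conf.set("spark.databricks.io.cache.enabled", "false")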

4 More Replies
jorperort
by Contributor
  • 2071 Views
  • 4 replies
  • 2 kudos

Resolved! Executing Bash Scripts or Binaries Directly in Databricks Jobs on Single Node Cluster

Hi, is it possible to directly execute a Bash script or a binary executable from the operating system of a Databricks job compute node using a single-node cluster? I’m using Databricks Asset Bundles for job initialization and execution. When the job s...

Latest Reply
Louis_Frolio
Databricks Employee
  • 2 kudos

Hello @jorperort, I did some research internally and have some tips/suggestions for you to consider. Based on the research and available documentation, it is not possible to directly execute a Bash script or binary executable from the operating sy...
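
One commonly suggested alternative, sketched here under assumptions not stated in the thread (a Python task or notebook, with the script at a hypothetical /dbfs/ path):

import subprocess

# Invoke the script through bash from the driver's OS; the path is a placeholder.
result = subprocess.run(
    ["bash", "/dbfs/scripts/job.sh"],
    capture_output=True, text=True, check=False,
)
print(result.stdout)
print(result.stderr)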

3 More Replies
Vsleg
by Contributor
  • 3175 Views
  • 5 replies
  • 0 kudos

Enabling enableChangeDataFeed on Streaming Table created in DLT

Hello, can I enable Change Data Feed on Streaming Tables? How should I do this? I couldn't find this in the existing documentation https://learn.microsoft.com/en-us/azure/databricks/delta/delta-change-data-feed .

Latest Reply
saurabh18cs
Honored Contributor II
  • 0 kudos

Hi @Vsleg, I think you cannot enable CDF like this for streaming tables; it is not natively supported for DLT streaming tables. Please have a look here: Propagating Deletes: Managing Data Removal using D... - Databricks Community - 90978
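
For contrast, a minimal sketch of how CDF is enabled on a regular Delta table (not a DLT streaming table); the table name is a placeholder:

# Enable Change Data Feed on an existing, non-streaming Delta table.
spark.sql("""
    ALTER TABLE catalog.schema.my_table
    SET TBLPROPERTIES (delta.enableChangeDataFeed = true)
""")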

4 More Replies
Chris_N
by New Contributor
  • 201 Views
  • 3 replies
  • 1 kudos

Unable to configure clustering on DLT tables

Hi Team, I have a DLT pipeline with the `cluster_by` property configured for all my tables. The code looks something like: @dlt.table(name="flows", cluster_by=["from"]) def flows(): <LOGIC> It was all working fine, and in a couple of days the queries w...

Latest Reply
NandiniN
Databricks Employee
  • 1 kudos

Hi @Chris_N, you have mentioned: "I couldn't find any cluster properties configured." If they existed and were changed, you can use the Delta history command to check whether someone changed the clustering information. It is possible there were ch...
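
A minimal sketch of that history check; the table name is a placeholder, and clustering changes would surface in the operationParameters column:

# Inspect the table's Delta history for clustering changes.
display(spark.sql("DESCRIBE HISTORY catalog.schema.flows"))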

2 More Replies
SuMiT1
by New Contributor III
  • 208 Views
  • 2 replies
  • 0 kudos

Unable to Create Secret Scope in Databricks – “Fetch request failed due to expired user session”

I’m trying to create an Azure Key Vault-backed secret scope in Databricks, but when I click Create, I get this error: "Fetch request failed due to expired user session". I’ve already verified my login and permissions. I also tried refreshing and re-signing i...

Latest Reply
saurabh18cs
Honored Contributor II
  • 0 kudos

Hi @SuMiT1, are you using an IaC tool like Terraform, or do you want to try it manually using your own identity?

1 More Replies
georgemichael40
by New Contributor III
  • 121 Views
  • 1 reply
  • 1 kudos

Best approach for writing/updating delta tables from python?

Hi, we are migrating a local Dash app to the Databricks infrastructure (using Databricks Apps and our Delta lake). The local app does the following (among others): takes Excel files from the end user; reads them in memory and transforms them into a pandas dataframe; ...

Latest Reply
saurabh18cs
Honored Contributor II
  • 1 kudos

Hi @georgemichael40, my suggestion would be to try using MERGE INTO for the Delta tables, which works with the connector, rather than using delete/insert statements. This will also keep your code in SQL as you wanted. Your tables are not large, so this should be sufficient...
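
A hedged sketch of that MERGE INTO approach; the table and column names (target_table, updates, id) are placeholders, not from the thread:

# Upsert the staged rows into the Delta table in one atomic statement.
spark.sql("""
    MERGE INTO catalog.schema.target_table AS t
    USING updates AS s
    ON t.id = s.id
    WHEN MATCHED THEN UPDATE SET *
    WHEN NOT MATCHED THEN INSERT *
""")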

Hritik_Moon
by New Contributor II
  • 76 Views
  • 1 reply
  • 2 kudos

Reading snappy.parquet

I stored a dataframe as Delta in the catalog. It created multiple folders with snappy.parquet files. Is there a way to read these snappy.parquet files? They read with pandas, but with Spark I get an "incompatible format" error.

Latest Reply
Khaja_Zaffer
Contributor III
  • 2 kudos

Hello, good day @Hritik_Moon. That "incompatible format" error is expected when you read the files as Parquet, because of the presence of the _delta_log created with the Delta format (which follows ACID principles); it raises an AnalysisException. The recommended approach would be to read in Delta...
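
A minimal sketch of that recommended read, addressing the table through the catalog rather than its snappy.parquet files; the table name is a placeholder:

# Read the managed Delta table directly instead of its underlying Parquet files.
df = spark.read.table("catalog.schema.my_table")
df.display()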

AlanDanque
by New Contributor
  • 1281 Views
  • 2 replies
  • 0 kudos

Salesforce Bulk API 2.0 not getting all rows from large table

Has anyone run into an incomplete data extraction issue with the Salesforce Bulk API 2.0, where very large source object tables with more than 260k rows (should be approx. 13M) result in only approx. 250k rows being extracted per attempt?

Latest Reply
ManojkMohan
Honored Contributor
  • 0 kudos

@AlanDanque, I am working on a similar use case and will share screenshots shortly. But to reach the root cause, can you share the details below? Checks at Salesforce: Which header was used? Was Sforce-Enable-PKChunking: chunkSize=250000 explicitly included...
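
A hedged sketch of attaching that PK-chunking header when creating a bulk job; the instance URL, API version, and session token are placeholders (Salesforce documents this header for Bulk API 1.0 jobs, while Bulk API 2.0 chunks automatically):

import requests

# Create a bulk query job with PK chunking enabled; all values are placeholders.
resp = requests.post(
    "https://myinstance.my.salesforce.com/services/async/58.0/job",
    headers={
        "X-SFDC-Session": "<session-token>",
        "Sforce-Enable-PKChunking": "chunkSize=250000",
        "Content-Type": "application/json",
    },
    json={"operation": "query", "object": "Account", "contentType": "JSON"},
)
print(resp.status_code, resp.text)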

1 More Replies
Nidhig
by Contributor
  • 125 Views
  • 1 reply
  • 1 kudos

Conversational Agent App integration with genie in Databricks

Hi, I have recently explored the conversational agent app from the Marketplace and its integration with a Genie Space. The connection setup went well, but I found a sync issue between the app and the Genie Space. Even after multiple deployments I couldn't see...

Latest Reply
HariSankar
Contributor III
  • 1 kudos

Hi @Nidhig, this isn’t expected behavior; it usually happens when the app's service principal lacks permissions to access the SQL warehouse, Genie Space, or underlying Unity Catalog tables. Try these fixes: --> SQL Warehouse: Go to Compute -> SQL Warehou...

Abrarali8708
by New Contributor II
  • 428 Views
  • 4 replies
  • 4 kudos

Resolved! Node type not available in Central India (Student Subscription)

Hi Community, I have deployed an Azure Databricks workspace in the Central India region using a student subscription. While trying to create a compute resource, I encountered an error stating that the selected node type is not available in Central Ind...

Latest Reply
ManojkMohan
Honored Contributor
  • 4 kudos

@Abrarali8708, as discussed, can you try managing the Azure Policy definition: Locate the policy definition ID /providers/Microsoft.Authorization/policyDefinitions/b86dabb9-b578-4d7b-b842-3b45e95769a1. Modify the parameter listOfAllowedLocations to inclu...

3 More Replies
Michał
by New Contributor III
  • 776 Views
  • 5 replies
  • 3 kudos

how to process a streaming lakeflow declarative pipeline in batches

Hi, I've got a problem and I have run out of ideas as to what else I can try. Maybe you can help? I've got a Delta table with hundreds of millions of records on which I have to perform relatively expensive operations. I'd like to be able to process some...

Latest Reply
mmayorga
Databricks Employee
  • 3 kudos

Hi @Michał, one detail/feature to consider when working with Declarative Pipelines is that they manage and auto-tune configuration aspects, including rate limiting (maxBytesPerTrigger or maxFilesPerTrigger). Perhaps that's why you could not see this...
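
For comparison, a minimal sketch of those rate-limiting options on a plain Structured Streaming read, where they are set manually rather than auto-tuned; the table name is a placeholder:

# Cap how much data each micro-batch pulls from the Delta source.
df = (
    spark.readStream
    .format("delta")
    .option("maxFilesPerTrigger", 100)
    .table("catalog.schema.big_table")
)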

4 More Replies
