Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

AlanDanque
by New Contributor
  • 1223 Views
  • 2 replies
  • 0 kudos

Salesforce Bulk API 2.0 not getting all rows from large table

Has anyone run into an incomplete data extraction issue with the Salesforce Bulk API 2.0, where very large source object tables with more than 260k rows (should be approx. 13M) result in only extracting approx. 250k rows per attempt?

Latest Reply
ManojkMohan
Honored Contributor
  • 0 kudos

@AlanDanque I am working on a similar use case and will share screenshots shortly. But to reach the root cause, can you share the details below?

Checks at Salesforce | Description
Header used? | Was Sforce-Enable-PKChunking: chunkSize=250000 explicitly included...
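
For reference, a minimal, hypothetical sketch of how that header is passed when creating a bulk job. Note that Sforce-Enable-PKChunking is honored by the classic Bulk API (1.0) job endpoint; Bulk API 2.0 chunks automatically and does not accept this header. The instance URL, session token, and API version below are placeholders.

```python
# Hypothetical sketch: create a classic Bulk API (1.0) query job with PK
# chunking enabled. Placeholders: INSTANCE_URL, SESSION_TOKEN, API version.
import requests

INSTANCE_URL = "https://yourInstance.my.salesforce.com"
SESSION_TOKEN = "<session-or-oauth-token>"

resp = requests.post(
    f"{INSTANCE_URL}/services/async/58.0/job",
    headers={
        "X-SFDC-Session": SESSION_TOKEN,
        "Sforce-Enable-PKChunking": "chunkSize=250000",
        "Content-Type": "application/json",
    },
    json={"operation": "query", "object": "Account", "contentType": "JSON"},
)
resp.raise_for_status()
print(resp.json())  # job info; poll batches until all chunks complete
```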

1 More Replies
Nidhig
by Contributor
  • 52 Views
  • 1 reply
  • 0 kudos

Conversational Agent App integration with Genie in Databricks

Hi, I have recently explored the conversational agent app from the Marketplace and its integration with a Genie Space. The connection setup went well, but I found a sync issue between the app and the Genie space. Even after multiple deployments I couldn't see...

Latest Reply
HariSankar
Contributor III
  • 0 kudos

Hi @Nidhig, this isn't expected behavior. It usually happens when the app's service principal lacks permissions to access the SQL warehouse, Genie Space, or underlying Unity Catalog tables. Try these fixes: --> SQL Warehouse: Go to Compute -> SQL Warehou...
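
If it helps, here is a minimal sketch of granting the app's service principal access programmatically, assuming the databricks-sdk Python client; the warehouse ID, catalog/schema/table names, and service-principal application ID are placeholders.

```python
# Sketch, not the app's official setup flow: grant the app's service
# principal CAN_USE on the SQL warehouse, then UC privileges via SQL.
from databricks.sdk import WorkspaceClient
from databricks.sdk.service import iam

w = WorkspaceClient()  # auth from env vars or ~/.databrickscfg

w.permissions.update(
    request_object_type="warehouses",
    request_object_id="<warehouse-id>",                    # placeholder
    access_control_list=[
        iam.AccessControlRequest(
            service_principal_name="<sp-application-id>",  # placeholder
            permission_level=iam.PermissionLevel.CAN_USE,
        )
    ],
)

# Unity Catalog grants, run from a notebook or the SQL editor:
# GRANT USE CATALOG ON CATALOG main TO `<sp-application-id>`;
# GRANT USE SCHEMA ON SCHEMA main.sales TO `<sp-application-id>`;
# GRANT SELECT ON TABLE main.sales.orders TO `<sp-application-id>`;
```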

SuMiT1
by New Contributor III
  • 86 Views
  • 1 reply
  • 0 kudos

Unable to Create Secret Scope in Databricks – “Fetch request failed due to expired user session”

I'm trying to create an Azure Key Vault-backed Secret Scope in Databricks, but when I click Create, I get this error: "Fetch request failed due to expired user session". I've already verified my login and permissions. I also tried refreshing and re-signing i...

Latest Reply
ManojkMohan
Honored Contributor
  • 0 kudos

@SuMiT1 Root cause: the Databricks workspace UI requires an active authentication session for sensitive operations like creating Secret Scopes. Common culprits:
  • Extended browser inactivity, resulting in token expiry
  • Browser cache interfering with the refresh token mechanism...
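
As a workaround while the UI session issue persists, the scope can be created outside the browser via the Secrets REST API. A minimal sketch, assuming a Key Vault-backed scope; the workspace host, token, and Key Vault identifiers are placeholders (AKV-backed scopes require an Azure AD token rather than a Databricks PAT).

```python
# Sketch: create an Azure Key Vault-backed secret scope via the REST API.
import requests

HOST = "https://adb-<workspace-id>.<region>.azuredatabricks.net"  # placeholder
TOKEN = "<azure-ad-access-token>"                                  # placeholder

resp = requests.post(
    f"{HOST}/api/2.0/secrets/scopes/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "scope": "my-akv-scope",
        "scope_backend_type": "AZURE_KEYVAULT",
        "backend_azure_keyvault": {
            "resource_id": "/subscriptions/<sub>/resourceGroups/<rg>"
                           "/providers/Microsoft.KeyVault/vaults/<vault>",
            "dns_name": "https://<vault>.vault.azure.net/",
        },
    },
)
resp.raise_for_status()
```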

saicharandeepb
by New Contributor III
  • 106 Views
  • 1 reply
  • 0 kudos

Capturing Streaming Metrics in Near Real-Time Using Cluster Logs

Over the past few weeks, I've been exploring ways to capture streaming metrics from our data load jobs. The goal is to monitor job performance and behavior in real time, without disrupting our existing data load pipelines. Initial Exploration: Streami...

Latest Reply
Krishna_S
Databricks Employee
  • 0 kudos

Hi @saicharandeepb, good job on doing such detailed research on monitoring Structured Streaming. If you need lower latency than rolling logs permit, have you tried this: cluster-wide listener injection: use spark.extraListeners to register a cust...
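
For the streaming-specific metrics, a PySpark StreamingQueryListener gives per-microbatch progress without touching the pipelines. A minimal sketch; the print sink is a placeholder for your own logging or a Delta write.

```python
# Sketch: emit per-microbatch metrics in near real time via a listener.
from pyspark.sql.streaming import StreamingQueryListener

class MetricsListener(StreamingQueryListener):
    def onQueryStarted(self, event):
        print(f"started: {event.id}")

    def onQueryProgress(self, event):
        p = event.progress  # StreamingQueryProgress for the last micro-batch
        print(p.name, p.batchId, p.inputRowsPerSecond, p.processedRowsPerSecond)

    def onQueryTerminated(self, event):
        print(f"terminated: {event.id}")

# `spark` is the ambient SparkSession in a Databricks notebook.
spark.streams.addListener(MetricsListener())
```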

Abrarali8708
by New Contributor II
  • 293 Views
  • 4 replies
  • 2 kudos

Resolved! Node type not available in Central India (Student Subscription)

Hi Community, I have deployed an Azure Databricks workspace in the Central India region using a student subscription. While trying to create a compute resource, I encountered an error stating that the selected node type is not available in Central Ind...

Latest Reply
ManojkMohan
Honored Contributor
  • 2 kudos

@Abrarali8708 As discussed, can you try managing the Azure Policy definition: locate the policy definition ID /providers/Microsoft.Authorization/policyDefinitions/b86dabb9-b578-4d7b-b842-3b45e95769a1, then modify the parameter listOfAllowedLocations to inclu...

3 More Replies
Chris_N
by New Contributor
  • 107 Views
  • 2 replies
  • 0 kudos

Unable to configure clustering on DLT tables

Hi Team, I have a DLT pipeline with the `cluster_by` property configured for all my tables. The code looks something like: @dlt.table(name="flows", cluster_by=["from"]) def flows(): <LOGIC>. It was all working fine, and in a couple of days the queries w...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

Hi @Chris_N, you have mentioned "I couldn't find any cluster properties configured." If they existed and were changed, you can use the Delta history command to check whether someone changed the clustering information. It is possible there were ch...
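
A minimal sketch of that audit from a notebook; the table name is a placeholder:

```python
# Current clustering columns (an empty list means no clustering is set):
spark.sql("DESCRIBE DETAIL main.my_schema.flows") \
    .select("clusteringColumns") \
    .show(truncate=False)

# Commit history: look for operations or parameters that changed clustering.
spark.sql("DESCRIBE HISTORY main.my_schema.flows") \
    .select("timestamp", "operation", "operationParameters") \
    .show(truncate=False)
```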

1 More Replies
adrianhernandez
by New Contributor III
  • 72 Views
  • 1 reply
  • 0 kudos

Wheel permissions issue

I get a: org.apache.spark.SparkSecurityException: [INSUFFICIENT_PERMISSIONS] Insufficient privileges: User does not have permission MODIFY,SELECT on any file. SQLSTATE: 42501 at com.databricks.sql.acl.Unauthorized.throwInsufficientPermissionsError(P...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

Hi @adrianhernandez, the permissions error indicates you need the "ANY FILE" privileges. To resolve this, can you try adding the corresponding permissions and see if it works: %sql GRANT SELECT ON ANY FILE TO `username` %sql GRANT MO...
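
Spelled out in full (the second grant is truncated above), a sketch runnable from a notebook; `username` is a placeholder principal:

```python
# Grant the file-level privileges the error message names (MODIFY, SELECT).
spark.sql("GRANT SELECT ON ANY FILE TO `username`")
spark.sql("GRANT MODIFY ON ANY FILE TO `username`")
```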

Vsleg
by Contributor
  • 3075 Views
  • 4 replies
  • 0 kudos

Enabling enableChangeDataFeed on Streaming Table created in DLT

Hello, Can I enable Change Data Feed on Streaming Tables? How should I do this? I couldn't find this in the existing documentation: https://learn.microsoft.com/en-us/azure/databricks/delta/delta-change-data-feed
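
One commonly suggested approach is to set the Delta property through the pipeline's table properties; a minimal sketch, assuming DLT table_properties pass through to the underlying streaming table (names are placeholders):

```python
import dlt

@dlt.table(
    name="events_stream",
    table_properties={"delta.enableChangeDataFeed": "true"},
)
def events_stream():
    # Placeholder source; any streaming read works here.
    return spark.readStream.table("main.raw.events")
```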

Latest Reply
john77
New Contributor II
  • 0 kudos

I have noticed the same issue.

3 More Replies
Michał
by New Contributor III
  • 649 Views
  • 5 replies
  • 2 kudos

how to process a streaming lakeflow declarative pipeline in batches

Hi, I've got a problem and I have run out of ideas as to what else I can try. Maybe you can help? I've got a Delta table with hundreds of millions of records on which I have to perform relatively expensive operations. I'd like to be able to process some...

Latest Reply
mmayorga
Databricks Employee
  • 2 kudos

Hi @Michał, One detail/feature to consider when working with Declarative Pipelines is that they manage and auto-tune configuration aspects, including rate limiting (maxBytesPerTrigger or maxFilesPerTrigger). Perhaps that's why you could not see this...
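
Outside a declarative pipeline, the same rate limits can be applied by hand to work through the backlog in bounded chunks. A minimal sketch with placeholder table and checkpoint names:

```python
# Cap each micro-batch, drain the available backlog, then stop.
(
    spark.readStream
         .option("maxFilesPerTrigger", 500)   # bound work per micro-batch
         .table("main.bronze.big_table")
         .writeStream
         .option("checkpointLocation", "/Volumes/main/default/chk/big_table")
         .trigger(availableNow=True)          # process everything, then stop
         .toTable("main.silver.big_table_processed")
)
```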

4 More Replies
Data_NXT
by New Contributor III
  • 470 Views
  • 3 replies
  • 3 kudos

Resolved! To change ownership of a materialized view

We are working in a Unity Catalog-enabled Databricks workspace, and we have several materialized views (MVs) that were created through a Delta Live Tables (DLT) / Lakeflow pipeline. Currently, the original owner of the pipeline has moved out of the project,...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 3 kudos

Hi @Data_NXT, you can change the owner of a materialized view if you are both a metastore admin and a workspace admin. Use the following steps to change a materialized view's owner: open the materialized view in Catalog Explorer, then on the Overview ...
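
The SQL equivalent, as a sketch, assuming ALTER MATERIALIZED VIEW ... OWNER TO is available in your workspace; the view and principal names are placeholders:

```python
# Run as a metastore admin + workspace admin.
spark.sql(
    "ALTER MATERIALIZED VIEW main.analytics.daily_sales_mv "
    "OWNER TO `new.owner@example.com`"
)
```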

2 More Replies
Hritik_Moon
by New Contributor II
  • 206 Views
  • 2 replies
  • 1 kudos

Save as Delta file in catalog

Hello, I have created a data frame from a CSV file. When I try to write it as df_op_clean.write.format("delta").save("/Volumes/optimisation/trial"), I get this error: Cannot access the UC Volume path from this location. Path was /Volumes/optimisation/trial/_d...

Latest Reply
-werners-
Esteemed Contributor III
  • 1 kudos

Also, to add on this: avoid overlap between tables and Volumes. Create separate folders for tables and files. Unity Catalog does this too if you use managed tables/volumes.
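
The usual fix is to register the DataFrame as a Unity Catalog managed table rather than writing Delta files into a Volume path; a minimal sketch with placeholder catalog/schema/table names:

```python
# Managed table: Unity Catalog chooses and governs the storage location.
(
    df_op_clean.write
               .format("delta")
               .mode("overwrite")
               .saveAsTable("optimisation.default.trial_clean")
)
# Volumes stay the right home for non-tabular files such as the source CSV.
```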

1 More Replies
mbanxp
by New Contributor III
  • 161 Views
  • 2 replies
  • 1 kudos

Most suitable Data Promotion orchestration for multi-tenant data lake in Databricks

Hi there! I would like to find the most suitable orchestration process to promote data between medallion layers. I need to solve the following key architectural decision for scaling my multi-tenant data lake in Databricks. My setup: independent medal...

Latest Reply
sarahbhord
Databricks Employee
  • 1 kudos

Hey mbanxp! The most scalable and maintainable orchestration pattern for multi-tenant medallion architectures in Databricks is to build independent pipelines per table for all clients, with each pipeline parameterized by client/tenant. Why this appro...
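
As one way to realize that pattern, a notebook task can take the tenant as a parameter; a minimal sketch in which the widget name, catalogs, and transformation are placeholders:

```python
# Tenant-parameterized promotion step (bronze -> silver).
dbutils.widgets.text("tenant", "")           # value supplied per job task
tenant = dbutils.widgets.get("tenant")

src = f"bronze_{tenant}.sales.orders"        # hypothetical per-tenant layout
dst = f"silver_{tenant}.sales.orders"

(
    spark.read.table(src)
         .dropDuplicates(["order_id"])       # placeholder transformation
         .write.mode("overwrite")
         .saveAsTable(dst)
)
```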

1 More Replies
jeremy98
by Honored Contributor
  • 793 Views
  • 6 replies
  • 1 kudos

How to reference a workflow to use multiple GIT sources?

Hi community, is it possible for a workflow to reference multiple Git sources? Specifically, can different tasks within the same workflow point to different Git repositories or types of Git sources? Thank you

Latest Reply
mai_luca
New Contributor III
  • 1 kudos

A workflow can reference multiple Git sources: you can specify the Git information for each task. However, I am not sure you can have multiple Git providers for the same workspace...

5 More Replies
