Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

FAHADURREHMAN
by New Contributor II
  • 30 Views
  • 0 replies
  • 1 kudos

DELTA Merge taking too much Time

Hi Legends, I have a time-series Delta table with 707.1 GiB, 7,702 files, and 262 billion rows (mainly time-series data). This table is clustered on 2 columns (a timestamp column and a descriptive column). I have designed a pipeline which runs every w...
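On a large clustered time-series table, one common mitigation (a general sketch, not from this thread) is to add the clustering/timestamp column to the MERGE ON clause so Delta can skip files whose min/max stats fall outside the window. Table and column names below are hypothetical:

```python
# Sketch: constrain a Delta MERGE to a time window so files outside the
# window can be pruned via file statistics. All names are hypothetical.

def build_merge_sql(target: str, source: str, key: str, ts_col: str,
                    window_start: str) -> str:
    """Build a MERGE whose ON clause includes the clustering column,
    letting Delta skip files entirely outside the time window."""
    return (
        f"MERGE INTO {target} t USING {source} s "
        f"ON t.{key} = s.{key} "
        f"AND t.{ts_col} >= '{window_start}' "  # enables file pruning
        f"WHEN MATCHED THEN UPDATE SET * "
        f"WHEN NOT MATCHED THEN INSERT *"
    )

sql = build_merge_sql("ts_table", "updates", "device_id", "event_ts",
                      "2026-01-01")
# On Databricks you would run: spark.sql(sql)
```

The key design point is that the extra predicate must be on the clustering column and cover all rows in the source batch, otherwise matched rows outside the window would be missed.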

ChristianRRL
by Honored Contributor
  • 153 Views
  • 5 replies
  • 1 kudos

Get task_run_id that is nested in a job_run task

Hi, I'm wondering if there is an easier way to accomplish this. I can use a Dynamic Value reference to pull the run_id of Parent 1 into Parent 2; however, what I'm looking for is for Child 1's task run_id to be referenced within Parent 2. Currently I am ...

Latest Reply
anuj_lathi
Databricks Employee
  • 1 kudos

Hi @ChristianRRL, you're absolutely right, and I apologize for the earlier suggestion. I've verified that task values from child jobs are not propagated back through run_job tasks. Your instinct about the REST API was correct. Here's the fix: Solutio...
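The REST-API route can be sketched as: call the Jobs get_run endpoint for the child job's run and read each task's run_id from the response. The helper below only parses a get_run-style payload (its shape is assumed from the Jobs API 2.x docs; the sample response is hypothetical and trimmed):

```python
# Sketch: map task_key -> run_id from a Jobs API get_run response.
# Calling get_run on the child job's run exposes the nested task run_ids
# that task values do not propagate back through run_job tasks.

def task_run_ids(get_run_response: dict) -> dict:
    """Extract {task_key: run_id} for every task in a job run payload."""
    return {t["task_key"]: t["run_id"]
            for t in get_run_response.get("tasks", [])}

# Hypothetical, trimmed response for the child run:
child_run = {
    "run_id": 9001,
    "tasks": [
        {"task_key": "child_1", "run_id": 9002},
        {"task_key": "child_2", "run_id": 9003},
    ],
}
ids = task_run_ids(child_run)
```

In practice you would fetch `child_run` with an authenticated GET to `/api/2.2/jobs/runs/get?run_id=...` (or the equivalent SDK call) rather than a literal dict.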

4 More Replies
shan-databricks
by Databricks Partner
  • 85 Views
  • 3 replies
  • 0 kudos

Invoking one job from another to execute a specific task

I have multiple tasks, each working with different tables. Each table has dependencies across Bronze, Silver, and Gold layers. I want to trigger and run a specific task independently, instead of running all tasks in the job. How can I do this? Also, ...

Latest Reply
rohan22sri
New Contributor II
  • 0 kudos

1. Go to the job and click on the task you want to run.
2. Click the play button (highlighted in yellow in the attachment).
3. This ensures you run only one task at a time, not the whole job.
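Besides the UI play button, the Jobs run-now endpoint is documented to accept an `only` list of task keys for running a subset of a job's tasks. The sketch below just builds that request payload; the job id and task key are hypothetical, and the `only` field should be verified against your Jobs API version:

```python
# Sketch: request body for POST /api/2.2/jobs/run-now limited to specific
# tasks. The `only` field is assumed from the Jobs API docs; verify it
# exists in the API version your workspace uses.

def run_now_payload(job_id: int, task_keys: list) -> dict:
    """Build a run-now payload that runs only the named task keys."""
    return {"job_id": job_id, "only": task_keys}

payload = run_now_payload(123, ["silver_customers"])
# You would POST this with a bearer token, e.g. via `requests` or the SDK.
```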

2 More Replies
kevinleindecker
by New Contributor II
  • 260 Views
  • 4 replies
  • 1 kudos

SQL Warehouse error: "Cannot read properties of undefined (reading 'data')" when querying system tab

Queries that previously worked started failing in SQL Warehouse (Dashboards) without any changes on our side. The query succeeds but fails to render results with the error: "Cannot read properties of undefined (reading 'data')". This happens with: - system.b...

Latest Reply
Esgario
Visitor
  • 1 kudos

Same problem here. I have previously reported this issue, and it had been resolved at the time. However, the problem has now recurred. When ingesting large tables (over 100k rows), the system is unable to properly render the data, preventing the tab...

3 More Replies
AanchalSoni
by Databricks Partner
  • 153 Views
  • 7 replies
  • 6 kudos

Resolved! Primary key constraint not working

I've created a Lakeflow job to run 5 notebook tasks, one for each silver table: Customers, Accounts, Transactions, Loans, and Branches. In the Customers notebook, after writing the data to the Delta table using Auto Loader, I'm applying the non-null and primar...

Latest Reply
balajij8
Contributor
  • 6 kudos

@AanchalSoni Capturing the columns as a primary key helps users and tools understand relationships in the data. You can create a primary key with RELY, which enables optimizations in some cases by skipping redundant operations. Distinct elimination: when you apply a DI...
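Declaring a primary key with RELY can be sketched as below; table, column, and constraint names are hypothetical, and the DDL follows Databricks' documented `ALTER TABLE ... ADD CONSTRAINT ... PRIMARY KEY ... RELY` syntax:

```python
# Sketch: build the DDL for an informational primary key with RELY, which
# tells the optimizer it may trust the constraint (e.g. to eliminate a
# redundant DISTINCT). All identifiers are hypothetical.

def pk_rely_ddl(table: str, column: str, name: str) -> str:
    """Return ALTER TABLE DDL declaring a RELY primary key."""
    return (f"ALTER TABLE {table} "
            f"ADD CONSTRAINT {name} PRIMARY KEY ({column}) RELY")

ddl = pk_rely_ddl("silver.customers", "customer_id", "customers_pk")
# On Databricks you would run: spark.sql(ddl)
```

Note that the constraint is informational: Databricks does not enforce it on write, so RELY should only be set when the pipeline itself guarantees uniqueness.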

6 More Replies
mjedy78
by New Contributor II
  • 2118 Views
  • 4 replies
  • 1 kudos

Transition from partitioned table to Liquid clustered table

Hi all, I have a table called classes, which is already partitioned on three different columns. I want to create a liquid clustered table, but as far as I understand from the documentation (and from Dany Lee and his team) it was not possible as of 2024 ...

Latest Reply
biancaorita
New Contributor II
  • 1 kudos

Is there a plan to implement a way to migrate to liquid clustering for an existing table that has traditional partitioning and that is quite large (over 4 TB)? Re-creating such tables from scratch is not always ideal.
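Until in-place conversion of a partitioned table is available, one commonly suggested workaround (a sketch under assumptions, not an answer from this thread) is to recreate the table with CLUSTER BY via CTAS and then swap names. Table and column names below are hypothetical:

```python
# Sketch: build a CTAS statement that copies a partitioned table into a
# new liquid clustered table. For multi-TB tables this is a full rewrite,
# which is exactly the cost the question is asking to avoid.

def ctas_liquid_sql(src: str, dst: str, cluster_cols: list) -> str:
    """Return CREATE TABLE ... CLUSTER BY ... AS SELECT DDL."""
    cols = ", ".join(cluster_cols)
    return (f"CREATE TABLE {dst} CLUSTER BY ({cols}) "
            f"AS SELECT * FROM {src}")

sql = ctas_liquid_sql("classes", "classes_lc", ["class_id", "term"])
# On Databricks you would run: spark.sql(sql), then rename/swap tables.
```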

3 More Replies
AnandGNR
by New Contributor III
  • 203 Views
  • 7 replies
  • 1 kudos

Unable to create secret scope -"Fetch request failed due expired user session"

Hi everyone,I’m trying to create an Azure Key Vault-backed secret scope in a Databricks Premium workspace, but I keep getting this error: Fetch request failed due expired user sessionSetup details:Databricks workspace: PremiumAzure Key Vault: Owner p...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @AnandGNR, try the following: go to your Key Vault, then under Firewalls and virtual networks enable "Allow trusted Microsoft services to bypass this firewall."

6 More Replies
SuMiT1
by New Contributor III
  • 2328 Views
  • 5 replies
  • 0 kudos

Unable to Create Secret Scope in Databricks – “Fetch request failed due to expired user session”

I'm trying to create an Azure Key Vault-backed secret scope in Databricks, but when I click Create, I get this error: "Fetch request failed due to expired user session". I've already verified my login and permissions. I also tried refreshing and re-signing i...

Latest Reply
AnandGNR
New Contributor III
  • 0 kudos

Hi @SuMiT1, it certainly seems to be a networking issue, but I'm not able to zero in on what precisely needs to be done. I added the control-plane IPs to the firewall, but still no luck. How do we use the Databricks Access Connector to create scopes? Could you ...

4 More Replies
abhishek0306
by New Contributor
  • 92 Views
  • 3 replies
  • 0 kudos

Databricks file-based trigger to SharePoint

Hi, can we create a file-based trigger on a SharePoint location for Excel files from Databricks? My need is to copy the Excel files from SharePoint to external volumes in Databricks, so can it be done using a trigger so that whenever a file drops in ...

Latest Reply
emma_s
Databricks Employee
  • 0 kudos

Hi, you could possibly achieve something close to this using the Lakeflow Connect SharePoint connector. It's currently in beta, so it would need to be enabled in your workspace. Although it isn't triggered on file updates, because it only ingests incre...

2 More Replies
Brahmareddy
by Esteemed Contributor
  • 123 Views
  • 2 replies
  • 8 kudos

Congratulations to Matei Zaharia - CTO Databricks on the ACM Prize in Computing

When I saw the news that Matei Zaharia received the 2025 ACM Prize in Computing, I felt genuinely happy. It was not just another award announcement. It felt like a proud moment for the whole data engineering community. His work has helped shape the w...

Latest Reply
Advika
Community Manager
  • 8 kudos

@Brahmareddy, what a beautiful tribute! It’s so inspiring to hear how that meeting at the Summit stayed with you. We’re so lucky to have contributors like you who recognize the heart behind the tech. Cheers to Matei and the whole Databricks family!

1 More Replies
IM_01
by Contributor II
  • 131 Views
  • 3 replies
  • 0 kudos

Lakeflow SDP expectations

Hi, is there a way to get the number of warned, dropped, and failed records for each expectation? Currently I see it gives only an aggregated count.

Latest Reply
Ashwin_DSA
Databricks Employee
  • 0 kudos

Hi @IM_01, you can’t change the UI to break out those numbers, but you can get per-expectation counts from the DLT (Lakeflow) event log. Each expectation entry records passed_records and failed_records; for EXPECT rules failed_records = warned rows, ...
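Pulling those counts out of one event-log row's `details` payload can be sketched as below. The nested JSON shape (`flow_progress.data_quality.expectations` with `passed_records`/`failed_records`) is assumed from the DLT event log documentation, and the sample payload is hypothetical:

```python
import json

# Sketch: extract per-expectation passed/failed counts from a single
# flow_progress event's `details` JSON blob (shape assumed from the DLT
# event log docs).

def expectation_counts(details_json: str) -> dict:
    """Map expectation name -> {"passed": n, "failed": n}."""
    details = json.loads(details_json)
    exps = (details.get("flow_progress", {})
                   .get("data_quality", {})
                   .get("expectations", []))
    return {e["name"]: {"passed": e["passed_records"],
                        "failed": e["failed_records"]}
            for e in exps}

# Hypothetical sample event payload:
sample = json.dumps({
    "flow_progress": {"data_quality": {"expectations": [
        {"name": "valid_id", "passed_records": 98, "failed_records": 2},
    ]}}
})
counts = expectation_counts(sample)
```

On Databricks you would select the `details` column from the pipeline's event log (e.g. via the `event_log` table-valued function) and apply this per row, summing across flows if needed.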

2 More Replies
romquesta
by New Contributor
  • 56 Views
  • 1 reply
  • 1 kudos

Why Data Privacy Matters More Than Ever in the Digital Age

In today’s hyper-connected world, Data Privacy has become a critical concern for individuals and businesses alike. Every time we browse a website, use an app, or make an online purchase, we leave behind a trail of personal information. This data can ...

Latest Reply
Sumit_7
Honored Contributor II
  • 1 kudos

Totally agree, @romquesta, really nice summary! Did you check out Project Glasswing?

Phani1
by Databricks MVP
  • 253 Views
  • 6 replies
  • 4 kudos

Best Practices for Implementing Automated, Scalable, and Auditable Purge Mechanism on Azure Databric

Hi All, I'm looking to implement an automated, scalable, and auditable purge mechanism on Azure Databricks to manage data retention, deletion, and archival policies across our Unity Catalog-governed Delta tables. I've come across various approaches, s...

Latest Reply
AbhaySingh
Databricks Employee
  • 4 kudos

Here is my action plan if it helps!
Phase 1: Foundation
☐ Migrate to UC managed tables (if not already)
☐ Enable Predictive Optimization at catalog level
☐ Set delta.deletedFileRetentionDuration per layer
Phase 2: Retention Policies
☐ Enab...
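The per-layer retention step from Phase 1 can be sketched as below: set the deleted-file retention property, then VACUUM. Table name and duration are hypothetical; `delta.deletedFileRetentionDuration` is the documented Delta property:

```python
# Sketch: generate the two statements that implement a per-table retention
# policy - set the deleted-file retention horizon, then VACUUM.
# Identifiers and durations are hypothetical.

def retention_statements(table: str, retention: str) -> list:
    """Return [ALTER TABLE ... TBLPROPERTIES ..., VACUUM ...] for a table."""
    return [
        f"ALTER TABLE {table} SET TBLPROPERTIES "
        f"('delta.deletedFileRetentionDuration' = 'interval {retention}')",
        f"VACUUM {table}",
    ]

stmts = retention_statements("bronze.events", "30 days")
# On Databricks you would run each with spark.sql(...), and log the
# statements to an audit table to keep the purge auditable.
```

Be careful shortening the retention interval: VACUUM with an aggressive horizon can break time travel and any readers of older snapshots.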

5 More Replies