Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Akshatkumar69
by Visitor
  • 31 Views
  • 0 replies
  • 0 kudos

Metric views joins

I am currently working on a migration project from Power BI to AI/BI dashboards in Databricks. I am using metric views to recreate, in YAML, all the measures and DAX queries from my Power BI report, but the main prob...

subray
by Visitor
  • 53 Views
  • 3 replies
  • 0 kudos

databricks-connect serverless GRPC issue

Queries executed via Databricks Connect v17 (Spark Connect / gRPC) on serverless compute complete successfully on the server side (Spark tasks finish, results are produced), but the Spark Connect gRPC channel fails to deliver results back to the client ...

Latest Reply
anuj_lathi
Databricks Employee
  • 0 kudos

This is a well-known class of issue with gRPC/HTTP2 long-lived streams being killed by network intermediaries. The fact that the Databricks SQL Connector (poll-based HTTP/1.1) works perfectly while Spark Connect (gRPC/HTTP2 streaming) fails is the ke...
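When a long-lived stream is cut by an intermediary while the server-side work already succeeded, the usual client-side mitigation is to retry the result fetch. A minimal plain-Python sketch of that retry-with-backoff pattern follows; `TransientStreamError` and `fetch_results` are hypothetical names for illustration, not part of the Databricks Connect API.

```python
import time

# Hypothetical sketch: retrying an operation whose long-lived stream may be
# torn down by a network intermediary (proxy/LB idle timeouts on gRPC/HTTP2).
class TransientStreamError(Exception):
    pass

def with_stream_retry(fn, attempts=3, base_delay=0.01):
    """Retry fn() on transient stream failures, with exponential backoff."""
    for i in range(attempts):
        try:
            return fn()
        except TransientStreamError:
            if i == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** i))  # back off before reattaching

# Demo: the first call fails as if the stream was reset mid-transfer;
# the retry succeeds and returns the already-computed results.
calls = {"n": 0}
def fetch_results():
    calls["n"] += 1
    if calls["n"] == 1:
        raise TransientStreamError("stream reset by intermediary")
    return [1, 2, 3]

result = with_stream_retry(fetch_results)
print(result)  # [1, 2, 3]
```

This only papers over the symptom; the underlying fix is usually raising idle timeouts on the intermediary or enabling keepalives on the channel.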

2 More Replies
ittzzmalind
by New Contributor II
  • 78 Views
  • 1 reply
  • 0 kudos

Resolved! Accessing Azure Databricks Workspace via Private Endpoint and On-Premises Proxy

Public access to the Azure Databricks workspace is currently disabled. Access is required through a Private Link (private endpoint – api_ui). A private endpoint has already been configured successfully: Virtual Network: Vnet-PE-ENDPOINT, Subnet: Snet-PE-...

Latest Reply
anuj_lathi
Databricks Employee
  • 0 kudos

This is a classic hub-spoke + on-premises hybrid networking scenario. Here's how to architect it end-to-end. Architecture overview: the traffic flow will be VM (VNet-App) --> ExpressRoute/VPN Gateway --> On-Prem Proxy Server --> ExpressRoute/VPN Gate...

FAHADURREHMAN
by New Contributor III
  • 77 Views
  • 2 replies
  • 1 kudos

Resolved! DELTA Merge taking too much Time

Hi Legends, I have a time-series Delta table of 707.1 GiB, 7,702 files, and 262 billion rows. The table is clustered on two columns (a timestamp column and a descriptive column). I have designed a pipeline which runs every w...

Latest Reply
anuj_lathi
Databricks Employee
  • 1 kudos

Great question -- slow MERGE is one of the most common Delta Lake performance issues. Here's a systematic checklist: 1. Partition Pruning in the MERGE Condition The #1 cause of slow MERGEs is missing the partition column in your ON clause. If your ta...
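The reason the partition (or clustering) column belongs in the ON clause is file skipping: Delta keeps per-file min/max statistics, and a bounded predicate lets the MERGE ignore files whose range cannot match. A plain-Python sketch of that skipping decision (illustrative only, not Delta Lake's implementation):

```python
# Each "file" carries per-column min/max stats, as Delta files do for a
# timestamp column. Values are invented for illustration.
files = [
    {"name": "f1", "ts_min": 100, "ts_max": 199},
    {"name": "f2", "ts_min": 200, "ts_max": 299},
    {"name": "f3", "ts_min": 300, "ts_max": 399},
]

def files_to_rewrite(files, lo, hi):
    """Keep only files whose [min, max] range overlaps the MERGE's
    timestamp predicate; the rest are skipped without being read."""
    return [f["name"] for f in files if f["ts_max"] >= lo and f["ts_min"] <= hi]

# A MERGE whose ON clause bounds the timestamp touches 1 of 3 files...
narrow = files_to_rewrite(files, 250, 260)
print(narrow)  # ['f2']
# ...while one without the predicate must consider every file.
full = files_to_rewrite(files, float("-inf"), float("inf"))
```

On a 262-billion-row table, the difference between rewriting one narrow slice of files and scanning all 7,702 is usually the whole story.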

1 More Replies
ChristianRRL
by Honored Contributor
  • 161 Views
  • 5 replies
  • 1 kudos

Get task_run_id that is nested in a job_run task

Hi, I'm wondering if there is an easier way to accomplish this. I can use a Dynamic Value reference to pull the run_id of Parent 1 into Parent 2; however, what I'm looking for is for Child 1's task run_id to be referenced within Parent 2. Currently I am ...

Latest Reply
anuj_lathi
Databricks Employee
  • 1 kudos

Hi @ChristianRRL  you're absolutely right, and I apologize for the earlier suggestion. I've verified that task values from child jobs are not propagated back through run_job tasks. Your instinct about the REST API was correct. Here's the fix: Solutio...
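Since task values don't propagate back through run_job tasks, the REST route means fetching the child run with the Jobs `runs/get` endpoint and reading the nested `tasks` array. A sketch of the extraction step follows; the payload below is a hand-made minimal example of that documented shape (a `tasks` array whose entries carry `task_key` and `run_id`), with invented values.

```python
# Minimal, hand-made example of a Jobs "runs/get" response body.
run_payload = {
    "run_id": 1000,
    "tasks": [
        {"task_key": "child_1", "run_id": 2001},
        {"task_key": "child_2", "run_id": 2002},
    ],
}

def task_run_id(run, task_key):
    """Return the run_id of the named task within a job run, else None."""
    for task in run.get("tasks", []):
        if task.get("task_key") == task_key:
            return task["run_id"]
    return None

print(task_run_id(run_payload, "child_1"))  # 2001
```

In practice the parent run's run_job task exposes the child's run_id, which you pass to `runs/get` to resolve the individual task run_ids.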

4 More Replies
shan-databricks
by Databricks Partner
  • 121 Views
  • 3 replies
  • 0 kudos

Resolved! Invoking one job from another to execute a specific task

I have multiple tasks, each working with different tables. Each table has dependencies across Bronze, Silver, and Gold layers. I want to trigger and run a specific task independently, instead of running all tasks in the job. How can I do this? Also, ...

Latest Reply
rohan22sri
New Contributor II
  • 0 kudos

1. Go to the job and click on the task you want to run.
2. Click the play button (highlighted in yellow in the attachment).
3. This ensures that only that single task runs, not the whole job.
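To do the same thing programmatically, the Jobs run-now endpoint accepts (as of recent Jobs API versions, so verify against the Jobs API reference for your workspace) an `only` list of task keys that restricts the run to a subset of tasks. A sketch that only builds the request body, with an invented job ID and task key; no call is made:

```python
import json

def run_now_payload(job_id, task_keys):
    """Build a run-now body restricted to the given task keys via "only"."""
    return json.dumps({"job_id": job_id, "only": list(task_keys)})

# Hypothetical job 123 with a single selected task.
body = run_now_payload(123, ["silver_customers"])
print(body)
```

POST this body to the run-now endpoint with your usual authentication to trigger just the selected tasks.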

2 More Replies
kevinleindecker
by New Contributor II
  • 269 Views
  • 4 replies
  • 1 kudos

SQL Warehouse error: "Cannot read properties of undefined (reading 'data')" when querying system tab

Queries that previously worked started failing in SQL Warehouse (Dashboards) without any changes on our side. The query succeeds but fails to render results, with the error: "Cannot read properties of undefined (reading 'data')". This happens with: - system.b...

Latest Reply
Esgario
Visitor
  • 1 kudos

Same problem here. I have previously reported this issue, and it had been resolved at the time. However, the problem has now reoccurred. When ingesting large tables (over 100k rows), the system is unable to properly render the data, preventing the tab...

3 More Replies
AanchalSoni
by Databricks Partner
  • 216 Views
  • 7 replies
  • 6 kudos

Resolved! Primary key constraint not working

I've created a Lakeflow job to run 5 notebook tasks, one for each silver table: Customers, Accounts, Transactions, Loans, and Branches. In the Customers notebook, after writing the data to a Delta table using Auto Loader, I'm applying the non-null and primar...

Latest Reply
balajij8
Contributor
  • 6 kudos

@AanchalSoni Capturing the columns as a primary key helps users and tools understand relationships in the data. You can also create a primary key with RELY, which enables optimizations in some cases by skipping redundant operations. Distinct elimination: when you apply a DI...
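The optimization the reply describes boils down to: when a column is declared unique via a RELY'd primary key, deduplicating on it is a no-op the planner can skip entirely. A plain-Python sketch of that decision, illustrative only and not the Databricks optimizer:

```python
def select_distinct(rows, key, key_is_primary=False):
    """Dedupe rows on `key`, unless a trusted uniqueness constraint
    (e.g. PRIMARY KEY ... RELY) already guarantees distinctness."""
    if key_is_primary:
        return rows  # constraint guarantees uniqueness; skip the dedup pass
    seen, out = set(), []
    for r in rows:
        if r[key] not in seen:
            seen.add(r[key])
            out.append(r)
    return out

rows = [{"id": 1}, {"id": 2}]
# With the constraint trusted, the input is returned untouched: no shuffle,
# no hashing, no extra pass over the data.
assert select_distinct(rows, "id", key_is_primary=True) is rows
```

Note that RELY tells the optimizer to trust the constraint without verifying it, so it is only safe when your pipeline actually enforces uniqueness.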

6 More Replies
mjedy78
by New Contributor II
  • 2122 Views
  • 4 replies
  • 1 kudos

Transition from partitioned table to Liquid clustered table

Hi all, I have a table called classes, which is already partitioned on three different columns. I want to create a liquid clustered table, but as far as I understand from the documentation (and from Dany Lee and his team), it was not possible as of 2024 ...

Latest Reply
biancaorita
New Contributor II
  • 1 kudos

Is there a plan to implement a way to migrate to liquid clustering for an existing table that has traditional partitioning and that is quite large (over 4 TB)? Re-creating such tables from scratch is not always ideal.

3 More Replies
AnandGNR
by New Contributor III
  • 219 Views
  • 7 replies
  • 1 kudos

Unable to create secret scope - "Fetch request failed due expired user session"

Hi everyone, I'm trying to create an Azure Key Vault-backed secret scope in a Databricks Premium workspace, but I keep getting this error: "Fetch request failed due expired user session". Setup details: Databricks workspace: Premium; Azure Key Vault: Owner p...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 1 kudos

Hi @AnandGNR, try the following: go to your Key Vault, then under Firewalls and virtual networks enable "Allow trusted Microsoft services to bypass this firewall."

6 More Replies
SuMiT1
by New Contributor III
  • 2336 Views
  • 5 replies
  • 0 kudos

Unable to Create Secret Scope in Databricks – “Fetch request failed due to expired user session”

I’m trying to create an Azure Key Vault-backed secret scope in Databricks, but when I click Create, I get this error: "Fetch request failed due to expired user session". I’ve already verified my login and permissions. I also tried refreshing and re-signing i...

Latest Reply
AnandGNR
New Contributor III
  • 0 kudos

Hi @SuMiT1, it certainly seems to be a networking issue, but I'm not able to zero in on what precisely needs to be done. I added the control-plane IPs to the firewall but still no luck. How do we use a Databricks Access Connector to create scopes? Could you ...

4 More Replies
abhishek0306
by New Contributor
  • 100 Views
  • 3 replies
  • 0 kudos

Databricks file based trigger to sharepoint

Hi, can we create a file-based trigger in Databricks for Excel files in a SharePoint location? My need is to copy the Excel files from SharePoint to external volumes in Databricks, so can it be done using a trigger such that whenever a file drops in ...

Latest Reply
emma_s
Databricks Employee
  • 0 kudos

Hi, you could possibly achieve something close to this using the Lakeflow Connect SharePoint connector. It's currently in beta, so it would need to be enabled in your workspace. Although it isn't triggered on file updates, because it only ingests incre...

2 More Replies
Brahmareddy
by Esteemed Contributor
  • 145 Views
  • 2 replies
  • 8 kudos

Congratulations to Matei Zaharia - CTO Databricks on the ACM Prize in Computing

When I saw the news that Matei Zaharia received the 2025 ACM Prize in Computing, I felt genuinely happy. It was not just another award announcement. It felt like a proud moment for the whole data engineering community. His work has helped shape the w...

Latest Reply
Advika
Community Manager
  • 8 kudos

@Brahmareddy, what a beautiful tribute! It’s so inspiring to hear how that meeting at the Summit stayed with you.We’re so lucky to have contributors like you who recognize the heart behind the tech. Cheers to Matei and the whole Databricks family!

1 More Replies
IM_01
by Contributor II
  • 142 Views
  • 3 replies
  • 0 kudos

Lakeflow SDP expectations

Hi, is there a way to get the number of warned, dropped, and failed records for each expectation? Currently I only see an aggregated count.

Latest Reply
Ashwin_DSA
Databricks Employee
  • 0 kudos

Hi @IM_01, You can’t change the UI to break out those numbers, but you can get per-expectation counts from the DLT (Lakeflow) event log. Each expectation entry records passed_records and failed_records; for EXPECT rules failed_records = warned rows, ...
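The counts the reply points at live inside each flow_progress event's `details` JSON, under `data_quality.expectations`. A sketch of the tallying step follows; the sample row below is hand-made in that shape, with invented names and numbers.

```python
import json

# Hand-made example of one event-log row's "details" column, shaped like the
# DLT/Lakeflow flow_progress events that carry expectation metrics.
sample_details = json.dumps({
    "flow_progress": {
        "data_quality": {
            "expectations": [
                {"name": "valid_id", "dataset": "silver_customers",
                 "passed_records": 95, "failed_records": 5},
            ]
        }
    }
})

def expectation_counts(details_json):
    """Map expectation name -> (passed_records, failed_records)."""
    details = json.loads(details_json)
    exps = (details.get("flow_progress", {})
                   .get("data_quality", {})
                   .get("expectations", []))
    return {e["name"]: (e["passed_records"], e["failed_records"]) for e in exps}

print(expectation_counts(sample_details))  # {'valid_id': (95, 5)}
```

In a real pipeline you would query the event log table for `event_type = 'flow_progress'` rows and apply this parsing per row, summing across updates as needed.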

2 More Replies