Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Worrachon
by Visitor
  • 18 Views
  • 0 replies
  • 0 kudos

Databricks cannot run pipeline

I found that when I run the pipeline, it shows the message "'Cannot run pipeline', 'PL_TRNF_CRM_SALESFORCE_TO_BLOB', "HTTPSConnectionPool(host='management.azure.com', port=443). It doesn't happen in every instance, but I encounter it often.

kmodelew
by New Contributor III
  • 315 Views
  • 9 replies
  • 15 kudos

Unable to read Excel file from Volume

Hi, I'm trying to read an Excel file directly from a Volume (not workspace or filestore); all the examples on the internet use workspace or filestore. The Volume is an external location, so I can read from there, but I would like to read directly from the Volume. I hav...

Latest Reply
ck7007
New Contributor II
  • 15 kudos

@szymon_dybczak @BS_THE_ANALYST @TheOC: The Actual Working Solution
The pandas approach works (as @kmodelew confirmed), but here's the complete, tested solution for reading Excel from Volumes:
# For Excel files in Unity Catalog Volumes
import pandas as p...

8 More Replies
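
The reply above is cut off, so the following is only a minimal sketch of the pandas approach it mentions, not the poster's actual code. It assumes the Volume is reachable under the `/Volumes/<catalog>/<schema>/<volume>/...` path convention and that the `openpyxl` engine is installed on the cluster. The catalog, schema, volume, and file names are hypothetical.

```python
from pathlib import PurePosixPath


def volume_path(catalog: str, schema: str, volume: str, *parts: str) -> str:
    """Build a Unity Catalog Volume path: /Volumes/<catalog>/<schema>/<volume>/..."""
    return str(PurePosixPath("/Volumes", catalog, schema, volume, *parts))


def read_excel_from_volume(path: str, sheet_name=0):
    """Read one Excel sheet into a pandas DataFrame straight from a Volume path."""
    import pandas as pd  # imported lazily so the path helper works without pandas

    # openpyxl is the pandas engine for .xlsx files and must be installed separately.
    return pd.read_excel(path, sheet_name=sheet_name, engine="openpyxl")


# Hypothetical names, for illustration only.
path = volume_path("main", "sales", "landing", "reports", "q1.xlsx")
print(path)
```

If a Spark DataFrame is needed afterwards, the pandas result can usually be converted with `spark.createDataFrame(...)`.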
ck7007
by New Contributor II
  • 80 Views
  • 4 replies
  • 2 kudos

Streaming Solution

Maintain Zonemaps with Streaming Writes
Challenge: Streaming breaks zonemaps due to constant micro-batches.
Solution: Incremental Updates
def write_streaming_with_zonemap(stream_df, table_path):
    def update_zonemap(batch_df, batch_id):
        # Write data
        batch_d...

Latest Reply
ManojkMohan
Contributor III
  • 2 kudos

@ck7007 I brainstormed some solution approaches. Do you have some test data to try these hands-on?
Approach | Throughput | Query Speed | Complexity | Notes
Partition-level zonemaps | High | Medium | Low | Scales with micro-batches; prune at pa...

3 More Replies
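
The code in the original post is truncated, so this is a hedged, pure-Python illustration of the "incremental updates" idea rather than the poster's implementation. It keeps min/max statistics per partition and merges each micro-batch into them instead of recomputing from scratch. `ZoneStats`, `update_zonemap`, and `partitions_to_scan` are hypothetical names; a real pipeline would presumably do the merge inside a streaming batch callback and persist the statistics somewhere durable.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass
class ZoneStats:
    min_value: float
    max_value: float
    row_count: int


def update_zonemap(zonemap: Dict[str, ZoneStats], batch: Iterable[Tuple[str, float]]) -> None:
    """Merge one micro-batch of (partition, value) rows into the zonemap in place."""
    for partition, value in batch:
        stats = zonemap.get(partition)
        if stats is None:
            # First row seen for this partition: min and max are both the value.
            zonemap[partition] = ZoneStats(value, value, 1)
        else:
            stats.min_value = min(stats.min_value, value)
            stats.max_value = max(stats.max_value, value)
            stats.row_count += 1


def partitions_to_scan(zonemap: Dict[str, ZoneStats], low: float, high: float) -> List[str]:
    """Return partitions whose [min, max] range overlaps the query range [low, high]."""
    return sorted(p for p, s in zonemap.items() if s.max_value >= low and s.min_value <= high)


zonemap: Dict[str, ZoneStats] = {}
update_zonemap(zonemap, [("2025-09-01", 10.0), ("2025-09-01", 42.0)])
update_zonemap(zonemap, [("2025-09-02", 100.0), ("2025-09-01", 5.0)])
print(partitions_to_scan(zonemap, 50.0, 200.0))  # prints ['2025-09-02']
```

Because each batch only widens existing ranges, the update cost depends on the batch size, not on the table size.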
noorbasha534
by Valued Contributor II
  • 37 Views
  • 3 replies
  • 0 kudos

Cost attribution based on table history statistics

Hello all, I have a job that processes 50 tables: 25 belong to the finance, 20 to the master data, and 5 to the supply chain data domains. Now, imagine the job ran for 14 hours and cost me 1000 euros in a day. If I like to attribute the per day cost...

Latest Reply
ManojkMohan
Contributor III
  • 0 kudos

Root Cause / Why executionTimeMs isn't ideal
executionTimeMs includes everything the job did:
  • Waiting for resources
  • Shuffle, GC, or network latency
  • Contention with other concurrent jobs
Using this to allocate costs can misattribute costs, especially if so...

2 More Replies
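
As a worked example of the arithmetic behind the question, here is a minimal sketch of proportional cost allocation. It is not the method proposed in the thread: it simply splits a total in proportion to whatever weight is chosen. `allocate_cost` is a hypothetical helper; the table counts and the 1000 euro total come from the question.

```python
from typing import Dict


def allocate_cost(total_cost: float, weights: Dict[str, float]) -> Dict[str, float]:
    """Split total_cost across keys in proportion to their weights."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return {key: total_cost * weight / total_weight for key, weight in weights.items()}


# Table counts per data domain, taken from the question above.
tables_per_domain = {"finance": 25, "master_data": 20, "supply_chain": 5}
by_table_count = allocate_cost(1000.0, tables_per_domain)
print(by_table_count)
# prints {'finance': 500.0, 'master_data': 400.0, 'supply_chain': 100.0}
```

Table count is the crudest possible weight. The same function works with any per-domain metric summed from table history, such as rows or bytes written if your history records them, which may track real resource use more closely than elapsed time.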
ManojkMohan
by Contributor III
  • 277 Views
  • 15 replies
  • 13 kudos

Ingesting 100 TB raw CSV data into the Bronze layer in Parquet + Snappy

Problem I am trying to solve: Bronze is the landing zone for immutable, raw data. At this stage, I am trying to use a columnar format (Parquet or ORC) → good compression, efficient scans, and then apply lightweight compression (e.g., Snappy) → balances...

Latest Reply
ManojkMohan
Contributor III
  • 13 kudos

@szymon_dybczak @BS_THE_ANALYST @Coffee77 @TheOC the use case summary is as below. The use case: A telecom operator wants to minimize unnecessary truck rolls (sending technicians to customer sites), which cost $100–$200 per visit. Data sources feeding...

14 More Replies
dbdev
by New Contributor II
  • 610 Views
  • 10 replies
  • 4 kudos

Maven libraries in VNet injected, UC enabled workspace on Standard Access Mode Cluster

Hi! As the title suggests, I want to install Maven libraries on my cluster with access mode 'Standard'. Our workspace is VNet injected and has Unity Catalog enabled. The coordinates have been allowlisted by the account team according to these instructio...

Latest Reply
dbdev
New Contributor II
  • 4 kudos

We have resolved the Metastore issue, which also seems to have resolved the JAR issue. I don't have a clue why this resolved it. The network people might have used service tags, which also opened the workspace to the ODBC connections?

9 More Replies
seefoods
by Contributor II
  • 76 Views
  • 4 replies
  • 1 kudos

Read JSON files on Unity Catalog

Hello guys, I have an issue when I load several JSON files that have the same schema on Databricks. When I do
2025_07_17_19_55_00_2025_07_31_21_55_00_17Q51D_alice_out.json 516.13 KB
2025_07_17_19_55_00_2025_07_31_21_55_00_17Q51D_bob_out.json 516.13 K...

Latest Reply
seefoods
Contributor II
  • 1 kudos

Hello @szymon_dybczak, it's OK, I have checked the history of the table. I'm confused about the difference between the display() output and the actual output of the write operation. Thanks

3 More Replies
j_unspeakable
by New Contributor III
  • 738 Views
  • 2 replies
  • 2 kudos

Resolved! Permission Denied when Creating External Tables Using Workspace Default Credential

I’m building out schemas, volumes, and external Delta tables in Unity Catalog via Terraform. The schemas and volumes are created successfully, but all external tables are failing. The error message from Terraform doesn't highlight what the issue is bu...

Latest Reply
IanB
Visitor
  • 2 kudos

I had the same issue, thank you my dear

1 More Reply
victorNilsson
by New Contributor II
  • 108 Views
  • 3 replies
  • 2 kudos

Read polars from recently created csv file

More and more Python packages are transitioning to polars instead of e.g. pandas. This causes a problem in Databricks when trying to read a CSV file using pl.read_csv("filename.csv") when the file has been created in the same notebook cel...

Labels: Data Engineering, csv, file system, OSError, polars
Latest Reply
Pilsner
Contributor
  • 2 kudos

Hello @victorNilsson, thank you for letting me know how to replicate the issue; I was able to get the same error this time. I've given the problem another go and think I have been able to fix it by specifying the output path as "/tmp/test.csv". By wri...

2 More Replies
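
A minimal sketch of the workaround described in the reply above: write the CSV to the driver's local temp directory first, then read it back with polars. The reply uses "/tmp/test.csv"; this sketch uses `tempfile.gettempdir()`, which is normally `/tmp` on Linux. The helper names and sample rows are hypothetical, and whether this avoids the error in every runtime is not confirmed by the thread.

```python
import csv
import os
import tempfile


def write_csv_local(rows, path):
    """Write rows (lists of values) to a CSV file on the local disk and return the path."""
    with open(path, "w", newline="") as handle:
        csv.writer(handle).writerows(rows)
    return path


def read_with_polars(path):
    """Read the CSV back with polars; imported lazily so the writer works without it."""
    import polars as pl

    return pl.read_csv(path)


# Hypothetical sample data written to the local temp directory.
local_path = os.path.join(tempfile.gettempdir(), "test.csv")
write_csv_local([["id", "name"], [1, "a"], [2, "b"]], local_path)
print(os.path.exists(local_path))
```

Calling `read_with_polars(local_path)` in the same cell would then load the file from local disk.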
stucas
by New Contributor
  • 64 Views
  • 1 reply
  • 0 kudos

DLT Pipeline and Pivot tables

TLDR: Can DLT determine a dynamic schema, one which is generated from the results of a pivot? Issue: I know you can't use Spark `.pivot` in a DLT pipeline and that if you wish to pivot data you need to do that outside of the DLT decorated functions. I have...

Latest Reply
SP_6721
Contributor III
  • 0 kudos

Hi @stucas, adding the following configuration, spark.databricks.delta.schema.autoMerge.enabled = true, to the DLT pipeline will allow new pivoted columns to be merged into the target table automatically. However, DLT still requires a defined schema at init...
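
For illustration, the setting named in this reply can be written out as a fragment of the pipeline settings. This is a sketch under the assumption that the key is supplied through the pipeline's `configuration` key-value map, where values are strings. Only that map is shown; `pipeline_settings_fragment` is a hypothetical variable name.

```python
import json

# Fragment of a DLT pipeline settings document; all other fields are omitted.
pipeline_settings_fragment = {
    "configuration": {
        # Key taken from the reply above; configuration values are strings.
        "spark.databricks.delta.schema.autoMerge.enabled": "true",
    }
}

print(json.dumps(pipeline_settings_fragment, indent=2))
```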

akdav
by New Contributor II
  • 491 Views
  • 13 replies
  • 6 kudos

Resolved! Job File Event Trigger not firing for SftpCommit and SftpCreate

Hi there, we are using an Azure Storage Account and its SFTP feature. We work with third parties that submit reports to us via SFTP into Azure Blob Storage. We have set up a File Trigger for that external location. Everything works fine if you up...

Latest Reply
akdav
New Contributor II
  • 6 kudos

Hi Dimitry, you need to go to the external_location, then turn off file events for that external_location. Then you still select File Trigger. It will then evaluate the external_location. It will give you a message that you can only track up to 10k Fi...

12 More Replies
shubham007
by New Contributor III
  • 156 Views
  • 9 replies
  • 2 kudos

Databricks Lakebridge: Azure SQL DB to Databricks (Error while import)

Hi community experts, I am getting the error "cannot import name 'recon' from 'databricks.labs.lakebridge.reconcile.execute'" when importing modules, as shown in the attached screenshot. I am following the steps as mentioned in your partner training module "Lakebridge f...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 2 kudos

Hi @shubham007, they refactored that module last month, which is why it stopped working. The Lakebridge for SQL Source System Migration module was probably recorded before that change. And why did they make the change? It is explained here: Split recon...

8 More Replies
Dimitry
by Contributor III
  • 170 Views
  • 11 replies
  • 3 kudos

Resolved! Unreliable file events on Azure Storage (SFTP) for job trigger

Hi all, I have a job triggered by a file event on the external location. The location and job triggers work fine when uploading a file via the Azure Portal. I need an SFTP trigger, so I went into the Event Grid, found the subscription for the storage account on ...

Latest Reply
Dimitry
Contributor III
  • 3 kudos

Update: It appears that even uploading via the UI does not trigger it any more. It did trigger weeks ago. I have just uploaded a file in the UI and saw this message in the storage queue: {"topic":"/subscriptions/xxx/resourceGroups/xxx/providers/Microsoft.Storage/st...

10 More Replies
shubham007
by New Contributor III
  • 218 Views
  • 1 reply
  • 0 kudos

Databricks Lakebridge: Azure SQL DB to Databricks (Error in Data and Schema Validation)

Hi community experts, I am getting an error during Data and Schema Validation with the Reconciler, as shown in the attached screenshots. Please help resolve this issue. Output:

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @shubham007, as stated in another thread, I think this error could be related to a misconfiguration on your side. Lakebridge is trying to find the following table in your SQL Server instance -> None.SalesLT.customer. But look at which database reconciliat...

