Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

fernandomendi
by New Contributor II
  • 2528 Views
  • 2 replies
  • 0 kudos

Row IDs for DLTs

Hi all, I have a DLT pipeline where I am reading from a cloud source and want to move data through some tables onto a final Gold layer table. I would like to use SQL to write my DLTs. I would also like to have a row_id for each row to identify each in...

Data Engineering
dlt
identity
row_id
Latest Reply
SP_6721
Honored Contributor
  • 0 kudos

Hi @fernandomendi, In Delta Live Tables (DLT), if you want to assign a unique identifier to each row, enabling delta.enableRowTracking and selecting _metadata.row_id directly in your SQL query is a valid approach; just be sure to include it explicitly...
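A minimal SQL sketch of what this reply describes (the table name, source path, and read_files source are placeholders; verify the property name against your DBR version):

```sql
-- Hypothetical DLT SQL sketch: enable row tracking on the table and
-- surface the tracked row id as an ordinary column.
CREATE OR REFRESH STREAMING TABLE bronze_events
TBLPROPERTIES ('delta.enableRowTracking' = 'true')
AS SELECT
  *,
  _metadata.row_id AS row_id  -- must be selected explicitly
FROM STREAM read_files('/Volumes/my_catalog/raw/events/');
```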

1 More Replies
Thanapat_S
by Contributor
  • 29786 Views
  • 9 replies
  • 5 kudos

Resolved! Can I change the default of showing the first 1,000 rows to return all records when querying?

I have to query data to show in my dashboard. But it truncates the results, showing only the first 1,000 rows. In the dashboard view, there is no option to re-execute with maximum result limits. I don't want to switch back to standard view and clic...

Latest Reply
jngnyc
New Contributor II
  • 5 kudos

I found this explanation helpful: 

8 More Replies
miki1999
by New Contributor
  • 2078 Views
  • 2 replies
  • 0 kudos

Problem connecting VSCode with Databricks

I have a problem connecting VSCode with Databricks. I am following all the steps to add Databricks in VSCode, but after the last step I get this error in VSCode: “Error connecting to the workspace: "Can't set configuration 'authProfile' without selecting a...

Latest Reply
inpappas
New Contributor II
  • 0 kudos

I faced the same problem today. The cause was that the databricks.yml file inside my bundle did not include a targets mapping. I created one like so, then added the relevant profile in the .databrickscfg file, and it got detected by the extension. Hope i...
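For illustration, a minimal `targets` mapping of the kind this reply describes (the bundle name, profile, and host are placeholders):

```yaml
# databricks.yml (hypothetical): a bundle with an explicit targets mapping,
# so the extension can match a profile from ~/.databrickscfg.
bundle:
  name: my_bundle

targets:
  dev:
    default: true
    workspace:
      profile: DEFAULT   # must exist in ~/.databrickscfg
```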

1 More Replies
Yuki
by Contributor
  • 860 Views
  • 2 replies
  • 1 kudos

ModuleNotFoundError or Permission denied are occurring when running Job with Git and All purpose

"ModuleNotFoundError" or "Permission denied" errors occur when running Jobs with a Git provider and all-purpose clusters. Which one occurs depends on the workspace, but it is reproducible. Is that no longer recommended? I can run it by serverless cluster...

Latest Reply
Yuki
Contributor
  • 1 kudos

Hi @Renu_, Thank you so much for your advice. I will try both patterns. Regards,

1 More Replies
gsouza
by New Contributor II
  • 2987 Views
  • 3 replies
  • 3 kudos

Databricks asset bundle occasionally duplicating jobs

Since last year, we have adopted Databricks Asset Bundles for deploying our workflows to the production and staging environments. The tool has proven to be quite effective, and we currently use Azure DevOps Pipelines to automate bundle deployment, tr...

Latest Reply
kevin_w_edwards
New Contributor II
  • 3 kudos

This is negatively impacting us as well. 

2 More Replies
der
by Contributor II
  • 1084 Views
  • 2 replies
  • 2 kudos

Resolved! Permission denied on shallow cloned table write on single cluster

If I want to modify a shallow cloned table with partitionOverwriteMode dynamic on a "dedicated/single user" cluster on DBR 16.4, I get the following error message: Py4JJavaError: An error occurred while calling o483.saveAsTable.: org.apache.spark.SparkExcept...

Latest Reply
der
Contributor II
  • 2 kudos

@Isi Thank you for the link to the documentation. I did not find it!

1 More Replies
TamD
by Contributor
  • 2887 Views
  • 8 replies
  • 1 kudos

Cannot apply liquid clustering via DLT pipeline

I want to use liquid clustering on a materialised view created via a DLT pipeline; however, there doesn't appear to be a valid way to do this. Via table properties: @dlt.table( name="<table name>, comment="<table description", table_propert...

Latest Reply
Anand13
New Contributor II
  • 1 kudos

Hi everyone, in our project we are trying to implement liquid clustering. We are testing liquid clustering with a test table called status_update, where we need to update the status for different market IDs. We are trying to update the status_update ...

7 More Replies
Tchalim
by New Contributor II
  • 939 Views
  • 2 replies
  • 2 kudos

Resolved! Actively Seeking Data Engineering Opportunities – Impact-Driven & Committed Profile

Hello everyone,My name is Tchalim M'Bandakpa, a passionate Data Engineer based in West Africa (Lomé, Togo), with a strong interest in distributed systems, large-scale data processing performance, and modern architectures such as the Lakehouse paradig...

Latest Reply
Advika
Databricks Employee
  • 2 kudos

Hello @Tchalim! It’s great to have such a passionate and skilled Data Engineer join the Community. Your background and technical strengths are highly valuable here. I encourage you to engage, ask questions, and share your insights. If you're looking ...

1 More Replies
ble
by New Contributor III
  • 1324 Views
  • 3 replies
  • 0 kudos

Resolved! Databricks Salesforce Connector - ActivityMetric error

Hi all, I'm experiencing an issue starting 13th May 2025 where a previously successful pipeline using the Salesforce connector is now failing, complaining that "Object ActivityMetric is not supported by the Salesforce connector", despite no change to ...

Latest Reply
Advika
Databricks Employee
  • 0 kudos

Hello @ble! This was a known issue that the engineering team had been investigating. Could you kindly confirm whether you’re still experiencing the issue or if it has been resolved?

2 More Replies
kumar_soneta
by New Contributor
  • 1320 Views
  • 1 replies
  • 1 kudos

Autoloader move file to archive immediately after processing

Hi, We are using Auto Loader with Spark streaming (Databricks file detection mode) and want to move files from the source to an archive folder immediately after processing. But I cannot reduce the retention window below 7 days. Code: .option("cloudFiles.cle...

Latest Reply
vaibhavs120
Contributor
  • 1 kudos

cloudFiles.cleanSource.retentionDuration (Type: Interval String): Amount of time to wait before processed files become candidates for archival with cleanSource. Must be greater than 7 days for DELETE. No minimum restriction for MOVE. Available in Databrick...
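As a sketch, the options quoted in this reply might be assembled like this (the format, destination path, and interval are illustrative; on Databricks they would be passed to a `cloudFiles` readStream):

```python
# Hypothetical Auto Loader option set: archive processed files with MOVE,
# which per the quoted docs has no minimum retention (unlike DELETE's 7 days).
autoloader_options = {
    "cloudFiles.format": "json",
    "cloudFiles.cleanSource": "MOVE",
    "cloudFiles.cleanSource.moveDestination": "abfss://archive@mystorage.dfs.core.windows.net/processed/",
    "cloudFiles.cleanSource.retentionDuration": "1 hour",  # below 7 days is allowed for MOVE
}

# On Databricks (not runnable here) this would be applied as:
#   (spark.readStream.format("cloudFiles")
#        .options(**autoloader_options)
#        .load(source_path))
```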

NicolaCompton
by New Contributor II
  • 2797 Views
  • 5 replies
  • 0 kudos

Error Unable to Register Model to Unity Catalog

I am following the examples outlined here: https://learn.microsoft.com/en-us/azure/databricks/machine-learning/manage-model-lifecycle/ to register a model to Unity Catalog. I keep getting this error: BlockingIOError: [Errno 11] Resource temporarily ...

Latest Reply
NicolaCompton
New Contributor II
  • 0 kudos

Thank you very much for your response and explanation. Unfortunately, this takes me back to the original error. I have tried changing the set_tracking_uri to my workspace URL and I get the same error. Any ideas what this could be?

4 More Replies
Datagyan
by New Contributor II
  • 700 Views
  • 1 replies
  • 0 kudos

Downloading the query result through rest API

Hi all, I have a specific requirement to download query results. I have created a table on Databricks using a SQL warehouse. I have to fetch the query from a custom UI using a data API token. Now I am able to fetch the query, but the problem is what ...

Latest Reply
HariSankar
Contributor III
  • 0 kudos

Hey @Datagyan, If your query result is larger than 25MB, Databricks automatically uses disposition=EXTERNAL_LINKS, which returns the result in multiple chunked files (external links). Currently, there's no option to get a single file directly from the ...
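Client-side, those chunks can be stitched into one file. A sketch, assuming each external link is downloaded as CSV text with a repeated header row (the helper name is illustrative, not part of the Databricks API):

```python
def merge_csv_chunks(chunks: list[str]) -> str:
    """Concatenate downloaded CSV chunks, keeping only the first header row.

    Assumes every chunk starts with the same header row -- an illustrative
    simplification; verify what your chosen result `format` actually returns.
    """
    if not chunks:
        return ""
    merged = chunks[0].splitlines()
    for chunk in chunks[1:]:
        merged.extend(chunk.splitlines()[1:])  # drop the repeated header
    return "\n".join(merged) + "\n"
```

Each chunk would be fetched with an ordinary HTTP GET of its pre-signed external link before being passed to the helper.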

juan_barreto
by New Contributor III
  • 827 Views
  • 1 replies
  • 0 kudos

Service Principal cannot access its own workspace folder

We are using Asset Bundles with Databricks Runtime 14.3 LTS. During DAB deployment, the wheel is built and stored in the folder of the service principal running the deployment via a GitHub workflow. The full path is /Workspace/Users/SERVICE-PRINCIPAL-ID/.bun...

Latest Reply
HariSankar
Contributor III
  • 0 kudos

You're encountering a common issue when using service principals and job clusters with workspace-scoped paths. This typically happens due to permission mismatches or cluster identity issues. Here’s a breakdown of the root cause and a recommended solut...

Nirupam
by New Contributor III
  • 2329 Views
  • 1 replies
  • 2 kudos

Resolved! Access Mode: Dedicated (assigned to a group) VS Standard

Dedicated Access mode on Azure Databricks clusters provides the option to give access to a GROUP. Trying to understand the use case when compared to Standard (formerly: Shared)? When compared to Dedicated (access given to a single user)? Ignoring - Languag...

Latest Reply
Louis_Frolio
Databricks Employee
  • 2 kudos

Dedicated Access mode on Azure Databricks clusters is an upgraded feature that extends the capabilities of single-user access mode. This mode allows a compute resource to be assigned either to a single user or to a group. It offers secure sharing amo...

Pat
by Esteemed Contributor
  • 1148 Views
  • 1 replies
  • 0 kudos

Spark custom data sources - SQS streaming reader [DLT]

Hey, I’m working on pulling data from AWS SQS into Databricks using Spark custom data sources and DLT (see https://docs.databricks.com/aws/en/pyspark/datasources). I started with a batch reader/writer based on this example: https://medium.com/@zcking/...

Latest Reply
Louis_Frolio
Databricks Employee
  • 0 kudos

For your consideration: To address the challenge of passing message handles from executors back to the driver within the DataSourceStreamReader, consider the following approaches:
Challenges in Spark Architecture
1. Executor Memory Isolation: Execut...

