Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Pratikmsbsvm
by Contributor
  • 338 Views
  • 1 reply
  • 1 kudos

Data Transfer Between 2 Databricks Instance without using Delta share.

Hello, I am not allowed to use Delta Share. What would be the best approach to send data from Databricks A to Databricks B, as shown in the diagram? What mechanism can we use to transfer the data? For example, do I need to open a port, or is there another mechanism like ...

Latest Reply
Louis_Frolio
Databricks Employee
  • 1 kudos

If the two workspaces are in the same region (same control plane), then there is no need to transfer data.  Remember, Databricks does not store your data or put it into a proprietary format.  You give Databricks the permissions to access the data tha...

prakashhinduja1
by New Contributor
  • 748 Views
  • 2 replies
  • 1 kudos

Resolved! Prakash Hinduja Switzerland (Swiss) How do I build an ETL pipeline in Databricks?

Hi, I’m Prakash Hinduja, a financial strategist born in Amritsar (India) and now residing in Geneva, Switzerland. I’m looking to build an ETL pipeline in Databricks and would love some guidance. What are the key steps I should fo...

Latest Reply
BS_THE_ANALYST
Esteemed Contributor III
  • 1 kudos

@prakashhinduja1 this is a great resource if you want to get stuck straight into an example: https://community.databricks.com/t5/get-started-guides/getting-started-with-databricks-build-a-simple-lakehouse/ta-p/67404
All the best, BS

1 More Replies
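The thread above asks for the key steps of an ETL pipeline. As a minimal illustration of the extract/transform/load flow itself, here is a pure-Python sketch with hypothetical data and no Spark dependency; in Databricks each stage would typically read and write Delta tables instead of in-memory lists:

```python
import csv
import io

# Hypothetical raw input, standing in for files landed in a bronze layer.
RAW_CSV = """order_id,country,amount
1,CH,100.50
2,IN,secret
3,CH,49.50
"""

def extract(raw_text):
    """Bronze: land the data as-is, one dict per row."""
    return list(csv.DictReader(io.StringIO(raw_text)))

def transform(rows):
    """Silver: enforce types and drop rows that fail validation."""
    clean = []
    for row in rows:
        try:
            clean.append({"order_id": int(row["order_id"]),
                          "country": row["country"],
                          "amount": float(row["amount"])})
        except ValueError:
            pass  # a real pipeline would quarantine/log bad records
    return clean

def load(rows):
    """Gold: aggregate into a reporting-friendly shape."""
    totals = {}
    for row in rows:
        totals[row["country"]] = totals.get(row["country"], 0.0) + row["amount"]
    return totals

gold = load(transform(extract(RAW_CSV)))
print(gold)  # → {'CH': 150.0}  (the 'secret' row is dropped in silver)
```

The same three stages map onto the medallion layout the getting-started guide linked above walks through.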
kamalhinduja
by New Contributor
  • 528 Views
  • 1 reply
  • 0 kudos

Resolved! Kamal Hinduja Switzerland (Swiss) What is the best way to manage Delta Lake tables in Databricks?

Hi, I'm Kamal Hinduja. I was born in Chennai, India, and I now reside in Geneva, Switzerland. Can anyone explain in detail the best way to manage Delta Lake tables in Databricks? Thanks and regards, Kamal Hinduja, Geneva, Switzerland

Latest Reply
BS_THE_ANALYST
Esteemed Contributor III
  • 0 kudos

Hi @kamalhinduja, There's a great article here: https://docs.databricks.com/aws/en/delta/best-practices If you look down the left-hand side navigation pane on the link above, you'll find a bunch of useful articles surrounding the Delta Lake and Delta...

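Much of the best-practices page linked above comes down to routine table maintenance. As a sketch, assuming a job that runs OPTIMIZE (small-file compaction), VACUUM (cleanup of files outside the retention window), and ANALYZE (optimizer statistics) over a list of tables: the helper below only assembles the SQL text, and the table name is hypothetical; in a notebook or job each statement would be executed with `spark.sql(stmt)`.

```python
def maintenance_statements(full_table_name, vacuum_retain_hours=168):
    """Build routine Delta maintenance SQL for one table.

    OPTIMIZE compacts small files, VACUUM deletes data files older than
    the retention window (168 hours = the default 7 days), and ANALYZE
    refreshes statistics used by the query optimizer.
    """
    return [
        f"OPTIMIZE {full_table_name}",
        f"VACUUM {full_table_name} RETAIN {vacuum_retain_hours} HOURS",
        f"ANALYZE TABLE {full_table_name} COMPUTE STATISTICS",
    ]

# Hypothetical three-level Unity Catalog name.
for stmt in maintenance_statements("main.sales.orders"):
    print(stmt)  # in a Databricks notebook/job: spark.sql(stmt)
```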
GC-James
by Contributor II
  • 367 Views
  • 2 replies
  • 1 kudos

Resolved! Migrating to new SQL parameters

How do I migrate this to the new SQL parameters? %sql CREATE OR REPLACE TABLE ${environment_name}.${schema_name}.cmip6_max_rainfall_${run_version} AS SELECT * FROM read_files('/Volumes/${environment_name}/${schema_name}/pluvial_flood/scratch/gfes_parqu...

Latest Reply
GC-James
Contributor II
  • 1 kudos

Thanks for the help on how to change it. I must say it seemed better the way it was before!

1 More Replies
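For context on the thread above: Databricks' newer named parameter markers replace `${name}` mustache substitution with `:name`, and per the Databricks docs, markers that stand for identifiers (catalog, schema, table names) must additionally be wrapped in the `IDENTIFIER()` clause. The mechanical part of the rewrite is plain string work; a small sketch (the query text below is a simplified, hypothetical stand-in for the one in the post):

```python
import re

# Old-style query text using ${...} substitution, as in the post.
OLD = ("CREATE OR REPLACE TABLE ${environment_name}.${schema_name}"
       ".cmip6_max_rainfall_${run_version} AS SELECT * FROM my_view")

def to_named_markers(query):
    """Rewrite ${name} placeholders as :name parameter markers.

    Caveat: this only swaps the marker syntax. Markers that name
    identifiers rather than values still need to be wrapped as
    IDENTIFIER(:name) before Databricks SQL will parse them, and the
    values are then supplied at run time, e.g.
    spark.sql(query, args={"environment_name": "dev", ...}).
    """
    return re.sub(r"\$\{(\w+)\}", r":\1", query)

new = to_named_markers(OLD)
print(new)
```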
samaiyanik
by New Contributor
  • 248 Views
  • 1 reply
  • 0 kudos

Resolved! Databricks Free Edition | RETRIES_EXCEEDED issue

Hi Team, I am not able to run the command below; I am getting an error. %sql CREATE SCHEMA IF NOT EXISTS workspace.gold; Error: The maximum number of retries has been exceeded. I tried all the available options but it's not working. Thanks, Nikhil Samaiya

Latest Reply
Advika
Databricks Employee
  • 0 kudos

Hello @samaiyanik! Could you please try the suggestions shared in the post below and let us know if that helps resolve the issue? Similar post: error: [RETRIES_EXCEEDED] The maximum number of retries has been exceeded

Subha0920
by New Contributor II
  • 679 Views
  • 3 replies
  • 1 kudos

Databricks recommended Approach to load data vault 2.0

Hi, please share the recommended approach to load Data Vault 2.0. Overview: 1. Current landscape: Lakehouse (Bronze/Silver/Gold). 2. Data Vault 2.0 to be created in the Silver layer. 3. Bronze data will be made available in Delta tables using ETL. Questions: 1. ...

Latest Reply
Subha0920
New Contributor II
  • 1 kudos

Kindly provide your valuable input and suggestions for the above questions.

2 More Replies
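On the load-pattern side of the question above: Data Vault 2.0 conventionally keys hubs and links by deterministic hashes of the business keys, and change-detects satellites with a hash diff over descriptive attributes, so reloads stay idempotent. A pure-Python sketch; MD5 over pipe-delimited, trimmed, upper-cased keys is a common convention rather than a Databricks requirement, and the key values and column names are hypothetical:

```python
import hashlib

def hash_key(*business_keys):
    """Deterministic hub/link key: MD5 over normalized business keys."""
    normalized = "||".join(str(k).strip().upper() for k in business_keys)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()

def hash_diff(record, attributes):
    """Satellite hash diff: detects changes in descriptive attributes."""
    payload = "||".join(str(record.get(a, "")).strip().upper() for a in attributes)
    return hashlib.md5(payload.encode("utf-8")).hexdigest()

# Hub key for a customer; a link key combines both hubs' business keys.
hub_customer = hash_key("C-1001")
link_customer_order = hash_key("C-1001", "O-9001")

# Reloading unchanged attributes yields the same diff, so the satellite
# load can skip writing a duplicate row.
old = hash_diff({"name": "Acme", "city": "Geneva"}, ["name", "city"])
new = hash_diff({"name": "Acme", "city": "Geneva"}, ["name", "city"])
assert old == new
```

In a Lakehouse setting these hashes would be computed as columns during the Bronze-to-Silver ETL step the post describes.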
camilo_s
by Contributor
  • 3347 Views
  • 5 replies
  • 0 kudos

Spark SQL vs serverless SQL

Are there any benchmarks showing performance and cost differences between running SQL workloads on Spark SQL vs Databricks SQL (especially serverless SQL)? Our customer is hesitant about getting locked into Databricks SQL as opposed to being able to ru...

Latest Reply
maxwarior
New Contributor II
  • 0 kudos

Spark SQL serves as the SQL interface for Spark applications, whereas Databricks SQL is a more advanced, warehouse-optimized product built around SQL Warehouses, which utilize multiple Spark clusters. This architectural difference can lead to noticea...

4 More Replies
habyphilipose
by New Contributor II
  • 388 Views
  • 3 replies
  • 4 kudos

DLT table deletion

If we delete a DLT pipeline, its tables get deleted. But in a DLT pipeline that creates 5 tables, if I comment out the logic for 1 table, that table is not deleted from the catalog, even though a full refresh of the pipeline is done. Does anyone kno...

Latest Reply
MartinIsti
New Contributor III
  • 4 kudos

Don't confuse DLT and LDP (Lakeflow Declarative Pipelines): although behind the scenes they work very similarly, the UI and the developer experience have changed immensely, and very important new features have been added. I used DLT extensively and in ...

2 More Replies
ChristianRRL
by Valued Contributor III
  • 346 Views
  • 1 reply
  • 0 kudos

Troubleshooting AutoLoader

Hi there, I am running into a bit of an issue displaying some AutoLoader readStream data. Can I get some assistance to understand how to properly troubleshoot this? I've looked at logs before, but frankly it's not clear where to look exactly. First, "...

Latest Reply
MartinIsti
New Contributor III
  • 0 kudos

I'm also working with AutoLoader these days to create an ingestion pattern, and troubleshooting it can be tricky. I wonder if you could pick a single file (whose full path / location / URI you exactly know) and read it without AutoLoader, just with spa...

ManojkMohan
by Honored Contributor
  • 291 Views
  • 1 reply
  • 2 kudos

Resolved! Sample Data Reflecting but Uploaded File Not Reflecting

Step 1: I uploaded a CSV file manually in Databricks. Step 2: Connector created and active between Salesforce and Databricks. Step 3: Creating data streams in Salesforce Data Cloud. Sample topics are reflecting, matching between what I see in Databricks ...

Latest Reply
ManojkMohan
Honored Contributor
  • 2 kudos

I resolved it myself. Step 1: Workspace --> Manage permissions. Step 2: Chose all permissions. Step 3: Went to the raw uploaded file and shared it via Delta Sharing. Step 4: In the Salesforce data stream I got the raw file.

Shruti12
by New Contributor II
  • 665 Views
  • 2 replies
  • 1 kudos

Databricks support updating multiple target rows with single matching source row in merge query?

Hi, I am getting this error in a merge statement: DeltaUnsupportedOperationException: Cannot perform Merge as multiple source rows matched and attempted to modify the same target row in the Delta table in possibly conflicting ways. Does Databricks suppor...

Latest Reply
Shruti12
New Contributor II
  • 1 kudos

Hi @szymon_dybczak, thanks for your reply. The above code is working fine, which means multiple target rows can be updated from a single source row. So it may be that when there are complex matching conditions/values, the merge query gives an error. I cannot send you...

1 More Replies
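One note on the error in the thread above: it is raised when the MERGE source contains more than one row per match key, which Delta rejects because the result would be nondeterministic. The usual fix is to deduplicate the source first, e.g. keep only the latest row per key. The key and timestamp column names below are hypothetical; this is a plain-Python sketch of that dedup step (in Spark it would be a `row_number()` window over the key, keeping rank 1):

```python
# Source rows with a duplicate key: Delta's MERGE would raise
# DeltaUnsupportedOperationException because id=1 matches twice.
source = [
    {"id": 1, "value": "a", "updated_at": "2024-01-01"},
    {"id": 1, "value": "b", "updated_at": "2024-02-01"},
    {"id": 2, "value": "c", "updated_at": "2024-01-15"},
]

def latest_per_key(rows, key="id", order_by="updated_at"):
    """Keep only the most recent row per merge key.

    ISO-formatted date strings compare correctly lexicographically,
    so a plain > works for the ordering here.
    """
    best = {}
    for row in rows:
        k = row[key]
        if k not in best or row[order_by] > best[k][order_by]:
            best[k] = row
    return list(best.values())

deduped = latest_per_key(source)
# One row per key survives; for id=1 the newer "b" row wins,
# so the subsequent MERGE sees at most one source row per target row.
assert sorted(r["id"] for r in deduped) == [1, 2]
```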
arsamkull
by New Contributor III
  • 6670 Views
  • 6 replies
  • 6 kudos

Usage of Azure DevOps System.AccessToken as PAT in Databricks

Hi there! I'm trying to use an Azure DevOps pipeline to automate the Azure Databricks Repos API. I'm using the following workflow: get an access token for a Databricks service principal using a certificate (which works great), then use the REST API to generate Git cre...

Latest Reply
Srihasa_Akepati
Databricks Employee
  • 6 kudos

@Adrian Ehrsam The PAT limit has been increased to 2048 now. Please check.

5 More Replies
ManojkMohan
by Honored Contributor
  • 313 Views
  • 1 reply
  • 0 kudos

Request to create a cluster failed with an exception: RESOURCE_EXHAUSTED

My compute is erroring out with the error: Clusters are failing to launch. Cluster launch will be retried. Request to create a cluster failed with an exception: RESOURCE_EXHAUSTED: Cannot create the resource, please try again later. Any suggestions?

Latest Reply
Advika
Databricks Employee
  • 0 kudos

Hello @ManojkMohan! This error may occur if your account has reached the serverless compute quota in the region. This quota limits the number of serverless compute resources you can use concurrently. You can find more details here: Quotas for Serverl...

filipniziol
by Esteemed Contributor
  • 914 Views
  • 1 reply
  • 2 kudos

Merge slows down when the table grows with liquid clustering enabled.

Hi everyone, I have a source table, a target table, and a MERGE statement that is inserting/updating records every couple of minutes. The clustering keys are set up to match the 2 merge join columns. I noticed that over time the processing time increase...

Latest Reply
kerem
Contributor
  • 2 kudos

Hi @filipniziol ,I dealt with a large table of about a TB in size with liquid clustering enabled. Even with Liquid Clustering, selects and joins on the clustered columns took longer as the table grew. So I don't think it performs as fast as the table...

vamsi_simbus
by New Contributor III
  • 768 Views
  • 5 replies
  • 0 kudos

Databricks System Table system.billing.usage Not Capturing Job Data in Real-Time

We’ve observed that the system.billing.usage table in Databricks is not capturing job usage data in real-time. There appears to be a noticeable delay between when jobs are executed and when their corresponding usage records appear in the system table...

Latest Reply
vamsi_simbus
New Contributor III
  • 0 kudos

Hi @szymon_dybczak, is there any alternative approach to find the DBU usage of currently running jobs?

4 More Replies
