Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

samaiyanik
by New Contributor
  • 472 Views
  • 1 reply
  • 0 kudos

Resolved! Databricks Free Edition | RETRIES_EXCEEDED issue

Hi Team, I am not able to run the command below; I keep getting an error:

%sql
CREATE SCHEMA IF NOT EXISTS workspace.gold;

Error: The maximum number of retries has been exceeded.

I have tried all the available options but nothing worked.

Thanks,
Nikhil Samaiya

Latest Reply
Advika
Community Manager
  • 0 kudos

Hello @samaiyanik! Could you please try the suggestions shared in the post below and let us know if that helps resolve the issue?
Similar post: error: [RETRIES_EXCEEDED] The maximum number of retries has been exceeded

Subha0920
by Databricks Partner
  • 1448 Views
  • 3 replies
  • 1 kudos

Databricks recommended Approach to load data vault 2.0

Hi, please share the recommended approach to load Data Vault 2.0.

Overview
1. Current landscape: Lakehouse (Bronze/Silver/Gold)
2. Data Vault 2.0 to be created in the Silver layer.
3. Bronze data will be made available in Delta tables using ETL.

Questions
1. ...

Latest Reply
Subha0920
Databricks Partner
  • 1 kudos

Kindly provide your valuable input and suggestions on the questions above.
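While waiting for input, here is a minimal insert-only hub-load sketch in the spirit of Data Vault 2.0, run from a Databricks notebook. All table and column names (bronze.customers, silver.hub_customer, customer_id) are hypothetical examples, not a Databricks-recommended pattern:

```python
# Hypothetical sketch of an insert-only Data Vault 2.0 hub load in the Silver
# layer. The hash key is derived deterministically from the business key, and
# only keys not yet present in the hub are inserted.
HUB_LOAD_SQL = """
INSERT INTO silver.hub_customer (hub_customer_hk, customer_id, load_dts, record_source)
SELECT
    sha2(upper(trim(src.customer_id)), 256) AS hub_customer_hk,
    src.customer_id,
    current_timestamp()                     AS load_dts,
    'bronze.customers'                      AS record_source
FROM bronze.customers AS src
LEFT ANTI JOIN silver.hub_customer AS hub   -- keep only business keys new to the hub
    ON hub.hub_customer_hk = sha2(upper(trim(src.customer_id)), 256)
"""

def load_hub(spark):
    """Run the hub load; requires an active Spark session on Databricks."""
    spark.sql(HUB_LOAD_SQL)
```

Links and satellites would follow the same shape: hash the key (or key combination), anti-join against what already exists, and insert only the delta.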

2 More Replies
camilo_s
by Databricks Partner
  • 4606 Views
  • 5 replies
  • 0 kudos

Spark SQL vs serverless SQL

Are there any benchmarks showing performance and cost differences between running SQL workloads on Spark SQL vs Databricks SQL (especially serverless SQL)? Our customer is hesitant about getting locked into Databricks SQL as opposed to being able to ru...

Latest Reply
maxwarior
New Contributor II
  • 0 kudos

Spark SQL serves as the SQL interface for Spark applications, whereas Databricks SQL is a more advanced, warehouse-optimized product built around SQL Warehouses, which utilize multiple Spark clusters. This architectural difference can lead to noticea...

4 More Replies
habyphilipose
by New Contributor II
  • 1151 Views
  • 3 replies
  • 4 kudos

DLT table deletion

If we delete the DLT pipeline, the tables get deleted. But in a DLT pipeline which creates 5 tables, if I comment out the logic of 1 table, that table is not deleted from the catalog, even though a full refresh of the pipeline is done. Does anyone kno...

Latest Reply
MartinIsti
Databricks Partner
  • 4 kudos

Don't confuse DLT and LDP (Lakeflow Declarative Pipelines): although behind the scenes they work very similarly, the UI and the developer experience have changed immensely, and very important new features have been added. I used DLT extensively and in ...

2 More Replies
ChristianRRL
by Honored Contributor
  • 539 Views
  • 1 reply
  • 0 kudos

Troubleshooting AutoLoader

Hi there, I am running into a bit of an issue displaying some AutoLoader readStream data. Can I get some assistance to understand how to properly troubleshoot this? I've looked at logs before, but frankly it's not clear where to look exactly: First, "...

Latest Reply
MartinIsti
Databricks Partner
  • 0 kudos

I'm also working with AutoLoader these days to create an ingestion pattern, and troubleshooting it can be tricky. I wonder if you could pick a single file (whose full path / location / URI you exactly know) and read it without Autoloader, just with spa...
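The single-file check suggested above might look roughly like this, assuming Parquet input; the paths are placeholders:

```python
def read_single_file(spark, path):
    """Read one known file directly, bypassing Auto Loader, to verify that
    the file itself is readable and its schema matches expectations."""
    df = spark.read.format("parquet").load(path)
    df.printSchema()  # compare against the schema Auto Loader inferred
    return df

def autoloader_stream(spark, source_dir, schema_location):
    """The equivalent Auto Loader stream, for side-by-side comparison."""
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "parquet")
        .option("cloudFiles.schemaLocation", schema_location)
        .load(source_dir)
    )
```

If the plain read succeeds but the stream shows nothing, the problem is likely in the stream setup (checkpoint/schema location, file discovery) rather than in the data itself.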

ManojkMohan
by Honored Contributor II
  • 446 Views
  • 1 reply
  • 2 kudos

Resolved! Sample Data Reflecting but Uploaded File Not Reflecting

Step 1: I uploaded a CSV file manually in Databricks.
Step 2: Connector created and active between Salesforce and Databricks.
Step 3: Creating Data Streams in Salesforce Data Cloud.
Sample topics are reflecting, matching between what I see in Databricks ...

Latest Reply
ManojkMohan
Honored Contributor II
  • 2 kudos

I resolved it myself:
Step 1: Workspace --> manage permissions
Step 2: Chose all permissions
Step 3: Went to the raw uploaded file and shared it via Delta Sharing
Step 4: In the Salesforce data stream I got the raw file

Shruti12
by Databricks Partner
  • 2641 Views
  • 2 replies
  • 1 kudos

Does Databricks support updating multiple target rows with a single matching source row in a merge query?

Hi, I am getting this error in a merge statement: DeltaUnsupportedOperationException: Cannot perform Merge as multiple source rows matched and attempted to modify the same target row in the Delta table in possibly conflicting ways. Does Databricks suppor...

Latest Reply
Shruti12
Databricks Partner
  • 1 kudos

Hi @szymon_dybczak, thanks for your reply. The above code is working fine, which means multiple updates can be done from a single source row. So it may be that when there are complex matching conditions/values, the merge query gives an error. I cannot send you...
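For reference, the usual workaround for that DeltaUnsupportedOperationException is to collapse the source to one row per merge key before merging, so no two source rows can target the same target row. A sketch with made-up table and column names (target, source, key1, key2, value, updated_at):

```python
# Hypothetical sketch: deduplicate the source on the merge join columns,
# keeping the newest row per key, before running the MERGE.
DEDUP_MERGE_SQL = """
MERGE INTO target AS t
USING (
  SELECT key1, key2, value, updated_at
  FROM (
    SELECT s.*,
           ROW_NUMBER() OVER (
             PARTITION BY s.key1, s.key2    -- the merge join columns
             ORDER BY s.updated_at DESC     -- keep the newest source row
           ) AS rn
    FROM source AS s
  )
  WHERE rn = 1
) AS s
ON t.key1 = s.key1 AND t.key2 = s.key2
WHEN MATCHED THEN UPDATE SET t.value = s.value, t.updated_at = s.updated_at
WHEN NOT MATCHED THEN INSERT (key1, key2, value, updated_at)
                      VALUES (s.key1, s.key2, s.value, s.updated_at)
"""

def run_merge(spark):
    spark.sql(DEDUP_MERGE_SQL)
```

The reverse case (one source row updating multiple target rows) is allowed; it is only multiple source rows hitting the same target row that Delta rejects.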

1 More Replies
arsamkull
by New Contributor III
  • 8084 Views
  • 6 replies
  • 6 kudos

Usage of Azure DevOps System.AccessToken as PAT in Databricks

Hi there! I'm trying to use an Azure DevOps pipeline to automate the Azure Databricks Repos API. I'm using the following workflow:
- Get an access token for a Databricks service principal using a certificate (which works great)
- Use the REST API to generate Git cre...

Latest Reply
Srihasa_Akepati
Databricks Employee
  • 6 kudos

@Adrian Ehrsam The PAT limit has been increased to 2048 now. Please check.

5 More Replies
filipniziol
by Esteemed Contributor
  • 1391 Views
  • 1 reply
  • 2 kudos

Merge slows down when the table grows with liquid clustering enabled.

Hi everyone, I have a source table, a target table, and a MERGE statement that is inserting/updating records every couple of minutes. The clustering keys are set up to match the 2 merge join columns. I noticed that over time the processing time increase...

Latest Reply
kerem
Contributor
  • 2 kudos

Hi @filipniziol, I dealt with a large table of about a TB in size with liquid clustering enabled. Even with liquid clustering, selects and joins on the clustered columns took longer as the table grew. So I don't think it performs as fast as the table...
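One thing worth checking in this situation: with liquid clustering, files written by the frequent MERGEs are only clustered when OPTIMIZE runs, so a scheduled OPTIMIZE job may keep the merge from degrading as the table grows. A sketch, with a placeholder table name:

```python
# Hypothetical maintenance sketch for a liquid-clustered table that receives
# frequent MERGEs. OPTIMIZE incrementally clusters files written since the
# last run; on Databricks it can be scheduled as its own job.
MAINTENANCE_STATEMENTS = [
    # Incrementally cluster files written since the last OPTIMIZE run.
    "OPTIMIZE my_catalog.my_schema.target_table",
    # Optionally inspect table details (clustering columns, file counts).
    "DESCRIBE DETAIL my_catalog.my_schema.target_table",
]

def run_maintenance(spark):
    for stmt in MAINTENANCE_STATEMENTS:
        spark.sql(stmt)
```

How often to schedule it depends on the write rate; the frequency here is a tuning decision, not a fixed recommendation.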

vamsi_simbus
by Databricks Partner
  • 1704 Views
  • 5 replies
  • 0 kudos

Databricks System Table system.billing.usage Not Capturing Job Data in Real-Time

We’ve observed that the system.billing.usage table in Databricks is not capturing job usage data in real-time. There appears to be a noticeable delay between when jobs are executed and when their corresponding usage records appear in the system table...

Latest Reply
vamsi_simbus
Databricks Partner
  • 0 kudos

Hi @szymon_dybczak, is there any alternative approach to find the DBU usage of currently running jobs?

4 More Replies
malla_aayush
by Databricks Partner
  • 810 Views
  • 2 replies
  • 1 kudos

Resolved! Not able to find lab for Data Engineering Learning Path

I am not able to find the Data Engineering learning path. I opened the Partner Databricks Academy lab, which redirected to Uplimit, where I also enrolled myself in an instructor-led course, but I am not able to see any labs.

Latest Reply
junaid-databrix
New Contributor III
  • 1 kudos

You are right, the self-paced e-learning courses do not include any labs. However, labs are available in the instructor-led courses on Uplimit. I recently enrolled in one and here is how it worked for me: 1. On the Uplimit portal, enroll for an upc...

1 More Replies
susanne
by Databricks Partner
  • 1641 Views
  • 3 replies
  • 0 kudos

Resolved! Authentication failure Lakeflow SQL Server Ingestion

Hi all, I am trying to create a Lakeflow ingestion pipeline for SQL Server, but I am running into the following authentication error when using my Databricks database user for the connection: Gateway is stopping. Authentication failure while obtaining ...

Latest Reply
susanne
Databricks Partner
  • 0 kudos

Hi @szymon_dybczak, thanks a lot, that did the trick!

2 More Replies
Alena
by New Contributor II
  • 714 Views
  • 1 reply
  • 0 kudos

Programmatically set minimum workers for a job cluster based on file size?

I’m running an ingestion pipeline with a Databricks job:A file lands in S3A Lambda is triggeredThe Lambda runs a Databricks jobThe incoming files vary a lot in size, which makes processing times vary as well. My job cluster has autoscaling enabled, b...

Latest Reply
kerem
Contributor
  • 0 kudos

Hi Alena, the Jobs API has update functionality to be able to do that: https://docs.databricks.com/api/workspace/jobs_21/update. If for some reason you can't update your pipeline before you trigger it, you can also consider creating a new job with desired c...
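The Lambda-side logic might look roughly like this. The size thresholds, `job_cluster_key`, host, and token handling are all assumptions, and whether a partial `new_cluster` in jobs/update merges with or fully replaces the existing cluster spec should be verified against the Jobs API docs:

```python
import json
import urllib.request

def pick_min_workers(file_size_bytes):
    """Map incoming file size to a minimum worker count. The thresholds
    here are made-up examples; tune them for your own workload."""
    gib = file_size_bytes / (1024 ** 3)
    if gib < 1:
        return 1
    if gib < 10:
        return 4
    return 8

def update_job_min_workers(host, token, job_id, min_workers, max_workers=16):
    """Patch the job cluster's autoscale range via the Jobs 2.1 update API;
    the run itself is then triggered separately (e.g. from the Lambda)."""
    payload = {
        "job_id": job_id,
        "new_settings": {
            "job_clusters": [{
                "job_cluster_key": "main",  # must match the key in your job spec
                "new_cluster": {
                    "autoscale": {"min_workers": min_workers,
                                  "max_workers": max_workers},
                },
            }],
        },
    }
    req = urllib.request.Request(
        f"{host}/api/2.1/jobs/update",
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

Usage would be: read the file size from the S3 event, call `pick_min_workers`, call `update_job_min_workers`, then trigger the run via jobs/run-now.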

Nick_Pacey
by New Contributor III
  • 908 Views
  • 2 replies
  • 0 kudos

Question on best method to deliver Azure SQL Server data into Databricks Bronze and Silver.

Hi, we have an Azure SQL Server (replicating from an on-prem SQL Server) that is required to be in Databricks Bronze and beyond. This database has 100s of tables that are all required. Size of tables will vary from very small up to the biggest tables 1...

Latest Reply
kerem
Contributor
  • 0 kudos

Hey Nick, have you tried the SQL Server connector with Lakeflow Connect? This should provide a native connection to your SQL Server, potentially allowing for incremental updates and CDC setup. https://learn.microsoft.com/en-us/azure/databricks/ingestion...

1 More Replies
yit
by Databricks Partner
  • 563 Views
  • 1 reply
  • 0 kudos

Unable to Upcast DECIMAL Field in Autoloader

I’m using Autoloader to read Parquet files and write them to a Delta table. I want to enforce a schema in which Column1 is defined as DECIMAL(10,2). However, in the Parquet files being ingested, Column1 is defined as DECIMAL(8,2).When Autoloader read...

Latest Reply
kerem
Contributor
  • 0 kudos

Hi Yit, to potentially simplify your issue, why not read this column as String in your stream and then cast it to DECIMAL(10,2) afterwards? That should eliminate the rescue behaviour. Kerem Durak
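A sketch of that suggestion, with placeholder paths. Whether `cloudFiles.schemaHints` can override a Parquet decimal as a string is worth verifying against the Auto Loader docs; passing a full explicit schema is an alternative:

```python
def stream_with_cast(spark, source_dir, schema_location):
    """Read Column1 as a string in the Auto Loader stream, then cast it to
    DECIMAL(10,2) so the DECIMAL(8,2) source values upcast cleanly instead
    of landing in _rescued_data."""
    from pyspark.sql import functions as F  # local import: sketch only

    df = (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "parquet")
        .option("cloudFiles.schemaLocation", schema_location)
        .option("cloudFiles.schemaHints", "Column1 STRING")  # force string on read
        .load(source_dir)
    )
    return df.withColumn("Column1", F.col("Column1").cast("decimal(10,2)"))
```

The returned stream can then be written to the Delta target as usual with writeStream.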
