Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

oakhill
by New Contributor III
  • 157 Views
  • 2 replies
  • 0 kudos

Unable to read Delta Table using external tools

I am using the new credential vending API to get tokens and URLs for my tables in Unity Catalog. I get the token and URL, and I am able to scan the folder using read_parquet, but NOT with any Delta Lake functions: not TableExists, scan_delta or delta_scan from ...

Latest Reply
Bernard295Clark
New Contributor II
  • 0 kudos

Hello! It sounds like you're encountering issues when trying to read Delta Lake tables using Polars and DuckDB, but not with read_parquet. This could be due to Databricks-specific configurations required for Delta Lake tables. Ensure you're using the ...
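A minimal sketch of passing the vended credential explicitly to an external Delta reader, assuming Polars; table_url and token stand in for the values returned by the credential vending API, and the storage-option key is an assumption that varies by cloud/object store:

import polars as pl

# table_url and token are placeholders for the credential vending
# response; the option key below is an assumption and depends on which
# object store (S3/ADLS/GCS) backs the table.
storage_options = {"bearer_token": token}

df = pl.scan_delta(table_url, storage_options=storage_options).collect()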

1 More Replies
aliacovella
by Contributor
  • 154 Views
  • 3 replies
  • 1 kudos

Resolved! Custom Checkpointing

The following is my scenario: I need to query, on a daily basis, an external table that maintains a row version. I would like to be able to query for all records where the row version is greater than the max previously processed row version. The sour...

Latest Reply
jeremy98
Contributor
  • 1 kudos

Hi, I totally agree with VZLA. Within my internal team we have a similar issue, and we used a table to track the latest version of each table, since we don't have a streaming process on our side. DLT pipelines could be a choice, but it also depends on whether you ...
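A minimal sketch of that watermark-table pattern, assuming a hypothetical ops.watermarks tracking table and a row_version column (all names illustrative):

# Read the last processed row version (assumes the watermark row exists).
last = spark.sql(
    "SELECT max_row_version FROM ops.watermarks WHERE source = 'my_table'"
).first()[0]

# Pull only rows newer than the watermark and append them.
new_rows = spark.read.table("ext.my_table").where(f"row_version > {last}")
new_rows.write.mode("append").saveAsTable("bronze.my_table")

# Advance the watermark to the new maximum.
new_max = new_rows.agg({"row_version": "max"}).first()[0]
if new_max is not None:
    spark.sql(
        f"UPDATE ops.watermarks SET max_row_version = {new_max} "
        "WHERE source = 'my_table'"
    )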

2 More Replies
ashraf1395
by Valued Contributor II
  • 127 Views
  • 3 replies
  • 0 kudos

Databricks Workflow design

I have 7-8 different DLT pipelines which have to run at the same time according to their batch type, i.e. hourly or daily. Right now they are triggered effectively according to their batch type. I want to move to a next stage where I want to clu...

Latest Reply
ashraf1395
Valued Contributor II
  • 0 kudos

Hi @VZLA, I got the idea. There will be a small change in the way we will use it. Since we don't schedule the workflow in Databricks, we trigger it using the API. So I will pass a job parameter along with the trigger, according to the timestamp, wheth...
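A minimal sketch of passing such a parameter when triggering the job over the Jobs REST API (run-now); the workspace host, token and job ID are placeholders:

import requests

# Trigger the job and pass the batch type as a job parameter.
resp = requests.post(
    "https://<workspace-host>/api/2.1/jobs/run-now",
    headers={"Authorization": f"Bearer {token}"},
    json={"job_id": 123, "job_parameters": {"batch_type": "hourly"}},
)
resp.raise_for_status()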

2 More Replies
maddan80
by New Contributor II
  • 166 Views
  • 3 replies
  • 0 kudos

History load from Source and

Hi, as part of our requirement we want to load a huge amount of historical data from the source system into Databricks Bronze and then process it to Gold. We want to use batch read and write so that the historical load is done, and then for the delta o...

Latest Reply
MariuszK
Contributor
  • 0 kudos

I imported 16 TB of data using ADF. In this scenario I'd create a process that extracts data from the source using ADF and then executes the rest of the logic to populate the tables in Gold. For the new data I'd create a separate process using Autoloade...
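A minimal sketch of that split, assuming illustrative paths and table names: a one-off batch load for the history, then Auto Loader (cloudFiles) for the incremental files:

# One-off batch load of the historical extract.
(spark.read.format("parquet")
    .load("abfss://landing@<account>.dfs.core.windows.net/history/")
    .write.mode("overwrite")
    .saveAsTable("bronze.events"))

# Incremental loads with Auto Loader, run per batch.
(spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "parquet")
    .option("cloudFiles.schemaLocation", "/Volumes/ops/meta/schemas/events")
    .load("abfss://landing@<account>.dfs.core.windows.net/incremental/")
    .writeStream
    .option("checkpointLocation", "/Volumes/ops/meta/checkpoints/events")
    .trigger(availableNow=True)
    .toTable("bronze.events"))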

2 More Replies
javiomotero
by New Contributor III
  • 361 Views
  • 4 replies
  • 3 kudos

How to consume Fabric Datawarehouse inside a Databricks notebook

Hello, I'm having a hard time figuring out (and finding the right documentation for) how to connect my Databricks notebook to consume tables from a Fabric data warehouse. I've checked this, but it seems to work only with OneLake, and this, but I'm not ...

Labels: Data Engineering, datawarehouse, fabric
Latest Reply
javiomotero
New Contributor III
  • 3 kudos

Hello, I would like a few more options regarding reading views. Using abfss is fine for reading tables, but I don't know how to load views, which are visible in the SQL endpoint. Is there any alternative for connecting to Fabric and be abl...
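One hedged option for views is to go through the warehouse's SQL analytics endpoint over JDBC (SQL Server protocol) rather than through OneLake paths; a sketch where the endpoint, database, view name and AAD access token are all placeholders:

# Read a Fabric warehouse view via its SQL endpoint.
df = (spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://<endpoint>.datawarehouse.fabric."
                   "microsoft.com:1433;database=<warehouse>;encrypt=true")
    .option("dbtable", "dbo.my_view")
    .option("accessToken", access_token)  # assumption: AAD token auth
    .load())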

3 More Replies
Avinash_Narala
by Valued Contributor II
  • 229 Views
  • 3 replies
  • 4 kudos

Redshift to Databricks Migration

Hi, I want a detailed, step-by-step plan to migrate my data from Redshift to Databricks: where to start, what to assess, and what to migrate. It would really help if you could provide a detailed explanation of the migration. Thanks in advance.

Latest Reply
MariuszK
Contributor
  • 4 kudos

I migrated Oracle to Databricks and have experience with Redshift. The cost and effort will depend on your technical stack:
- What do you use for ETL?
- What do you use for data ingestion?
- Reporting tools?
In general, the simplest steps are: data and mo...
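For the data itself, a minimal sketch of copying one table with the Redshift connector available in Databricks Runtime; the cluster endpoint, credentials and S3 temp dir are placeholders:

# Copy a single Redshift table into a Delta table.
df = (spark.read.format("redshift")
    .option("url", "jdbc:redshift://<cluster>:5439/<db>?user=<u>&password=<p>")
    .option("dbtable", "public.orders")
    .option("tempdir", "s3a://<bucket>/redshift-tmp/")
    .option("forward_spark_s3_credentials", "true")
    .load())

df.write.mode("overwrite").saveAsTable("bronze.orders")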

2 More Replies
ahen
by New Contributor
  • 378 Views
  • 1 reply
  • 0 kudos

Deployed DABs job via GitLab CI/CD. It is creating duplicate jobs.

We had an error in a DABs deploy, and subsequent retries resulted in a locked state. As suggested in the logs, we used the --force-lock option and the deploy succeeded. However, it created duplicate jobs for all assets in the bundle instead of updating the...

Latest Reply
Satyadeepak
Databricks Employee
  • 0 kudos

@ahen When you used the --force-lock option during the Databricks Asset Bundle (DAB) deployment, it likely bypassed certain checks that would normally prevent duplicate resource creation. This option is used to force a deployment even when a lock is ...

shubham_007
by Contributor II
  • 855 Views
  • 6 replies
  • 0 kudos

Resolved! Need urgent help and guidance on information/details with reference links on the topics below:

Dear experts, I need urgent help and guidance (with reference links) on the topics below:
- Steps for package installation with serverless compute in Databricks.
- What are Delta Lake connectors with serverless? How to run Delta Lake queries outside...

Latest Reply
brockb
Databricks Employee
  • 0 kudos

Were you able to review the documentation provided here: https://docs.databricks.com/en/compute/serverless/dependencies.html#install-notebook-dependencies?
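For the package-installation part, a minimal sketch of the notebook-scoped approach those docs describe (the package name is illustrative; this runs in a serverless notebook cell):

# Install the dependency for this notebook, then restart Python so
# the newly installed package becomes importable.
%pip install deltalake

dbutils.library.restartPython()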

5 More Replies
mrkure
by New Contributor II
  • 137 Views
  • 2 replies
  • 0 kudos

Databricks connect, set spark config

Hi, I am using Databricks Connect to compute with a Databricks cluster. I need to set some Spark configurations, namely spark.files.ignoreCorruptFiles. As I have experienced, setting a Spark configuration in Databricks Connect for the current session has...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Have you tried setting it up in your code as:

from pyspark.sql import SparkSession

# Create a Spark session
spark = (SparkSession.builder
    .appName("YourAppName")
    .config("spark.files.ignoreCorruptFiles", "true")
    .getOrCreate())

# Yo...
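As a hedged alternative under Databricks Connect specifically, the SQL-level variant of this setting can usually be changed on an existing session at runtime (assuming the corrupt files are hit through the SQL/DataFrame reader path):

from databricks.connect import DatabricksSession

# Set the runtime-configurable SQL variant on the current session.
spark = DatabricksSession.builder.getOrCreate()
spark.conf.set("spark.sql.files.ignoreCorruptFiles", "true")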

1 More Replies
AlexSantiago
by New Contributor II
  • 3508 Views
  • 16 replies
  • 4 kudos

spotify API get token - raw_input was called, but this frontend does not support input requests.

Hello everyone, I'm trying to use Spotify's API to analyse my music data, but I'm receiving an error during authentication, specifically when I try to get the token; my code is below. Is it a Databricks bug?

pip install spotipy
from spotipy.oauth2 import SpotifyO...

Latest Reply
markjohn235
New Contributor II
  • 4 kudos

Hi there, it seems you're facing a common issue with Spotify API authentication, where interactive input is not supported in your current environment (like Databricks or other cloud platforms). A possible workaround is to use the Spotify Client ...
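A minimal sketch of that non-interactive Client Credentials flow with spotipy: there is no browser redirect, so nothing ever calls raw_input. The client ID/secret are placeholders, and note this flow only reaches public (non-user-scoped) endpoints:

import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

# App-only authentication: no interactive prompt is needed.
auth = SpotifyClientCredentials(
    client_id="<client-id>", client_secret="<client-secret>"
)
sp = spotipy.Spotify(auth_manager=auth)

results = sp.search(q="artist:Radiohead", type="track", limit=5)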

15 More Replies
Buranapat
by New Contributor II
  • 810 Views
  • 4 replies
  • 4 kudos

Error when accessing 'num_inserted_rows' in Spark SQL (DBR 15.4 LTS)

Hello Databricks Community, I've encountered an issue while trying to capture the number of rows inserted after executing a SQL insert statement in Databricks (DBR 15.4 LTS). My code is attempting to access the number of inserted rows as follows: row...

Latest Reply
GeorgeP1
New Contributor II
  • 4 kudos

Hi, we are experiencing the same issue. We also turned on liquid clustering on the table, and we had additional checks on the inserted data information, which was really helpful. @GavinReeves3, did you manage to solve the issue? @MuthuLakshmi, any idea? Thank ...
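While this is unresolved, a hedged defensive sketch: read the metrics DataFrame returned by the INSERT and fall back when the num_inserted_rows column is missing (whether that column is present is exactly the assumption at issue in this thread; table names are illustrative):

# Run the insert and inspect whatever metric columns come back.
res = spark.sql("INSERT INTO tgt_table SELECT * FROM src_table")

if "num_inserted_rows" in res.columns:
    inserted = res.first()["num_inserted_rows"]
elif "num_affected_rows" in res.columns:
    inserted = res.first()["num_affected_rows"]
else:
    inserted = None  # metrics not exposed for this statement/DBR

print(f"rows inserted: {inserted}")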

3 More Replies
zg
by New Contributor II
  • 190 Views
  • 4 replies
  • 3 kudos

Resolved! Unable to Create Alert Using API

Hi all, I'm trying to create an alert using the Databricks REST API, but I keep encountering the following error: Error creating alert: 400 {"message": "Alert name cannot be empty or whitespace"}. Payload: {"alert": {"seconds_to_retrigger": 0, "display_name": "A...

Latest Reply
filipniziol
Contributor III
  • 3 kudos

Hi @zg, you are sending the payload for the new endpoint (/api/2.0/sql/alerts) to the old endpoint (/api/2.0/preview/sql/alerts). These are the docs for the old endpoint: https://docs.databricks.com/api/workspace/alertslegacy/create. As you can see ...
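A minimal sketch of posting the new-style payload to the new endpoint; the host, token and query ID are placeholders, and the condition block is an assumption, so verify it against the current API reference:

import requests

payload = {
    "display_name": "Alert name",
    "query_id": "<query-id>",
    # Assumed condition shape; check the alerts API docs for the schema.
    "condition": {
        "op": "GREATER_THAN",
        "operand": {"column": {"name": "value"}},
        "threshold": {"value": {"double_value": 100}},
    },
}

resp = requests.post(
    "https://<workspace-host>/api/2.0/sql/alerts",
    headers={"Authorization": f"Bearer {token}"},
    json=payload,
)
resp.raise_for_status()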

3 More Replies
chad_woodhead
by New Contributor
  • 2088 Views
  • 4 replies
  • 0 kudos

Unity Catalog is missing column in Catalog Explorer

I have just altered one of my tables and added a column:

ALTER TABLE tpch.customer ADD COLUMN C_CUSTDETAILS struct<key:string, another_key:string, boolean_key:boolean, extra_key:string, int_key:long, nested_object:struct<more:long, arrayOne:array<string>>>

A...

Latest Reply
Lakshay
Databricks Employee
  • 0 kudos

Can you share the steps for reproducing this issue? If it is reproducible, I can investigate.

3 More Replies
Mattias
by New Contributor II
  • 223 Views
  • 3 replies
  • 0 kudos

How to increase timeout in Databricks Workflows DBT task

Hi, I have a Databricks Workflows DBT task that targets a PRO SQL warehouse. However, the task fails with a "too many retries" error (see below) if the PRO SQL warehouse is not up and running when the task starts. How can I increase the timeout or allo...

Latest Reply
Mattias
New Contributor II
  • 0 kudos

One option seems to be to reference a custom "profiles.yml" in the job configuration and specify a custom dbt-databricks connector timeout there (https://docs.getdbt.com/docs/core/connect-data-platform/databricks-setup#additional-parameters). However,...

2 More Replies
Mkk1
by New Contributor
  • 1108 Views
  • 1 reply
  • 0 kudos

Joining tables across DLT pipelines

How can I join a silver table (S1) from a DLT pipeline (D1) to another silver table (S2) from a different DLT pipeline (D2)? #DLT #DeltaLiveTables

Latest Reply
JothyGanesan
New Contributor II
  • 0 kudos

@Mkk1, did you manage to get this completed? We are in a similar situation; how did you achieve it?
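In case it helps, a hedged sketch of one common approach: a table materialized by one DLT pipeline into the catalog can be read from another pipeline as an ordinary catalog table (all names are illustrative):

import dlt

@dlt.table
def s1_enriched():
    s1 = dlt.read("s1")                     # defined in this pipeline (D1)
    s2 = spark.read.table("cat.schema.s2")  # materialized by pipeline D2
    return s1.join(s2, "id")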

