Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

EricCournarie
by New Contributor III
  • 34 Views
  • 6 replies
  • 9 kudos

ResultSet metadata does not return correct type for TIMESTAMP_NTZ

Hello, using the JDBC driver, when I retrieve the metadata of a ResultSet, the type for a TIMESTAMP_NTZ is not correct (it's reported as TIMESTAMP). My SQL is a simple SELECT * on a table where you have a TIMESTAMP_NTZ column. This works when retrieving metad...

Latest Reply
EricCournarie
New Contributor III
  • 9 kudos

Hi, thanks for the response! OK, as it was working on table metadata, I thought the doc was not up to date, so it's partially supported. Do you know if there is any chance it will be fully supported in the 'near' future? Thanks

5 More Replies
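A possible workaround for the TIMESTAMP_NTZ metadata issue above, sketched in Python: since the declared column type is stored in information_schema.columns, the driver-reported type can be cross-checked against it. This is an illustrative sketch, not the driver's fix; the helper names and the table names in the query are assumptions.

```python
# Hypothetical workaround sketch: correct JDBC-reported types using
# information_schema.columns, which stores the declared column type.
# Helper names and identifiers are illustrative, not from the thread.

def information_schema_query(catalog: str, schema: str, table: str) -> str:
    """Build the lookup query for declared column types."""
    return (
        "SELECT column_name, full_data_type "
        f"FROM {catalog}.information_schema.columns "
        f"WHERE table_schema = '{schema}' AND table_name = '{table}'"
    )

def corrected_type(jdbc_type_name: str, declared_type: str) -> str:
    """Prefer the declared type when the driver collapses NTZ to TIMESTAMP."""
    if jdbc_type_name == "TIMESTAMP" and declared_type.upper() == "TIMESTAMP_NTZ":
        return "TIMESTAMP_NTZ"
    return jdbc_type_name

print(corrected_type("TIMESTAMP", "timestamp_ntz"))  # TIMESTAMP_NTZ
print(corrected_type("TIMESTAMP", "timestamp"))      # TIMESTAMP
```

The same lookup can be cached per table so the extra round trip happens once, not per ResultSet.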
gzr58l
by Visitor
  • 26 Views
  • 1 reply
  • 0 kudos

How to set up the Lakeflow HTTP connector with M2M authentication

I am getting the following error about content-type, with no option to pick a different content-type, when configuring the Lakeflow connector: The OAuth token exchange failed with HTTP status code 415 Unsupported Media Type. The returned server response ...

Latest Reply
saurabh18cs
Honored Contributor II
  • 0 kudos

Hi @gzr58l, are you configuring a custom Lakeflow connector or an external connection in Databricks? Also, consider using a service principal or personal access token (PAT) for authentication as a temporary workaround.

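On the 415 above: that status from an OAuth token endpoint usually means the request body was not sent as application/x-www-form-urlencoded (e.g. it went out as JSON). A minimal sketch of a Databricks M2M client-credentials token request, assuming the documented /oidc/v1/token endpoint; the host and credentials are placeholders.

```python
import base64
from urllib.parse import urlencode

# Sketch of a Databricks M2M (client-credentials) token request.
# A 415 from the token endpoint usually means the body was not sent as
# application/x-www-form-urlencoded. Host and credentials are placeholders.

def build_token_request(host: str, client_id: str, client_secret: str) -> dict:
    creds = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    return {
        "url": f"{host}/oidc/v1/token",
        "headers": {
            "Authorization": f"Basic {creds}",
            # Token endpoints commonly reject JSON bodies with 415; form-encode instead.
            "Content-Type": "application/x-www-form-urlencoded",
        },
        "data": urlencode({"grant_type": "client_credentials", "scope": "all-apis"}),
    }

req = build_token_request("https://example.cloud.databricks.com", "my-sp-id", "my-secret")
# Send with e.g. requests.post(req["url"], headers=req["headers"], data=req["data"])
print(req["headers"]["Content-Type"])  # application/x-www-form-urlencoded
```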
data-grassroots
by New Contributor III
  • 34 Views
  • 2 replies
  • 0 kudos

ExcelWriter and local files

I have a couple of things going on here. First, to explain what I'm doing: I'm passing an array of objects into a function, each containing a dataframe per item. I want to write those dataframes to an Excel workbook - one dataframe per worksheet. That part ...

Latest Reply
data-grassroots
New Contributor III
  • 0 kudos

Here's a pretty easy way to recreate the issue - simplified to ignore the ExcelWriter part. You can see the file is copied and shows up when listed, but it can't be found from Pandas. Same behavior on local_disk0 and /tmp.

1 More Replies
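The "listed by dbutils but invisible to Pandas" symptom above is often a path-scheme mismatch: dbutils.fs works with dbfs:/ URIs, while pandas and other local-file libraries need the /dbfs/ fuse mount (where available). A small sketch, assuming classic clusters with the fuse mount; the paths are illustrative.

```python
# Sketch: dbutils.fs addresses files as "dbfs:/..." while pandas needs the
# local "/dbfs/..." fuse path. On serverless or local_disk0, the driver's
# /tmp is a different filesystem from dbfs:/tmp entirely. Paths illustrative.

def to_local_path(path: str) -> str:
    """Convert a DBFS URI to the local fuse path pandas can open."""
    if path.startswith("dbfs:/"):
        return "/dbfs/" + path[len("dbfs:/"):].lstrip("/")
    return path

print(to_local_path("dbfs:/tmp/report.xlsx"))  # /dbfs/tmp/report.xlsx
# pd.ExcelWriter(to_local_path("dbfs:/tmp/report.xlsx")) would then resolve,
# assuming the fuse mount exists on the cluster type in use.
```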
JeffSeaman
by New Contributor II
  • 404 Views
  • 8 replies
  • 1 kudos

Resolved! JDBC error when trying a getSchemas call.

Hi Community, I have a free demo version and can create a JDBC connection and get metadata (schema, table, and column structure info). Everything works as described in the docs, but when working with someone who has a paid version of Databricks, the s...

Latest Reply
BigRoux
Databricks Employee
  • 1 kudos

@JeffSeaman , please let us know if any of my suggestions help get you on the right track. If they do, kindly mark the post as "Accepted Solution" so others can benefit as well. Cheers, Louis.

7 More Replies
jakesippy
by New Contributor II
  • 355 Views
  • 7 replies
  • 14 kudos

Resolved! How to get pipeline update duration programmatically

I'm looking to track how much time is being spent running updates for my DLT pipelines. When querying the list pipeline updates REST API endpoint, I can see start and end times being returned; however, these fields are not listed in the documentation. ...

Latest Reply
jakesippy
New Contributor II
  • 14 kudos

Originally went with the approach of exporting to and reading from the event log table, which has been helpful for getting other metrics as well. Also found today that there is a new system table in public preview which exposes the durations I was ...

6 More Replies
VIRALKUMAR
by Contributor II
  • 8776 Views
  • 5 replies
  • 0 kudos

How to Determine the Cost for Each Query Run Against SQL Warehouse Serverless?

Hello everyone. First of all, I would like to thank Databricks for enabling system tables for customers. It helps a lot. I am working on cost optimization, particularly SQL warehouse serverless. I am not sure whether all of you have tried the system...

Latest Reply
skumarraj
Visitor
  • 0 kudos

Can you share the query that you used?

4 More Replies
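One common approach to the per-query cost question above: system tables bill at the warehouse level, so a per-query figure has to be an apportionment, e.g. by each query's share of total execution time from the query history. A sketch of that arithmetic; the DBU total and rate are illustrative, and in practice they would come from system.billing.usage and your account's pricing.

```python
# Sketch: apportion a serverless warehouse's DBU cost to individual queries
# by their share of total execution time. Numbers are illustrative; the real
# inputs come from system.billing.usage and the query history system table.

def apportion_cost(query_runtimes_s: dict, total_dbus: float, dbu_rate: float) -> dict:
    """Split (total_dbus * dbu_rate) across queries proportionally to runtime."""
    total_s = sum(query_runtimes_s.values())
    total_cost = total_dbus * dbu_rate
    return {qid: round(total_cost * t / total_s, 4) for qid, t in query_runtimes_s.items()}

costs = apportion_cost({"q1": 30.0, "q2": 90.0}, total_dbus=12.0, dbu_rate=0.70)
print(costs)  # {'q1': 2.1, 'q2': 6.3}
```

Runtime share is only one possible allocation key; bytes scanned or slot-seconds would give different (and arguably fairer) splits for concurrent workloads.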
Gvnreddy
by Visitor
  • 52 Views
  • 2 replies
  • 2 kudos

Need help learning Scala

Hi enthusiasts, I recently joined a company where Databricks notebooks were previously developed in Scala. I have worked with PySpark, which was very easy for me. I have 3 years of experience in DE, and I need help to wh...

Latest Reply
nayan_wylde
Honored Contributor III
  • 2 kudos

If you want some hands-on practice with basic Scala, I would recommend this course: https://www.coursera.org/learn/packt-apache-spark-with-scala-hands-on-with-big-data-hilnz

1 More Replies
iskidet
by New Contributor
  • 40 Views
  • 1 reply
  • 1 kudos

Declarative Pipeline Failure for Autoloader

Hello folks, after moving my working serverless Auto Loader notebook to a declarative (DLT) pipeline, I'm getting an AccessDenied error. What could be causing this? Here are the DLT JSON and the error message from the DLT. I googled around and saw some hint...

Latest Reply
nayan_wylde
Honored Contributor III
  • 1 kudos

Seems to me like the SPN or individual running the DLT pipeline needs access to the external location and the main catalog: GRANT ALL PRIVILEGES ON CATALOG <catalog_name> TO `<external_account_identifier>`;

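Expanding on the reply above: a blanket GRANT ALL on the catalog works, but a narrower set of grants is usually enough for an Auto Loader pipeline that reads an external location and writes managed tables. A sketch that just assembles the statements (run each via spark.sql or a SQL editor); the principal, catalog, and location names are placeholders, and the exact privilege set your pipeline needs may differ.

```python
# Sketch: typical grants for a DLT/Auto Loader pipeline's run-as identity.
# Names are placeholders; execute each statement via spark.sql(...) or SQL.

def access_grants(principal: str, catalog: str, ext_location: str) -> list[str]:
    return [
        # Read the landing files referenced by Auto Loader.
        f"GRANT READ FILES ON EXTERNAL LOCATION `{ext_location}` TO `{principal}`;",
        # See and use the target catalog.
        f"GRANT USE CATALOG ON CATALOG {catalog} TO `{principal}`;",
        # Create/write the pipeline's tables inside it.
        f"GRANT USE SCHEMA, CREATE TABLE ON CATALOG {catalog} TO `{principal}`;",
    ]

for stmt in access_grants("my-spn", "main", "raw_landing"):
    print(stmt)
```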
aranjan99
by Contributor
  • 159 Views
  • 4 replies
  • 2 kudos

Databricks Pipeline SDK missing fields

Looking at the Databricks Java SDK for pipeline events, I see that the REST API returns a details field that has the same information as the event log details, but this is not surfaced in the SDK. It should be a small change to add it. Is that something which c...

Latest Reply
ManojkMohan
Honored Contributor
  • 2 kudos

The start and end time fields in the Pipeline Updates API are currently present in the Databricks REST API but are not yet supported (i.e., not included or mapped) in the Databricks Java SDK as of September 2025. This means: you can see these fields (s...

3 More Replies
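Until the SDK maps those fields, one option is to call the REST endpoint directly and read the unmapped parts out of the raw JSON. A sketch of the parsing side; the payload below is a minimal made-up example shaped like the thread describes, not real API output.

```python
import json

# Sketch: read fields the SDK does not yet map (e.g. the per-event `details`
# blob) straight from the raw REST response. Payload is illustrative only.

raw = json.dumps({
    "events": [
        {"id": "e1", "event_type": "flow_progress",
         "details": {"flow_progress": {"status": "COMPLETED"}}},
        {"id": "e2", "event_type": "update_progress"},  # no details present
    ]
})

def event_details(payload: str) -> list[dict]:
    """Extract the details field from each event, defaulting to empty."""
    return [e.get("details", {}) for e in json.loads(payload).get("events", [])]

print(event_details(raw))  # [{'flow_progress': {'status': 'COMPLETED'}}, {}]
```

The same pattern works for any response field the typed SDK classes drop: fetch with a plain HTTP client, keep the JSON, and parse the extras yourself.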
manish24101981
by New Contributor
  • 1487 Views
  • 1 replies
  • 1 kudos

DLT or Databricks for CDC and NRT

We are currently delivering a large-scale healthcare data migration project involving: a one-time historical migration of approx. 80 TB of data, already completed and loaded into Delta Lake. CDC merge logic is already developed and validated using Apache...

Latest Reply
mark_ott
Databricks Employee
  • 1 kudos

For cost-sensitive, large-scale healthcare data streaming scenarios, using Delta Live Tables (DLT) for both CDC and streaming (Option C) is generally the most scalable, manageable, and cost-optimized approach. DLT offers native support for structured...

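For readers weighing the DLT-for-CDC recommendation above, this is the shape of the merge logic that DLT's change-data-capture support automates (ordering by a sequence column, upserts keyed on an ID, explicit deletes). A pure-Python stand-in for illustration only; it is not DLT code, and the op/seq field names are made up.

```python
# Sketch: the SCD Type 1 merge semantics that DLT's CDC support automates.
# Pure-Python stand-in; "op" and "seq" are illustrative CDC feed fields.

def apply_cdc(target: dict, changes: list[dict], key: str = "id") -> dict:
    """Apply ordered CDC rows (upserts and deletes) to a keyed target."""
    for row in sorted(changes, key=lambda r: r["seq"]):  # honor event order
        if row.get("op") == "DELETE":
            target.pop(row[key], None)
        else:  # upsert: last write per key wins
            target[row[key]] = {k: v for k, v in row.items() if k not in ("op", "seq")}
    return target

state = apply_cdc(
    {"1": {"id": "1", "name": "old"}},
    [{"id": "1", "name": "new", "op": "UPSERT", "seq": 1},
     {"id": "2", "name": "b", "op": "UPSERT", "seq": 2},
     {"id": "2", "op": "DELETE", "seq": 3}],
)
print(state)  # {'1': {'id': '1', 'name': 'new'}}
```

In DLT the equivalent is declared, not hand-written: you point the CDC API at the source stream, the key, and the sequencing column, and the engine handles ordering, deletes, and out-of-order events.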
allancruz
by New Contributor
  • 1466 Views
  • 1 reply
  • 0 kudos

Embedding Dashboards on Databricks Apps

Hi Team, I recently tried the Hello World template and embedded the <iframe> from the dashboard that I created. It worked fine before I added some code for a Login Form (I used Dash Plotly to create the Login Form) before the dashboard a...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

In Databricks, I recently tried the Hello World template and embedded the <iframe> from the dashboard that I created. It worked fine before I added some code for a Login Form (I used Dash Plotly to create the Login Form) before the...

databricksdata
by New Contributor
  • 1354 Views
  • 1 reply
  • 0 kudos

Assistance Required with Auto Liquid Clustering Implementation Challenges

Hi Databricks Team, we are currently implementing Auto Liquid Clustering (ALC) on our Delta tables as part of our data optimization efforts. During this process, we have encountered several challenges and would appreciate your guidance on best practic...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

To implement Auto Liquid Clustering (ALC) on Delta tables in Databricks, especially when transitioning from external partitioned tables to unpartitioned managed tables, a careful and ordered process is crucial to avoid data duplication and ensure con...

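The ordered process mentioned in the reply above can be sketched as a sequence of SQL statements: create the managed table with automatic clustering, verify counts before cutover, then compact. The helper just assembles the statements (run each via spark.sql); the table names are placeholders, and your migration may need extra steps (grants, view repointing, dropping the old external table).

```python
# Sketch: an ordered move from an external partitioned table to a managed
# table with automatic liquid clustering. Names are placeholders; each
# statement would run via spark.sql(...).

def alc_migration_steps(src: str, dst: str) -> list[str]:
    return [
        # 1. Create the managed table with automatic clustering enabled.
        f"CREATE TABLE {dst} CLUSTER BY AUTO AS SELECT * FROM {src}",
        # 2. Verify row counts match before switching consumers over.
        f"SELECT (SELECT COUNT(*) FROM {src}) = (SELECT COUNT(*) FROM {dst}) AS counts_match",
        # 3. Compact so clustering takes effect on the copied data.
        f"OPTIMIZE {dst}",
    ]

for stmt in alc_migration_steps("legacy.ext_sales", "main.sales"):
    print(stmt)
```

Doing the copy once into a brand-new table (rather than writing into a live one) is what avoids the data-duplication risk the question raises.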
saicharandeepb
by New Contributor II
  • 1593 Views
  • 1 reply
  • 0 kudos

Understanding High I/O Wait Despite High CPU Utilization in system.compute Metrics

Hi everyone, I'm working on building a hardware metrics dashboard using the system.compute schema in Databricks, specifically leveraging the cluster, node_type, and node_timeline tables. While analyzing the data, I came across something that seems cont...

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

Your observation highlights a subtlety in interpreting CPU metrics, especially in distributed environments like Databricks, where cluster- and node-level behaviors can diverge from typical single-server intuition. Direct answer: no, seeing both high cp...

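A small numeric sketch of the subtlety discussed above: node_timeline exposes user, system, and wait percentages as separate columns, and a dashboard that sums them into one "utilization" figure conflates doing work with waiting for disk. The column naming below reflects my understanding of the schema and should be checked against the docs; the sample percentages are made up.

```python
# Sketch: why "high CPU" and "high I/O wait" can coexist. In
# system.compute.node_timeline the user/system/wait CPU percentages are
# separate columns; summing them into one utilization number is misleading.
# Sample values are illustrative.

def cpu_breakdown(user: float, system: float, iowait: float, idle: float) -> dict:
    assert abs(user + system + iowait + idle - 100.0) < 1e-6  # shares must total 100%
    return {
        "busy": user + system,          # actually executing instructions
        "stalled_on_io": iowait,        # runnable but blocked on I/O
        "idle": idle,
        "naive_utilization": user + system + iowait,  # the misleading sum
    }

print(cpu_breakdown(user=55.0, system=15.0, iowait=25.0, idle=5.0))
# {'busy': 70.0, 'stalled_on_io': 25.0, 'idle': 5.0, 'naive_utilization': 95.0}
```

Plotting busy and stalled_on_io as separate series makes the contradiction disappear: the node can be 70% busy and still spend a quarter of its time waiting on storage.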
piotrsofts
by New Contributor III
  • 1274 Views
  • 1 reply
  • 0 kudos

LakeFlow Connect -> GA4 - creation of Liquid Clustered stream table

Hello, while creating a new data ingestion from GA4, can we set up Liquid Clustering (either manual or automatic) on the destination table which will contain the fetched data from GA4?

Latest Reply
mark_ott
Databricks Employee
  • 0 kudos

Yes, in Databricks, it is possible to set up Liquid Clustering—both manual and automatic—on destination tables that store data ingested from Google Analytics 4 (GA4). This feature significantly improves table management and query performance compared...

Yogesh_Verma_
by Contributor
  • 42 Views
  • 0 replies
  • 1 kudos

Real-Time Mode in Apache Spark Structured Streaming

Apache Spark™ Structured Streaming has been the backbone of mission-critical pipelines for years, from ETL to near real-time analytics and machine learning. Now, Databricks has introduced something game-changing: Real...

