Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

shan-databricks
by Databricks Partner
  • 55 Views
  • 1 reply
  • 2 kudos

Ingestion Gateway DDL Objects Missing - Lakeflow Connect

Facing the issue below and need a solution to proceed further. Category: Error. Message: DDL objects missing on table 'DB.dbo.client'. Execute the DDL objects script and full refresh the table on the Ingestion Pipeline. Error message: 'Reason:- Catalog is no...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 2 kudos

Hi @shan-databricks, The INGESTION_GATEWAY_DDL_OBJECTS_MISSING error means that while CDC is enabled on DB.dbo.client, the Lakeflow-specific DDL support objects (triggers, stored procedures) that allow Databricks to track schema changes (DDL events l...

NageshPatil
by New Contributor III
  • 759 Views
  • 5 replies
  • 1 kudos

Resolved! Lakeflow partial data ingestion for first load

Hi Team, I am ingesting 10 tables from Azure SQL through Lakeflow Connect. I have created the gateway and ingestion pipelines using the Databricks SDK. I start the ingestion pipeline only when the gateway is in Running status with resources. I observed...

Latest Reply
NageshPatil
New Contributor III
  • 1 kudos

Hi, I finally found a solution that works smoothly to capture the full snapshot on the initial run. Here is the step-by-step approach I implemented: Create a Status Check Function: I wrote a custom function that queries the event_log for a given Pipelin...

4 More Replies
Bank_Kirati
by New Contributor III
  • 134 Views
  • 2 replies
  • 0 kudos

Cross-region S3 reads suddenly fail with 400 Bad Request — eu-west-1 metastore to af-south-1 bucket

What changed: A production daily job that has worked unchanged for ~8 months started failing on 2026-05-18 ~23:46 UTC. The notebook does a plain spark.read.json("s3://BUCKET/...") against a bucket in af-south-1. The metastore is in eu-west-1. Same code...

Latest Reply
sameer_yasser
New Contributor
  • 0 kudos

Your debugging is really thorough and you've already done the hard work of isolating this. The 400 with an empty body (no proper S3 error code like InvalidArgument) on an opt-in region is almost always one thing: SigV4 signing region mismatch. af-sou...
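As a rough illustration of what a "signing region mismatch" means (this does not confirm the cause in this thread): a SigV4 signature embeds a credential scope of the form date/region/service/aws4_request, so a request signed for one region carries a different scope from the one the bucket's region expects. The helper name credential_scope below is hypothetical, and the sketch only builds the scope strings; it does not sign anything or call AWS.

```python
from datetime import datetime, timezone


def credential_scope(region: str, when: datetime, service: str = "s3") -> str:
    """Build a SigV4 credential scope: <yyyymmdd>/<region>/<service>/aws4_request."""
    return f"{when.strftime('%Y%m%d')}/{region}/{service}/aws4_request"


# Regions and timestamp taken from the post; everything else is illustrative.
bucket_region = "af-south-1"
metastore_region = "eu-west-1"
request_time = datetime(2026, 5, 18, 23, 46, tzinfo=timezone.utc)

expected_scope = credential_scope(bucket_region, request_time)
mismatched_scope = credential_scope(metastore_region, request_time)

# If the client signs for the wrong region, the scopes differ and the
# signature cannot validate against the bucket's regional endpoint.
scopes_match = expected_scope == mismatched_scope

print(expected_scope)
print(mismatched_scope)
print(scopes_match)
```

Comparing the region inside the Authorization header's Credential field with the bucket's actual region is one way to check whether this applies to a failing request.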

1 More Replies
plankton
by New Contributor
  • 627 Views
  • 11 replies
  • 6 kudos

Resolved! R plots not rendering

Has anyone been experiencing the issue of R plots not rendering in notebooks, starting a few days ago? It's not related to sparklyr or plotly, or specific data types, or anything. For example in base R: plot(1:3, 5:7) calculates without error, but does ...

Latest Reply
plankton
New Contributor
  • 6 kudos

Looks like the issue has been resolved. Thanks everyone for chiming in and thanks 'bricks for whatever you did to resolve this. Plankton out!

10 More Replies
seefoods
by Valued Contributor
  • 375 Views
  • 1 reply
  • 1 kudos

DQX - datacontract cli

Hello guys, can I combine DQX Databricks rule checks with the datacontract CLI? If yes, can you share your ideas? https://gpt.datacontract.com/sources/cli.datacontract.com/ Cordially,

Latest Reply
Ashwin_DSA
Databricks Employee
  • 1 kudos

Hi @seefoods, Just came across this post. In case you are still looking for an answer, I see these as complementary rather than overlapping tools. A practical approach would be to keep the data contract as the source of truth in datacontract.yaml, us...

IM_01
by Contributor III
  • 269 Views
  • 4 replies
  • 2 kudos

Lakeflow SDP partition error

Hi, I was trying to log an exception in Lakeflow SDP. First, I create an empty streaming DataFrame in case of an exception and write a log entry to an audit table, as shown below: try: raise Exception("testexception") return df except Exception as e: df=...

Latest Reply
IM_01
Contributor III
  • 2 kudos

Hi Amira, As the flows run in parallel, a file-based logger might throw an exception, so I was thinking of logging to a table instead, as I do not want an exception in any one flow to fail the entire pipeline.

3 More Replies
Shivaprasad
by Contributor
  • 162 Views
  • 4 replies
  • 1 kudos

Resolved! Can we able to create materialized view in databricks using all purpose cluster

I was unable to create a materialized view in Databricks using an all-purpose cluster. I wanted to check whether we need a serverless cluster to create an MV.

Latest Reply
Ashwin_DSA
Databricks Employee
  • 1 kudos

Hi @Shivaprasad, You generally should not create a standalone materialised view from an all-purpose cluster. Databricks documents that CREATE MATERIALIZED VIEW is supported from a Pro or Serverless SQL warehouse, or within a pipeline. For standalone ...

3 More Replies
batch_bender
by New Contributor II
  • 150 Views
  • 3 replies
  • 0 kudos

Resolved! Does liquid clustering preserve auditable tenant separation in a shared Delta table architecture?

We’re evaluating a multi-tenant Databricks architecture and considering Liquid Clustering on shared Delta tables. Our concern is that tenant SLAs require data separation for audit/compliance purposes. I’m trying to understand whether Liquid Clusterin...

Latest Reply
Ashwin_DSA
Databricks Employee
  • 0 kudos

Hi @batch_bender, I think the key distinction is between data layout for performance and isolation as a control boundary. My view is that Liquid Clustering should not be presented as a tenant-isolation mechanism. The official docs describe it as a da...

2 More Replies
ha2hi
by New Contributor
  • 162 Views
  • 2 replies
  • 3 kudos

Resolved! [Auto Loader] Inquiry regarding Checkpoint files

Hi, I am currently using Auto Loader to load files stored in the cloud into Databricks tables. I understand that checkpoint files are continuously generated during this process. I have a couple of questions regarding these files: Do these checkpoint fil...

Latest Reply
balajij8
Contributor III
  • 3 kudos

Never delete or alter files inside a checkpoint directory manually, as doing so will corrupt the Auto Loader streams. Auto Loader keeps track of discovered files in the checkpoint location using RocksDB to provide exactly-once ingestion guarantees. You can u...

1 More Replies
maikel
by Contributor III
  • 259 Views
  • 3 replies
  • 1 kudos

Resolved! Job tasks monitoring

Hello Community, We have a case in our project that we would like to solve in an elegant and scalable manner. As always, I would really appreciate your suggestions and experience. In short: We have a multi-step job consisting of 4 stages. In one of the ...

Latest Reply
maikel
Contributor III
  • 1 kudos

@MoJaMa thanks a lot for these suggestions!

2 More Replies
ccsalt
by New Contributor II
  • 237 Views
  • 3 replies
  • 0 kudos

Inconsistent Cluster Log Persistence to Volume/S3 (stderr, stdout, log4j-active.log)

Saving logs from an all-purpose cluster to Volume or S3 is not consistent, because stderr, stdout, and log4j-active.log get overwritten when the cluster is restarted between minutes 01 and 59. Tested case: A job is configured to start every 20 minutes,...

Latest Reply
aleksandra_ch
Databricks Employee
  • 0 kudos

Hi @ccsalt, This is a known limitation. Log rotation (renaming to log4j-YYYY-MM-DD-HH.log.gz) only happens on the hour boundary. The active log file log4j-active.log always has the same name and is overwritten if a cluster restart happens within one...
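A minimal sketch of the behaviour described in this reply, under a simplified model: a restart that lands in the same clock hour as the previous start has no hour boundary in between, so nothing has been rotated and the active file is at risk. The function names are hypothetical, and the thread does not say which hour is stamped into the rotated name, so rotated_name only formats the pattern.

```python
from datetime import datetime

ACTIVE_LOG = "log4j-active.log"  # fixed name, as stated in the reply


def rotated_name(ts: datetime) -> str:
    """Format a timestamp using the log4j-YYYY-MM-DD-HH.log.gz pattern."""
    return ts.strftime("log4j-%Y-%m-%d-%H.log.gz")


def same_clock_hour(previous_start: datetime, restart: datetime) -> bool:
    """True if no hour boundary lies between the two cluster starts."""
    def truncate(ts: datetime) -> datetime:
        return ts.replace(minute=0, second=0, microsecond=0)
    return truncate(previous_start) == truncate(restart)


# Illustrative schedule: a job starting every 20 minutes (arbitrary date).
starts = [datetime(2026, 1, 2, 10, m) for m in (0, 20, 40)]
starts.append(datetime(2026, 1, 2, 11, 0))

# Pair each start with the next one and flag restarts inside the same hour.
at_risk = [same_clock_hour(a, b) for a, b in zip(starts, starts[1:])]

print(at_risk)
print(rotated_name(starts[0]))
```

Under this model, only the restart that crosses the hour boundary leaves a rotated file behind; the other restarts reuse the active file name.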

2 More Replies
KSharmaDE
by New Contributor
  • 177 Views
  • 3 replies
  • 0 kudos

Import Data from Databricks to SQL Server

Hi, our team wants to import data from Databricks catalog tables to SQL Server. Is it possible to do so using an SSIS package on SQL Server? What settings are required on the Databricks tables? Please suggest some ETL tools and how to do it using SSIS.

Latest Reply
sudhirr
New Contributor II
  • 0 kudos

Yes, it is possible to integrate SSIS packages with Delta tables using JDBC/ODBC connectivity. Required on Databricks side: SQL Warehouse or interactive cluster; JDBC/ODBC driver; Hostname, HTTP Path, Port 443, and PAT token; Proper table permissions in Unit...
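To show how the hostname, HTTP path, port, and PAT token listed above fit together, here is a small sketch that assembles a JDBC-style connection string. The layout follows the Databricks JDBC driver convention as I understand it (AuthMech=3 with UID=token for PAT authentication) and should be checked against the driver documentation; the helper name and all values are placeholders. SSIS itself would more likely go through the ODBC driver, where the same fields are entered in the DSN.

```python
def databricks_jdbc_url(hostname: str, http_path: str, token: str, port: int = 443) -> str:
    """Assemble a Databricks JDBC URL from the connection details (assumed layout)."""
    return (
        f"jdbc:databricks://{hostname}:{port};"
        f"httpPath={http_path};AuthMech=3;UID=token;PWD={token}"
    )


# Placeholder values only; substitute the details of your own SQL warehouse.
url = databricks_jdbc_url(
    hostname="example-workspace.cloud.databricks.com",
    http_path="/sql/1.0/warehouses/abc123",
    token="<personal-access-token>",
)
print(url)
```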

2 More Replies
loujiang
by New Contributor II
  • 205 Views
  • 1 reply
  • 0 kudos

Resolved! Databricks Runtime, Pyspark and Spark Versions

Hello, dear community, I was going through the documentation of the from_xml function (pyspark.sql.functions.from_xml, PySpark 4.1.2 documentation), which states that it is available in PySpark versions higher than 4.0.0. Meanwhile, we have documentation fo...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @loujiang, Databricks Runtime is not a vanilla Apache Spark distribution. DBR is built on top of a highly optimized version of Apache Spark, but also adds enhancements and additional components that substantially improve usability, performance, an...

micheloh
by New Contributor
  • 249 Views
  • 4 replies
  • 1 kudos

Resolved! Create External Catalog when dbname has special characters

Hi experts, I'm having a problem when trying to create an external catalog with my PostgreSQL database. The connection is fine, but the name of the database that I want to connect to has dashes and a colon (e.g. my-db-prod:all). When trying to connect with it, I a...

Latest Reply
Ashwin_DSA
Databricks Employee
  • 1 kudos

Hi @micheloh, From what we’ve seen, this is currently a limitation of Lakehouse Federation foreign catalog creation rather than a problem with the connection itself. The PostgreSQL connection can succeed, but the database value used when creating the...

3 More Replies