Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

batch_bender
by New Contributor
  • 20 Views
  • 0 replies
  • 0 kudos

Does liquid clustering preserve auditable tenant separation in a shared Delta table architecture?

We’re evaluating a multi-tenant Databricks architecture and considering Liquid Clustering on shared Delta tables. Our concern is that tenant SLAs require data separation for audit/compliance purposes. I’m trying to understand whether Liquid Clusterin...

Shivaprasad
by Contributor
  • 68 Views
  • 3 replies
  • 0 kudos

Can we create a materialized view in Databricks using an all-purpose cluster?

I was unable to create a materialized view in Databricks using an all-purpose cluster. Do we need a serverless cluster to create an MV?

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @Shivaprasad, no, all-purpose clusters are not supported. Standalone materialized views can be created/refreshed either from a Unity Catalog-enabled Pro or Serverless SQL Warehouse, or from a notebook attached to Serverless General Compute. So we...

2 More Replies
plankton
by New Contributor
  • 468 Views
  • 10 replies
  • 3 kudos

Resolved! R plots not rendering

Has anyone been experiencing the issue of R plots not rendering in notebooks, starting a few days ago? It's not related to sparklyr or plotly, or specific data types, or anything. For example, in base R, plot(1:3, 5:7) calculates without error, but does ...

Latest Reply
plankton
New Contributor
  • 3 kudos

Looks like the issue has been resolved. Thanks everyone for chiming in and thanks 'bricks for whatever you did to resolve this. Plankton out!

9 More Replies
maikel
by Contributor III
  • 196 Views
  • 3 replies
  • 1 kudos

Resolved! Job tasks monitoring

Hello Community, We have a case in our project that we would like to solve in an elegant and scalable manner. As always, I would really appreciate your suggestions and experience. In short: We have a multi-step job consisting of 4 stages. In one of the ...

Latest Reply
maikel
Contributor III
  • 1 kudos

@MoJaMa thanks a lot for these suggestions!

2 More Replies
ccsalt
by New Contributor
  • 184 Views
  • 3 replies
  • 0 kudos

Inconsistent Cluster Log Persistence to Volume/S3 (stderr, stdout, log4j-active.log)

Saving logs from an all-purpose cluster to Volume or S3 is not consistent, because stderr, stdout, and log4j-active.log get overwritten when the cluster is restarted between minutes 01 and 59. Tested case: A job is configured to start every 20 minutes,...

Latest Reply
aleksandra_ch
Databricks Employee
  • 0 kudos

Hi @ccsalt, this is a known limitation. Log rotation (renaming to log4j-YYYY-MM-DD-HH.log.gz) only happens on the hour boundary. The active log file log4j-active.log always has the same name and is overwritten if a cluster restart happens within one...

2 More Replies
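
One possible way to picture the limitation described in the reply above: because log4j-active.log keeps the same name until the hour boundary, a copy taken under a unique, timestamped name cannot be clobbered by a restart. The following is only a minimal sketch under stated assumptions, not a Databricks-provided mechanism. The function name `archive_active_log`, the directory arguments, and the second-level timestamp format are all hypothetical; where the logs actually live on your Volume or S3 mount depends on your cluster log delivery settings.

```python
import shutil
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def archive_active_log(
    log_dir: Path, archive_dir: Path, now: Optional[datetime] = None
) -> Optional[Path]:
    """Copy log4j-active.log to a timestamped file (hypothetical helper).

    Returns the path of the copy, or None if no active log exists.
    """
    active = log_dir / "log4j-active.log"
    if not active.exists():
        return None

    # Assumption: a UTC timestamp down to the second is unique enough
    # to distinguish restarts that happen within the same hour.
    now = now or datetime.now(timezone.utc)
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / f"log4j-{now:%Y-%m-%d-%H-%M-%S}.log"

    # copy2 keeps the file's modification time alongside its contents.
    shutil.copy2(active, target)
    return target
```

Something like this would have to run before the cluster stops (for example as a final job task), which is a design assumption on my side and not something stated in the thread.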
KSharmaDE
by New Contributor
  • 149 Views
  • 3 replies
  • 0 kudos

Import Data from Databricks to SQL Server

Hi, our team wants to import data from Databricks catalog tables to SQL Server. Is it possible to do so using an SSIS package on SQL Server? What settings are required on the Databricks tables? Please suggest some ETL tools and how to do it using SSIS.

Latest Reply
sudhirr
New Contributor II
  • 0 kudos

Yes, it is possible to integrate SSIS packages with Delta tables using JDBC/ODBC connectivity.
Required on Databricks side:
  • SQL Warehouse or interactive cluster
  • JDBC/ODBC driver
  • Hostname, HTTP Path, Port 443, and PAT token
  • Proper table permissions in Unit...

2 More Replies
loujiang
by New Contributor II
  • 140 Views
  • 1 replies
  • 0 kudos

Resolved! Databricks Runtime, Pyspark and Spark Versions

Hello, dear community, I was going through the documentation of the function from_xml (pyspark.sql.functions.from_xml, PySpark 4.1.2 documentation), which states that it is available in PySpark versions higher than 4.0.0. Meanwhile, we have documentation fo...

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 0 kudos

Hi @loujiang, Databricks Runtime is not a vanilla Apache Spark distribution. DBR is built on top of a highly optimized version of Apache Spark, but also adds enhancements and additional components that substantially improve usability, performance, an...

micheloh
by New Contributor
  • 181 Views
  • 4 replies
  • 1 kudos

Resolved! Create External Catalog when dbname has special characters

Hi experts, I'm having a problem when trying to create an external catalog with my PostgreSQL database. The connection is fine, but the name of the database that I want to connect to has dashes and a colon (e.g. my-db-prod:all). When trying to connect with it, I a...

Latest Reply
Ashwin_DSA
Databricks Employee
  • 1 kudos

Hi @micheloh, From what we’ve seen, this is currently a limitation of Lakehouse Federation foreign catalog creation rather than a problem with the connection itself. The PostgreSQL connection can succeed, but the database value used when creating the...

3 More Replies
Pranav_1699
by New Contributor II
  • 151 Views
  • 1 replies
  • 1 kudos

Building a Spark Declarative Pipeline OSS with Apache Iceberg and AWS Glue Catalog

Hey everyone,
I recently worked on building a modern financial data lakehouse using Spark Declarative Pipeline OSS (SDP OSS), Apache Iceberg, and AWS Glue Catalog.
The blog covers:
- Building declarative data pipelines with Spark
- Using Apache Iceberg a...

Data Engineering
Spark Declarative Pipelines
Latest Reply
sameer_yasser
New Contributor
  • 1 kudos

really cool

Brahmareddy
by Esteemed Contributor
  • 223 Views
  • 1 replies
  • 2 kudos

Too Many Tools Can Slow Good Data Teams Down

A Small Thing I Keep Noticing in Data Projects
Lately, I have been thinking about something I have seen again and again in big data projects. At the start, everything feels manageable. One tool is used for ingestion. Another one is used for transformat...

Latest Reply
sameer_yasser
New Contributor
  • 2 kudos

Honest advice: teams should use Databricks for Data, BI, ML, and AI and close the tab. The depth of what's already there surprises most people once they actually dig in. The real problem isn't the tooling, it's that everyone chases the next shiny thin...

JstelaBR
by Databricks Partner
  • 163 Views
  • 1 replies
  • 1 kudos

Is Databricks AI/BI Genie worth it if we already have Power BI or Tableau?

One thing that really changed how I think about BI platforms happened while I was working in a large enterprise environment heavily invested in Tableau.On paper, the environment looked mature: lots of dashboards, lots of business areas onboarded, and...

Latest Reply
sameer_yasser
New Contributor
  • 1 kudos

Definitely yes, and I'll back that with a real data point. Last week I ran a POC where I replicated our most complex Power BI dashboard in Genie. The original took our team about a month to build. Genie reproduced it in under 10 minutes with zero manua...

Bank_Kirati
by New Contributor III
  • 88 Views
  • 1 replies
  • 0 kudos

Cross-region S3 reads suddenly fail with 400 Bad Request — eu-west-1 metastore to af-south-1 bucket

What changed
A production daily job that has worked unchanged for ~8 months started failing on 2026-05-18 ~23:46 UTC. The notebook does a plain spark.read.json("s3://BUCKET/...") against a bucket in af-south-1. The metastore is in eu-west-1. Same code...

Latest Reply
sameer_yasser
New Contributor
  • 0 kudos

Your debugging is really thorough and you've already done the hard work of isolating this. The 400 with an empty body (no proper S3 error code like InvalidArgument) on an opt-in region is almost always one thing: SigV4 signing region mismatch. af-sou...

Rahul_Dhankhar
by New Contributor
  • 100 Views
  • 1 replies
  • 2 kudos

Seeking Volunteers with Lakehouse, Fabric, Databricks, or Snowflake Experience

Hello everyone, I am a doctoral researcher at the University of the Cumberlands and am seeking 2–3 volunteers for a 20–25-minute field test for my dissertation research on Lakehouse platform adoption. The field test will be conducted over Zoom or Microsof...

Latest Reply
sameer_yasser
New Contributor
  • 2 kudos

I am interested. Let me know. 

ManojkMohan
by Honored Contributor II
  • 850 Views
  • 3 replies
  • 1 kudos

Resolved! ML-specific compute in Databricks Free Edition

Given that Databricks Free Edition has serverless compute only, is there any workaround to choose ML-specific compute like the one shown below? Is paying for it the only option?

ManojkMohan_0-1754653497247.png
Latest Reply
pjvi
New Contributor II
  • 1 kudos

Hi, in May 2026 I tried with environment v5 and still have the same issue. However, it looks like a Databricks employee answered shortly before that it was available again in environment v4, but it is not working for me in either v4 or v5. https://www.reddi...

2 More Replies