Community Articles

by smpa01 • Contributor

06-19-2025 7:43:02 AM

1861 Views
2 replies
0 kudos

Resolved! Databricks VS code extension to add cell title

I use the Databricks extension in VS Code for all my work. Is there any way for me to add a cell title from the extension itself? There is no point in adding it in the server version of this notebook because when I sync the local copy to the server, it will overwr...

Latest Reply

smpa01
Contributor

07-18-2025 12:06:33 PM

0 kudos

One needs to use # DBTITLE 1,cell_title in a .py file:

# COMMAND ----------
# DBTITLE 1,Title 1
from pyspark.sql import SparkSession
from delta.tables import DeltaTable
from pyspark.sql.functions import *
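
For anyone generating such files programmatically, the marker layout from this reply can be sketched in Python. This is only an illustration: the helper name build_notebook_source and the sample cells are invented for this example, and it assumes the usual "# Databricks notebook source" header line found at the top of notebooks exported as .py files.

```python
# Hypothetical helper (not part of the Databricks extension): builds the text
# of a .py notebook whose cells carry titles via "# DBTITLE 1,<title>".

HEADER = "# Databricks notebook source"  # assumed first line of a .py notebook
CELL_SEPARATOR = "# COMMAND ----------"   # marks the start of a new cell


def build_notebook_source(cells):
    """Build notebook text from (title, code) pairs; title may be None."""
    parts = [HEADER]
    for index, (title, code) in enumerate(cells):
        if index > 0:
            # Every cell after the first is introduced by the separator line.
            parts.extend(["", CELL_SEPARATOR, ""])
        if title:
            # The title marker sits directly above the cell's code.
            parts.append(f"# DBTITLE 1,{title}")
        parts.append(code.rstrip("\n"))
    return "\n".join(parts) + "\n"


example = build_notebook_source([
    ("Title 1", "from pyspark.sql import SparkSession"),
    ("Title 2", "from delta.tables import DeltaTable"),
])
```

Writing `example` to a local .py file and syncing it should, under these assumptions, show two titled cells in the workspace.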

by RiyazAliM • Honored Contributor

07-18-2025 5:39:45 AM

2859 Views
1 reply
4 kudos

The Databricks Python SDK

The Databricks SDK is a library (written in Python, in our case) that lets you control and automate actions on Databricks using the methods available in the WorkspaceClient (more about this below). Why do we need the Databricks SDK? Automation: You can d...
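
As a rough sketch of what driving Databricks through the SDK can look like, the snippet below lists cluster names through a WorkspaceClient. It assumes the databricks-sdk package is installed and that authentication is already configured (for example through environment variables or a configuration profile); the helper names cluster_names and main are invented for this example and are not from the article.

```python
def cluster_names(client):
    """Return the sorted names of the clusters visible to `client`.

    `client` is expected to behave like databricks.sdk.WorkspaceClient:
    `client.clusters.list()` yields objects with a `cluster_name` attribute.
    """
    return sorted(
        cluster.cluster_name
        for cluster in client.clusters.list()
        if cluster.cluster_name  # skip clusters without a name
    )


def main():
    # Imported here so the helper above can be tried without the SDK installed.
    from databricks.sdk import WorkspaceClient

    # With no arguments the client falls back to the SDK's default
    # authentication, which is assumed to be set up already.
    client = WorkspaceClient()
    for name in cluster_names(client):
        print(name)
```

Calling main() on a machine with valid credentials would print one cluster name per line; passing the client in as a parameter keeps the helper easy to test with a stand-in object.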

Latest Reply

sridharplv
Valued Contributor II

07-18-2025 5:48:51 AM

4 kudos

Good Article @RiyazAliM.

by ilir_nuredini • Honored Contributor

07-16-2025 4:18:29 PM

2871 Views
2 replies
4 kudos

Apache 4.0

Missed the Apache Spark 4.0 release? It is not just a version bump; it is a whole new level for big data processing. Some of the highlights that really stood out to me: 1. SQL just got way more powerful: reusable UDFs, scripting, session variables, an...

Latest Reply

Advika
Community Manager

07-17-2025 2:24:56 AM

4 kudos

Yeah, Spark 4.0 brings powerful enhancements while staying compatible with existing workloads. Thank you for putting this together and highlighting the key updates, @ilir_nuredini.

by nathanielcooley • New Contributor II

06-10-2025 1:21:26 PM

3677 Views
4 replies
0 kudos

Data Modeling

Just got out of a session on Data Modeling using the Data Vault paradigm. Highly recommended to help think through complex data design. Look out for Data Modeling 101 for Data Lakehouse Demystified by Luan Medeiros.

Latest Reply

sridharplv
Valued Contributor II

07-12-2025 11:56:50 AM

0 kudos

Hi @BS_THE_ANALYST, please use this link, which includes code, for reference: https://www.databricks.com/blog/data-vault-best-practice-implementation-lakehouse

by ilir_nuredini • Honored Contributor

07-11-2025 5:58:22 PM

1839 Views
0 replies
1 kudos

Databricks Asset Bundles

Why Should You Use Databricks Asset Bundles (DABs)? Without proper tooling, Data Engineering and Machine Learning projects can quickly become messy. That is why we recommend leveraging DABs to solve these common challenges: 1. Collaboration: Without stru...

by Brahmareddy • Esteemed Contributor II

08-12-2024 1:28:15 PM

14608 Views
8 replies
8 kudos

My Journey with Schema Management in Databricks

When I first started handling schema management in Databricks, I realized that a little bit of planning could save me a lot of headaches down the road. Here’s what I’ve learned and some simple tips that helped me manage schema changes effectively. On...

Latest Reply

Brahmareddy
Esteemed Contributor II

03-19-2025 8:26:19 PM

8 kudos

Haha, glad it made sense, Joao! Try it out, and if you run into any issues, just let me know. Always happy to help! And best friends? You got it!

by CURIOUS_DE • Valued Contributor

06-10-2025 9:14:50 PM

2164 Views
2 replies
6 kudos

🔐 How Do I Prevent Users from Accidentally Deleting Tables in Unity Catalog? 🔐

Question: I have a role called dev-dataengineer with the following privileges on the catalog dap_catalog_dev: APPLY TAG, CREATE FUNCTION, CREATE MATERIALIZED VIEW, CREATE TABLE, CREATE VOLUME, EXECUTE, READ VOLUME, REFRESH, SELECT, USE SCHEMA, WRITE VOLUME. Despite this, u...

Latest Reply

nayan_wylde
Esteemed Contributor II

07-01-2025 12:23:23 PM

6 kudos

Managing assets in UC always adds maintenance overhead. We keep these access controls in Terraform code, and it is always hard to see what level of access is given to different personas in the org. We are building an audit dashboard for it.

by shraddha_09 • New Contributor II

05-10-2025 6:58:04 AM

2347 Views
1 reply
1 kudos

Databricks Optimization Tips – What’s Your Secret?

When I first started working with Databricks, I was genuinely impressed by its potential. The seamless integration with Delta Lake, the power of PySpark, and the ability to process massive datasets at incredible speeds: it was truly impactful. Over tim...

Latest Reply

chanukya-pekala
Contributor III

06-13-2025 5:48:34 AM

1 kudos

1. Try to remove cache() and persist() from the DataFrame operations in the code base.
2. Avoid driver operations like collect() and take() entirely: the data from the executors is brought back to the driver, which incurs heavy network I/O overhead.
3. Av...

by prasannac • New Contributor

06-11-2025 8:04:06 AM

903 Views
0 replies
0 kudos

Request for a guest post

Hi, I hope you're doing well. My name is Prasanna. C, Digital Marketing Strategist at Express Analytics, a company that understands consumer behavior and provides analytics solutions and services to businesses. Express Analytics primarily offers...

by lucami • Contributor

06-10-2025 2:26:00 AM

2244 Views
2 replies
1 kudos

Automatic Liquid Clustering and PO

I spent some time working out how to use automatic liquid clustering with DLT pipelines. Hope this can help you as well.
Enable Predictive Optimization
Use this code:
# Enabling Automatic Liquid Clustering on a new table
@dlt.table(cluster_by_auto=Tr...

Latest Reply

lucami
Contributor

06-10-2025 4:51:18 AM

1 kudos

Hi @Addy0_, thanks for sharing how to set it for an existing table. Unfortunately, I think ALTER cannot be used with materialized views and streaming tables defined in DLT pipelines. I was looking for something similar to @dlt.table(cluster_by_auto=True, ...

by thedatanerd • Contributor

06-10-2025 12:20:42 AM

1226 Views
0 replies
1 kudos

Databricks Data Classification

I encourage you to try out a new beta feature in Databricks called Data Classification. It automatically classifies your catalog data and tags it. Docs: https://docs.databricks.com/aws/en/lakehouse-monitoring/data-classification

by xdx001 • New Contributor III

05-22-2025 7:00:16 AM

1117 Views
0 replies
1 kudos

Strong Databricks Fundamental - Gen Z

Why Databricks is the Future of Data Analytics for Gen Z
In the fast-paced world of data analytics, staying ahead of the curve is crucial. For Gen Z, who are digital natives and always on the lookout for the latest tech trends, understanding the diffe...

by ThomazRossito • Contributor

04-14-2024 4:31:33 PM

4178 Views
1 reply
1 kudos

Post: Lakehouse Federation - Databricks

In the world of data, innovation is constant. And the most recent revolution comes with Lakehouse Federation, a fusion between data lakes and data warehouses, taking data manipulation to a new level. This advancement...

data engineer

Lakehouse

SQL Analytics

Latest Reply

Freshman
New Contributor III

05-05-2025 8:08:56 PM

1 kudos

Hey, quick question: can we use it in production? Our application server is SQL Server, and we are planning to use Lakehouse Federation so we can avoid creating and maintaining hundreds of workflows. As we have a small dataset, I am not too sure o...

by Shahram • New Contributor II

05-02-2025 6:13:34 AM

1287 Views
0 replies
1 kudos

Hub Star Modeling 2.0 for Medallion Architecture

Excited to share my latest publication on arXiv! “Hub Star Modeling 2.0 for Medallion Architecture” https://arxiv.org/abs/2504.08788 This new version builds on the original Hub Star Modeling approach, published last year, and is now tailored for the Meda...

by genevive_mdonça • Databricks Employee

04-22-2025 8:32:20 AM

5058 Views
1 reply
6 kudos

Handling Complex Nested JSON in Databricks Using schemaHints

When I first got into managing schemas in Databricks, it took me a while to realize that putting in a little planning up front could save me a ton of headaches later on. I was working with these deeply nested, constantly changing JSON files. At first,...

Latest Reply

Advika
Community Manager

04-25-2025 6:08:56 AM

6 kudos

Great tip @genevive_mdonça! schemaHints help avoid issues with evolving JSON data, making data processing more reliable and easier to maintain. Thanks for sharing.
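
To make the idea concrete, here is a minimal sketch of how schema hints are commonly passed to Auto Loader, assuming the article refers to the cloudFiles.schemaHints option, which takes a comma-separated, DDL-style string. The helpers schema_hints and autoloader_options, the paths, and the column names are placeholders invented for this example and are not taken from the article.

```python
def schema_hints(hints):
    """Turn {"column.path": "TYPE"} into a DDL-style hint string."""
    return ", ".join(f"{column} {dtype}" for column, dtype in hints.items())


def autoloader_options(schema_location, hints):
    """Build an option dict for a JSON Auto Loader stream (hypothetical helper)."""
    return {
        "cloudFiles.format": "json",
        "cloudFiles.schemaLocation": schema_location,
        "cloudFiles.schemaHints": schema_hints(hints),
    }


options = autoloader_options(
    "/tmp/example/_schemas",  # placeholder schema location
    {"order.customer.id": "BIGINT", "order.items": "ARRAY<STRING>"},
)

# On a Databricks cluster, where `spark` is predefined, the options would be
# applied roughly as follows; kept as a comment because it needs a live session:
# df = (spark.readStream.format("cloudFiles")
#       .options(**options)
#       .load("/tmp/example/raw"))
```

Keeping the hints in a dictionary makes it easy to pin only the nested fields whose types matter while leaving the rest to schema inference.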
