Get Started Discussions
Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.

Forum Posts

Hubert-Dudek
by Databricks MVP
  • 92 Views
  • 1 reply
  • 2 kudos

Databricks Advent Calendar 2025 #18

Automatic file retention in Auto Loader is one of my favourite new features of 2025. It can automatically move processed cloud files to cold storage or simply delete them.
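
A minimal sketch of how this could look in an Auto Loader stream. The cloudFiles.cleanSource option names, paths, and table names below are assumptions based on this post, so verify the exact spelling against the Auto Loader docs for your runtime.

```python
# Hedged sketch: cloudFiles.cleanSource* option names and all paths/tables
# here are assumptions inferred from the post, not verified documentation.
(spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "json")
    # After the retention window, move processed files to an archive location...
    .option("cloudFiles.cleanSource", "MOVE")
    .option("cloudFiles.cleanSource.retentionDuration", "7 days")
    .option("cloudFiles.cleanSource.moveDestination",
            "abfss://archive@mystorage.dfs.core.windows.net/events")
    # ...or set cloudFiles.cleanSource to "DELETE" to drop them instead.
    .load("abfss://landing@mystorage.dfs.core.windows.net/events")
    .writeStream
    .option("checkpointLocation", "/Volumes/main/bronze/_checkpoints/events")
    .toTable("main.bronze.events"))
```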

Latest Reply
szymon_dybczak
Esteemed Contributor III
  • 2 kudos

Thanks for sharing, @Hubert-Dudek! That's a really great feature. It simplified the data maintenance process a lot at one of my clients.

Hubert-Dudek
by Databricks MVP
  • 91 Views
  • 0 replies
  • 0 kudos

Databricks Advent Calendar 2025 #16

For many data engineers who love PySpark, the most significant improvement of 2025 was the addition of MERGE to the DataFrame API, so the Delta library or SQL is no longer needed to perform a MERGE. P.S. I still prefer SQL MERGE inside spark.sql()
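
A rough sketch of both styles, assuming the Spark 4.x mergeInto DataFrame API; the table and column names are hypothetical, and the way the target is referenced in the condition follows my reading of the API, so check the PySpark docs before relying on it.

```python
from pyspark.sql.functions import expr

# Hedged sketch of the DataFrame MERGE; tables/columns are made up, and the
# target reference inside the condition may need adjusting for your setup.
updates = spark.table("main.staging.customer_updates").alias("src")

(updates.mergeInto("main.silver.customers",
                   expr("src.customer_id = customers.customer_id"))
    .whenMatched().updateAll()      # update existing customers
    .whenNotMatched().insertAll()   # insert new ones
    .merge())

# The SQL MERGE the post prefers, run through spark.sql():
spark.sql("""
  MERGE INTO main.silver.customers AS tgt
  USING main.staging.customer_updates AS src
    ON tgt.customer_id = src.customer_id
  WHEN MATCHED THEN UPDATE SET *
  WHEN NOT MATCHED THEN INSERT *
""")
```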

Hubert-Dudek
by Databricks MVP
  • 87 Views
  • 0 replies
  • 2 kudos

Databricks Advent Calendar 2025 #15

The new Lakebase experience is a game-changer for transactional databases. That functionality is fantastic. Autoscaling to zero makes it really cost-effective. Need to deploy to prod? Just branch the production database to the release branch, an...

Hubert-Dudek
by Databricks MVP
  • 104 Views
  • 0 replies
  • 0 kudos

Databricks Advent Calendar 2025 #14

Ingestion from SharePoint is now available directly in PySpark. Just define a connection and use spark.read or, even better, spark.readStream with Auto Loader, then specify the file type and its options (pdf, csv, Excel, etc.)
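
A hedged sketch of what that could look like. The connection name, the option names, and the SharePoint URL below are all assumptions inferred from the post's description rather than confirmed syntax.

```python
# Hedged sketch: option names and path format are assumptions based on the
# post; the UC connection "sharepoint_conn" and the site URL are hypothetical.
df = (spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", "csv")                  # file type, per the post (pdf/csv/Excel/...)
      .option("databricks.connection", "sharepoint_conn")  # UC connection defined beforehand (assumed option name)
      .option("header", "true")
      .load("https://contoso.sharepoint.com/sites/finance/Shared Documents/reports"))

(df.writeStream
   .option("checkpointLocation", "/Volumes/main/bronze/_checkpoints/sharepoint_reports")
   .toTable("main.bronze.sharepoint_reports"))
```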

Hubert-Dudek
by Databricks MVP
  • 123 Views
  • 0 replies
  • 2 kudos

Databricks Advent Calendar 2025 #10

Databricks goes native on Excel. You can now ingest and query .xls/.xlsx files directly in Databricks (SQL and PySpark, batch and streaming), with automatic schema/type inference, sheet and cell-range targeting, and evaluated formulas; no extra libraries needed anymore.
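
A minimal sketch of how a native Excel read might look in PySpark. The "excel" format name and the option names are assumptions based on this announcement, and the paths are made up for illustration.

```python
# Hedged sketch: format name and options are assumptions, not verified API.
df = (spark.read
      .format("excel")
      .option("sheetName", "Q4 Forecast")   # sheet targeting (assumed option name)
      .option("dataAddress", "A1:F200")     # cell-range targeting (assumed option name)
      .option("header", "true")             # take column names from the first row
      .load("/Volumes/main/raw/finance/forecast.xlsx"))

df.printSchema()   # schema and types are inferred automatically, per the post
```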

Hubert-Dudek
by Databricks MVP
  • 111 Views
  • 0 replies
  • 2 kudos

Databricks Advent Calendar 2025 #9

Tags, whether assigned manually or automatically by the “data classification” service, can be protected using policies. Column masking can automatically mask columns carrying a given tag for everyone except users with elevated access.
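
The column-mask half of this is plain Unity Catalog SQL; the tag-driven policy that attaches the mask automatically is the new part described in the post, so the sketch below only shows a mask function and a manual attachment, with hypothetical function, table, and group names.

```python
# Standard UC column mask, run through spark.sql(); names are hypothetical.
spark.sql("""
  CREATE OR REPLACE FUNCTION main.gov.mask_email(email STRING)
  RETURN CASE
    WHEN is_account_group_member('pii_readers') THEN email   -- elevated access sees the real value
    ELSE '***REDACTED***'                                     -- everyone else gets the mask
  END
""")

# Manual attachment; per the post, a tag-based policy can apply the same mask
# automatically to every column carrying e.g. a 'pii' tag.
spark.sql("""
  ALTER TABLE main.crm.contacts
  ALTER COLUMN email SET MASK main.gov.mask_email
""")
```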

Hubert-Dudek
by Databricks MVP
  • 126 Views
  • 0 replies
  • 2 kudos

Databricks Advent Calendar 2025 #7

Imagine that all a data engineer or analyst needs to do to read from a REST API is call spark.read(): no direct request calls, no manual JSON parsing, just spark.read. That’s the power of a custom Spark Data Source. Soon, we will see a surge of open-sour...
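
A compact example of the PySpark Python Data Source API the post alludes to; the endpoint URL, schema, and field names are hypothetical.

```python
from pyspark.sql.datasource import DataSource, DataSourceReader

# Minimal Python Data Source reading a (hypothetical) REST endpoint;
# URL, schema, and fields are made up for illustration.
class RestApiDataSource(DataSource):
    @classmethod
    def name(cls):
        return "rest_api"

    def schema(self):
        return "id INT, name STRING"

    def reader(self, schema):
        return RestApiReader(self.options)


class RestApiReader(DataSourceReader):
    def __init__(self, options):
        self.url = options.get("url")

    def read(self, partition):
        import requests  # imported on the executor
        for item in requests.get(self.url, timeout=30).json():
            yield (item["id"], item["name"])


spark.dataSource.register(RestApiDataSource)

df = (spark.read
      .format("rest_api")
      .option("url", "https://api.example.com/items")
      .load())
```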

Hubert-Dudek
by Databricks MVP
  • 142 Views
  • 0 replies
  • 2 kudos

Databricks Advent Calendar 2025 #6

DQX is one of the most crucial Databricks Labs projects this year, and we can expect more and more of its great checks to be supported natively in Databricks. More about DQX at https://databrickslabs.github.io/dqx/
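
A hedged sketch of running DQX checks over a DataFrame. The check-function name and argument keys vary between DQX releases, so treat everything below as an assumption and follow the linked docs for the current API; the table name is hypothetical.

```python
# Hedged sketch of DQX usage; function/argument names are assumptions.
from databricks.labs.dqx.engine import DQEngine
from databricks.sdk import WorkspaceClient

dq_engine = DQEngine(WorkspaceClient())

checks = [
    {
        "criticality": "error",
        "check": {"function": "is_not_null", "arguments": {"col_name": "customer_id"}},
    },
]

input_df = spark.table("main.bronze.customers")
# Split rows into those that pass the checks and those routed to quarantine.
valid_df, quarantined_df = dq_engine.apply_checks_by_metadata_and_split(input_df, checks)
```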
