Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Phani1
by Databricks MVP
  • 71 Views
  • 1 reply
  • 0 kudos

Databricks Cost Estimation Template

Hi Databricks Team, Is there a standard Databricks cost estimation template (Excel), sizing calculator, or TCO tool that allows us to provide the following inputs and derive an approximate monthly and annual platform cost: source systems and their types (...

Latest Reply
emma_s
Databricks Employee
  • 0 kudos

Hi, There isn't anything publicly available that I'm aware of. For this kind of complex migration I'd recommend working with your account team. As somebody who does Databricks sizing a lot, it's a nuanced art which I suspect is why we don't have any ...

AanchalSoni
by Contributor
  • 84 Views
  • 4 replies
  • 1 kudos

Resolved! Checkpoint Location Error

Hi! I'm facing an error related to the checkpoint whenever I try to display a DataFrame using Auto Loader in Databricks Free Edition. Please refer to the screenshot. To work around this, I have to delete the checkpoint folder and then execute the display or writ...

Latest Reply
Ashwin_DSA
Databricks Employee
  • 1 kudos

Hi @AanchalSoni, I can’t see the full history of your notebook, so I’m not sure of the exact cause. But the behaviour strongly suggests that an earlier version of the stream used complete mode against the same checkpointLocation, and that configurati...

3 More Replies
AanchalSoni
by Contributor
  • 63 Views
  • 2 replies
  • 2 kudos

Resolved! NULL rows getting inserted in delta table- Schema mismatch

I'm trying to add the _metadata column while reading a JSON file:

from pyspark.sql.functions import col
from pyspark.sql.types import StructType, StructField, LongType, TimestampType
df_accounts_read = spark.readStream.format("cloudFiles").\
    option("clo...

Latest Reply
Ashwin_DSA
Databricks Employee
  • 2 kudos

Hi @AanchalSoni, Looking at the first snapshot, it appears the path in all three records points to the checkpoint location. The _metadata column isn’t the root cause here. The issue is that Autoloader is ingesting your checkpoint files as data. Becau...

1 More Replies
GeKo
by Contributor
  • 84 Views
  • 2 replies
  • 1 kudos

how to reliably get the timestamp of the last write/delete activity on a unity catalog table

Hi, I'd like to know the best, most reliable way to discover when the data in a Unity Catalog table was last modified (meaning rows added or deleted). Either Python or SQL, doesn't matter... as long as I can use it within a scheduled job to run it period...

Latest Reply
Louis_Frolio
Databricks Employee
  • 1 kudos

Greetings @GeKo  Good question. Short answer: treat the Delta transaction log as your source of truth. Every write, delete, or merge on a Unity Catalog table creates a commit with a timestamp and operation type. DESCRIBE HISTORY gives you access to a...
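The reply above can be sketched with a short SQL example. This is a hedged sketch, not a definitive implementation: the three-part table name is a placeholder, and whether DESCRIBE HISTORY can be used directly inside a FROM clause depends on your runtime version.

```sql
-- Placeholder table name. DESCRIBE HISTORY returns one row per commit,
-- including its timestamp and operation type.
DESCRIBE HISTORY my_catalog.my_schema.my_table;

-- On runtimes that allow DESCRIBE HISTORY in a subquery, the latest
-- data-changing commit can be isolated by filtering on operation type:
SELECT MAX(timestamp) AS last_modified
FROM (DESCRIBE HISTORY my_catalog.my_schema.my_table)
WHERE operation IN ('WRITE', 'MERGE', 'DELETE', 'UPDATE', 'STREAMING UPDATE');
```

The filter on `operation` is the key design choice: it excludes metadata-only commits (e.g. OPTIMIZE) so the result reflects actual data changes.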

1 More Replies
gkapri
by New Contributor II
  • 795 Views
  • 15 replies
  • 0 kudos

DLT table reading not performing file pruning on partition column

I have created a bronze table partitioned on processing date, which is a date column. In the silver table I am filtering on the processing date column to read the last 2 days of data, but it is reading 37 million rows even though I only have 24,722 in the last 2 day...
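One way to check whether the partition filter actually prunes files is to inspect the physical plan. This is a sketch under assumptions: the table and column names below are inferred from the post, not taken from the actual pipeline.

```sql
-- Look for a PartitionFilters entry on processing_date in the plan output;
-- if the filter appears only as a post-scan Filter, pruning is not happening.
EXPLAIN FORMATTED
SELECT *
FROM my_catalog.bronze.my_bronze_table
WHERE processing_date >= date_sub(current_date(), 2);
```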

Attachment: gkapri_0-1770221522007.png
Latest Reply
SteveOstrowski
Databricks Employee
  • 0 kudos

Hi @Anish_2, Looking at your pipeline DAG, the issue is that you have two separate APPLY CHANGES INTO flows both targeting the same silver table (ag_vlc_hist), one from ag_swt_vlchistory_historical and one from ag_swt_vlchistory. When you define mult...

14 More Replies
xwu
by New Contributor II
  • 172 Views
  • 1 reply
  • 1 kudos

Resolved! Iceberg native table Streaming in databricks

Hi! I’ve been exploring the new Managed Iceberg tables integration and noticed a potential discrepancy between the documentation and actual behavior regarding streaming/incremental workloads. According to the official limitations, managed Iceberg tab...

Attachments: xwu_5-1773939524300.png, Capture d’écran 2026-03-19 175915.png
Latest Reply
Ashwin_DSA
Databricks Employee
  • 1 kudos

Hi @xwu, Given that managed Iceberg and many of its features are still in Public Preview and explicitly "subject to change," you should treat this as a preview or advanced usage, not as a contractually supported workaround. In other words, it is not ...

Saf4Databricks
by Contributor
  • 160 Views
  • 2 replies
  • 0 kudos

Resolved! Why this notebook is returning an error only when called by another notebook?

When I uncomment the last two lines of Called_Notebook.py and run it manually by itself, it correctly returns the output as: Status: SUCCESS, Circle area: 50.26544. But when I comment out the last two lines of Called_Notebook.py and run it from the Caller...

Latest Reply
Saf4Databricks
Contributor
  • 0 kudos

Hi @pradeep_singh, your suggestion worked. Thank you for sharing your knowledge. Worth noting that not including dbutils.notebook.exit(f"{Value to return}") raised the error in the exception block of the function inside the Called_Notebook - and th...

1 More Replies
Anandhi-Sekaran
by New Contributor II
  • 115 Views
  • 3 replies
  • 1 kudos

Refresh streaming table error

The REFRESH STREAMING TABLE SQL statement succeeds the first time. Subsequent refresh statements fail with a TABLE_OR_VIEW_NOT_FOUND error, even though the streaming table is still available in the same catalog and schema.

Latest Reply
Anandhi-Sekaran
New Contributor II
  • 1 kudos

Hi, here is the query I use: REFRESH STREAMING TABLE `cxxxxx`.`tgt_dev`.`ldp_csv`. It is successful when I execute it the first time. If I run the same query after 30 minutes, it throws the error TABLE_OR_VIEW_NOT_FOUND.

2 More Replies
Sega2
by New Contributor III
  • 187 Views
  • 2 replies
  • 0 kudos

Resolved! Creating a sync table from a workspace catalog to a project

We have a table in a workspace we would like to sync to a project. We can choose the database project fine, but we cannot see the database in the first section (Destination); see the attached file.

Latest Reply
sv_databricks
Databricks Employee
  • 0 kudos

Hi @Sega2, Thanks for sharing the screenshot, this helps clarify what's happening. There are two likely reasons why you're not seeing your database in the Destination section: 1. The source table must be in Unity Catalog. Synced tables only support ...

1 More Replies
Saf4Databricks
by Contributor
  • 292 Views
  • 8 replies
  • 0 kudos

Alternative of spark.sql.globalTempDatabase

Question: Since I'm using Databricks Free Edition, which uses only serverless compute, I cannot use spark.sql.globalTempDatabase in my code below. What's an alternative solution for the Caller_Notebook below? The following error occurred in the second line...

Latest Reply
balajij8
Contributor
  • 0 kudos

Hi @Saf4Databricks You can use the below.

Called_Notebook:
spark.range(5).toDF("MyCol").createOrReplaceTempView("MyView")

Caller_Notebook:
%run "./Notebook"
display(table("MyView"))

Result: MyCol = 0, 1, 2, 3, 4

7 More Replies
Saf4Databricks
by Contributor
  • 283 Views
  • 6 replies
  • 1 kudos

Resolved! Why my calling notebook is not receiving the value of a variable in called notebook?

Remarks: I thought you could use the %run command to make variables defined in one notebook available in another. The %run command executes the specified notebook inline within the current notebook's session, so all functions, variables, and DataFrames def...

Latest Reply
Saf4Databricks
Contributor
  • 1 kudos

Hi @Ashwin_DSA, thank you for pointing out the cause of the error. This post can now be locked/closed.

5 More Replies
Dhruv-22
by Contributor III
  • 765 Views
  • 10 replies
  • 0 kudos

Merge with schema evolution fails because of upper case columns

The following is a minimal reproducible example of what I'm facing right now.

%sql
CREATE OR REPLACE TABLE edw_nprd_aen.bronze.test_table ( id INT );
INSERT INTO edw_nprd_aen.bronze.test_table VALUES (1);
SELECT * FROM edw_nprd_aen.bronze.test_tab...
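For context, the merge-with-schema-evolution pattern under discussion can be sketched as follows. This is a sketch under assumptions: source_updates is a hypothetical staging table, and the WITH SCHEMA EVOLUTION clause requires a recent Databricks runtime.

```sql
-- Sketch only: automatically evolve the target schema during the merge.
MERGE WITH SCHEMA EVOLUTION INTO edw_nprd_aen.bronze.test_table AS t
USING source_updates AS s
  ON t.id = s.id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;
```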

Attachments: Dhruv22_0-1768233514715.png, Dhruv22_1-1768233551139.png, Dhruv22_0-1768234077162.png
Latest Reply
SteveOstrowski
Databricks Employee
  • 0 kudos

Hi @Dhruv-22, I did check with our product teams and they agree with what I wrote above; if you have a support contract, please open a ticket about it. They are aware of this behavior and the workaround needed. However, they haven't seen this af...

9 More Replies
Sumeet2
by New Contributor II
  • 210 Views
  • 1 reply
  • 1 kudos

Resolved! Connect to a delta table to django web app

Hi! I am building a Django web app; it's running locally for now. I am using databricks-sql-connector to run a simple query 'select * from catalog.schema.table_name' and display it on an HTML page. I keep getting an error that the view or table is not fou...

Attachments: error.png, query-run-proof.png
Latest Reply
balajij8
Contributor
  • 1 kudos

You can use cursor.execute("SELECT * FROM scidstools.assetmanager.trucks") instead of cursor.execute('SELECT * FROM `scidstools.assetmanager.trucks`'). More info here.

loic
by Contributor
  • 480 Views
  • 4 replies
  • 3 kudos

Resolved! Transfer ownership of a Delta Share

Hello, I would like to clarify a point about Delta Share ownership, because something is not clear in the Databricks documentation. On one side, on the Delta Sharing page, it is written that the "metastore admin" role is needed in order to change ...

Latest Reply
SteveOstrowski
Databricks Employee
  • 3 kudos

Hi @loic, You can transfer ownership of a Delta Share using the ALTER SHARE ... OWNER TO SQL command. This is available in Databricks SQL and Databricks Runtime 11.3 LTS and above. Using SQL, the syntax is straightforward: ALTER SHARE my_share OWNER T...
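The command referenced in the reply, completed as a minimal sketch (the share name and principal below are placeholders, not values from the thread):

```sql
-- Transfer ownership of a Delta Share to another user or group.
ALTER SHARE my_share OWNER TO `new_owner@example.com`;
```

The new owner must be a principal known to the metastore; running the command requires ownership of the share or metastore admin privileges.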

3 More Replies