Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

jllo
by New Contributor III
  • 3516 Views
  • 6 replies
  • 3 kudos

Azure Storage Account inside Databricks cannot enable soft-delete.

Hello, when deploying any Databricks workspace in Azure, the storage account inside the Databricks managed resource group cannot be changed in any way, including enabling soft-delete. Is there a way to enable it? Best regards, Jon

Latest Reply
Debayan
Esteemed Contributor III
  • 3 kudos

Hi, the default storage account within the default resource group cannot be altered.

5 More Replies
BenLambert
by Contributor
  • 1865 Views
  • 1 replies
  • 0 kudos

How to deal with deleted files in source directory in DLT?

We have a DLT pipeline that uses Auto Loader to detect files added to a source storage bucket. It reads these updated files and adds new records to a bronze streaming table. However, we would also like to automatically delete records from the bronz...

Latest Reply
Anonymous
Not applicable
  • 0 kudos

@Bennett Lambert: Yes, it is possible to automatically delete records from the bronze table when a source file is deleted, without doing a full refresh. One way to achieve this is by using the Change Data Capture (CDC) feature in Databricks Delta. CD...

Kearon
by New Contributor III
  • 4076 Views
  • 11 replies
  • 0 kudos

Process batches in a streaming pipeline - identifying deletes

OK. So I think I'm probably missing the obvious and tying myself in knots here. Here is the scenario: batch datasets arrive in JSON format in an Azure data lake; each batch is a complete set of "current" records (the complete table); these are processed us...
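Because each batch is described as a complete snapshot of the table, one way to spot deletes is to compare the keys of consecutive batches: any key present in the previous batch but missing from the current one has been deleted. The sketch below shows that idea in plain Python. It is only an illustration: the `id` key column, the function name, and the sample records are assumptions and do not come from the thread.

```python
def find_deleted_keys(previous_batch, current_batch, key="id"):
    """Return keys that appear in the previous snapshot but not in the current one.

    Both arguments are lists of dicts, each list being a full snapshot of the table.
    The key column name ("id") is a hypothetical choice for this example.
    """
    previous_keys = {record[key] for record in previous_batch}
    current_keys = {record[key] for record in current_batch}
    return previous_keys - current_keys


# Hypothetical snapshots: record 2 disappears between the two batches.
batch_1 = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 3, "name": "c"}]
batch_2 = [{"id": 1, "name": "a"}, {"id": 3, "name": "c2"}]

deleted = find_deleted_keys(batch_1, batch_2)
print(deleted)
```

In a Spark pipeline the same comparison would typically be expressed as an anti-join between the two snapshots on the key column rather than with Python sets.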

Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @Kearon McNicol, thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answer...

10 More Replies
KKo
by Contributor III
  • 1775 Views
  • 3 replies
  • 3 kudos

delete and append in delta path

I am deleting data from the curated path based on a date column and appending staged data to it on each run, using the script below. My fear is that, just after the delete operation, if a network issue occurs and the job stops before it appends the staged da...

Latest Reply
Kaniz_Fatma
Community Manager
  • 3 kudos

Hi @Kris Koirala, we haven't heard from you since the last response from @Hubert Dudek, and I was checking back to see if his suggestions helped you. Otherwise, if you have a solution, please share it with the community, as it can be helpful ...

2 More Replies
Aviral-Bhardwaj
by Esteemed Contributor III
  • 1136 Views
  • 0 replies
  • 31 kudos

Databricks New Runtime Version is Available Now

Databricks New Runtime Version is Available Now. PySpark memory profiling: Memory profiling is now enabled for PySpark user-defined functions. This provides information on memory increment, memory usage, and number of occurrences for each line of code...

Sumeet_Dora
by New Contributor II
  • 1403 Views
  • 2 replies
  • 4 kudos

Resolved! Write mode features in BigQuery using Databricks notebook.

Currently, using df.write.format("bigquery"), Databricks supports only append and overwrite modes when writing to BigQuery tables. Does Databricks have any option for executing DML statements such as MERGE in BigQuery from Databricks notebooks?

Latest Reply
mathan_pillai
Valued Contributor
  • 4 kudos

@Sumeet Dora, unfortunately there is no direct "merge into" option for writing to BigQuery using a Databricks notebook. You could write to an intermediate Delta table using the "merge into" option in Delta. Then read from the Delta table and pe...

1 More Replies
Anonymous
by Not applicable
  • 1568 Views
  • 1 replies
  • 0 kudos

Auto-deletion of unused jobs

Is there a setting that will auto-cleanup/delete jobs that are of a certain age (say 90 days old for example)?

Latest Reply
Ryan_Chynoweth
Honored Contributor III
  • 0 kudos

This is not available natively in Databricks, but you can write an administration script that analyzes your jobs data and automatically cleans up older jobs as needed. It would be easiest to do this with the Jobs API. List your jobs to get all the ...
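As a rough illustration of the filtering step in such a script, the sketch below picks out jobs older than a cutoff from an already-fetched job list. It assumes each job entry carries a `job_id` and a `created_time` in epoch milliseconds, which is how the Jobs API list response is commonly shaped, but verify the field names against your workspace's API version. The function name and sample data are hypothetical, and the HTTP calls to list and delete jobs are deliberately left out.

```python
from datetime import datetime, timedelta, timezone


def select_stale_job_ids(jobs, now, max_age_days=90):
    """Return the job_id of every job created more than max_age_days before `now`.

    `jobs` is a list of dicts with `job_id` and `created_time` (epoch milliseconds);
    both field names are assumptions about the list response.
    """
    cutoff_ms = int((now - timedelta(days=max_age_days)).timestamp() * 1000)
    return [job["job_id"] for job in jobs if job["created_time"] < cutoff_ms]


def to_ms(dt):
    """Convert an aware datetime to epoch milliseconds."""
    return int(dt.timestamp() * 1000)


# Hypothetical sample data standing in for a real job list.
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
sample_jobs = [
    {"job_id": 1, "created_time": to_ms(datetime(2023, 1, 1, tzinfo=timezone.utc))},
    {"job_id": 2, "created_time": to_ms(datetime(2023, 12, 15, tzinfo=timezone.utc))},
]

stale = select_stale_job_ids(sample_jobs, now)
print(stale)
```

A real cleanup script would pass each returned ID to the job delete endpoint, ideally after a dry run that only logs what would be removed. Filtering on creation time matches the question as asked; filtering on last run time instead would need run history as well.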

User16826992666
by Valued Contributor
  • 1074 Views
  • 1 replies
  • 0 kudos

If I delete a table through the UI, does it also delete the underlying files?

I am using the UI in the workspace. I can use the Data tab to see my tables, then use the delete option in the UI. But I know there are underlying files that contain the table's data. Are these files also deleted?

Latest Reply
brickster_2018
Esteemed Contributor
  • 0 kudos

If the table is external, the files are not deleted. For a managed table, the underlying files are deleted. Essentially, a "DROP TABLE" command is submitted under the hood.
