Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

loinguyen3182
by New Contributor
  • 31 Views
  • 0 replies
  • 0 kudos

Spark Streaming Error Listing in GCS

I have run into an error listing the _delta_log directory when Spark reads a stream in Delta format from GCS. This is the full log of the issue: org.apache.spark.sql.streaming.StreamingQueryException: Failed to get result: java.io.IOException: Error ...

noorbasha534
by Contributor III
  • 167 Views
  • 4 replies
  • 0 kudos

DQ anomaly detection: _quality_monitoring_summary table DDL

Dears, does anyone have the DDL for the _quality_monitoring_summary table? This is created by the DQ anomaly detector. Since the detector was trying to create a managed table, which is not allowed in the environment I work in, I am attempting to create this on ...

Latest Reply
Yogesh_378691
New Contributor III
  • 0 kudos

Hi, the _quality_monitoring_summary table is an internal table created by the Data Quality Anomaly Detector in Databricks Lakehouse Monitoring. Unfortunately, the full DDL is not publicly documented in detail, and trying to create it manually can lead...

3 More Replies
Ramki
by New Contributor
  • 109 Views
  • 1 reply
  • 0 kudos

Lakeflow clarification

Are there options to modify the streaming table after it has been created by the Lakeflow pipeline? In the use case I'm trying to solve, I need to add delta.enableIcebergCompatV2 and delta.universalFormat.enabledFormats to the target streaming table....

Latest Reply
lingareddy_Alva
Honored Contributor II
  • 0 kudos

Hi @Ramki, yes, you can modify a streaming table created by a LakeFlow pipeline, especially when the pipeline is in triggered mode (not running continuously). In your case, you want to add the following Delta table properties: TBLPROPERTIES ( 'delta....
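
A minimal sketch of what that change might look like between triggered runs. The three-level table name is a placeholder, not from the thread, and spark is the notebook-provided session:

    # Hedged sketch: add the Iceberg/UniForm properties mentioned above to an
    # existing streaming table. The table name is a placeholder.
    spark.sql("""
        ALTER TABLE main.my_schema.my_streaming_table SET TBLPROPERTIES (
            'delta.enableIcebergCompatV2' = 'true',
            'delta.universalFormat.enabledFormats' = 'iceberg'
        )
    """)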

Sainath368
by New Contributor II
  • 178 Views
  • 1 reply
  • 0 kudos

Data Skipping - Partitioned tables

Hi all, I have a question: how can we modify delta.dataSkippingStatsColumns and compute statistics for a partitioned Delta table in Databricks? I want to understand the process and best practices for changing this setting and ensuring accurate statist...

Latest Reply
paolajara
Databricks Employee
  • 0 kudos

Hi, delta.dataSkippingStatsColumns specifies a comma-separated list of column names for which Delta Lake collects statistics. It improves performance by enabling data skipping on those columns, since it supersedes the default behavior of analyzing the first...
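
A hedged sketch of the usual two-step sequence, with placeholder table and column names (spark is the notebook-provided session):

    # Sketch only: choose the columns Delta collects data-skipping statistics for,
    # then recompute file-level statistics so existing files reflect the change.
    spark.sql("""
        ALTER TABLE main.my_schema.events SET TBLPROPERTIES (
            'delta.dataSkippingStatsColumns' = 'event_date,customer_id,amount'
        )
    """)
    spark.sql("ANALYZE TABLE main.my_schema.events COMPUTE DELTA STATISTICS")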

lmu
by New Contributor II
  • 737 Views
  • 11 replies
  • 3 kudos

Resolved! Write on External Table with Row Level Security fails

Hey, we are experiencing issues with writing to external tables when using Unity Catalog and Row Level Security. As soon as we stop using the serverless compute instance, we receive the following error for writing (Overwrite, append and upsert): E...

Latest Reply
lmu
New Contributor II
  • 3 kudos

After further testing, it was found that the dedicated access mode (formerly single user) either does not work or exhibits strange behaviour. In one scenario, the 16.4 cluster with dedicated access mode could write in append mode but not overwrite, a...

10 More Replies
hpant
by New Contributor III
  • 534 Views
  • 2 replies
  • 1 kudos

Is it possible to create an external volume using a Databricks asset bundle?

Is it possible to create an external volume using a Databricks asset bundle? I have this code from the databricks.yml file, which works perfectly fine for a managed volume:

    resources:
      volumes:
        bronze_checkpoints_volume:
          catalog_name: ...

Latest Reply
nayan_wylde
Contributor
  • 1 kudos

bundle:
  name: my_azure_volume_bundle
resources:
  volumes:
    my_external_volume:
      catalog_name: main
      schema_name: my_schema
      name: my_external_volume
      volume_type: EXTERNAL
      storage_location: abfss://<container-name>@<storage-account-name>.dfs.core.windows.net/<path>...

1 More Replies
Nasd_
by New Contributor
  • 290 Views
  • 2 replies
  • 0 kudos

Resolved! Accessing DeltaLog and OptimisticTransaction from PySpark

Hi community, I'm exploring ways to perform low-level, programmatic operations on Delta tables directly from a PySpark environment. The standard delta.tables.DeltaTable Python API is excellent for high-level DML, but it seems to abstract away the core ...

Latest Reply
Nasd_
New Contributor
  • 0 kudos

Hi Lou, thank you so much for your detailed and insightful response. It really helped clarify the intended architecture and the different APIs (DeltaLog vs. DeltaTable). I'm trying to programmatically access the low-level Delta Lake APIs by passing thr...
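
For anyone following along, a heavily hedged sketch of the py4j route being discussed. It targets the open-source org.apache.spark.sql.delta package; these are private, unsupported APIs, the internal class names differ on Databricks Runtime, and the table path is a placeholder, so treat this as an illustration only:

    # Sketch only, not a supported API: reach the Scala DeltaLog through the
    # py4j gateway of a PySpark session.
    jvm = spark._jvm
    delta_log = jvm.org.apache.spark.sql.delta.DeltaLog.forTable(
        spark._jsparkSession, "/tmp/my_delta_table"
    )
    print(delta_log.dataPath())  # sanity check that the handle resolved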

1 More Replies
aravind-ey
by New Contributor II
  • 3726 Views
  • 18 replies
  • 3 kudos

vocareum lab access

Hi, I am doing a data engineering course in Databricks (Partner labs) and would like to have access to the Vocareum workspace to practice using the demo sessions. Can you please help me get access to this workspace? Regards, Aravind

Latest Reply
VikramGanapathi
New Contributor II
  • 3 kudos

@EmmaPotthastBJS @Advika Thank you for the posts. Do you have an ETA for when the courses will be available? I mean these ones: AI/BI for Data Analysts and SQL Analytics on Databricks.

17 More Replies
KristiLogos
by Contributor
  • 733 Views
  • 4 replies
  • 0 kudos

Simba JDBC Exception When Querying Tables via BigQuery Databricks Connection

Hello, I have a federated connection to BigQuery that has GA events tables for each of our projects. I'm trying to query each daily table, which contains about 400,000 rows each day, and load it into another table, but I keep seeing this Simba JDBC exception. ...

Latest Reply
tsekityam_2
New Contributor
  • 0 kudos

I also have this issue, and I resolved it by casting all the record columns in BigQuery to string before I dump the data. I first create a view like: create view xxx as select string_1, string_2, string_3, to_json_string(record_1) as record_1, to_json_s...

3 More Replies
HoussemBL
by New Contributor III
  • 1326 Views
  • 10 replies
  • 1 kudos

DLT Pipeline & Automatic Liquid Clustering Syntax

Hi everyone, I noticed Databricks recently released the automatic liquid clustering feature, which looks very promising. I'm currently implementing a DLT pipeline and would like to leverage this new functionality. However, I'm having trouble figuring o...

Latest Reply
Alex006
Contributor
  • 1 kudos

Same issue here. I have activated PO on the specific schema where the materialized view resides per these instructions https://docs.databricks.com/aws/en/optimizations/predictive-optimization#check-whether-predictive-optimization-is-enabled- Doesn't ...

9 More Replies
Sainath368
by New Contributor II
  • 159 Views
  • 2 replies
  • 2 kudos

Is it OK to run ANALYZE TABLE COMPUTE DELTA STATISTICS while data is loading into a Delta table?

Hi all, I have a doubt regarding the best practices for running ANALYZE TABLE table_name COMPUTE DELTA STATISTICS on a Delta table. Is it recommended to execute this command while data is being loaded into the table, or should it be run afterward? Ad...

Latest Reply
nikhilj0421
Databricks Employee
  • 2 kudos

ANALYZE TABLE is a read-only operation. It reads the data to compute statistics but does not modify the data. Running ANALYZE TABLE COMPUTE DELTA STATISTICS while data is still being loaded into a Delta table is generally not recommended. The ANALYZE...
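
A short sketch of the ordering the reply recommends. The table name and the stand-in data are placeholders, and spark is the notebook-provided session:

    # Sketch only: finish the load first, then compute Delta statistics.
    df = spark.range(1_000).withColumnRenamed("id", "order_id")  # stand-in data
    df.write.format("delta").mode("append").saveAsTable("main.my_schema.sales")
    spark.sql("ANALYZE TABLE main.my_schema.sales COMPUTE DELTA STATISTICS")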

1 More Replies
JameDavi_51481
by Contributor
  • 1884 Views
  • 3 replies
  • 2 kudos

Resolved! updates on Bring Your Own Lineage (BYOL)?

One of the most exciting things in recent roadmap discussions was the idea of BYOL, so we could import external lineage into Unity Catalog and make it really useful for understanding where our data flows. We're planning some investments for the next ...

Latest Reply
Louis_Hausle
New Contributor II
  • 2 kudos

Hello all. Any updates on BYOL and any documentation available?

2 More Replies
Ranga_naik1180
by New Contributor II
  • 7563 Views
  • 7 replies
  • 5 kudos

Resolved! Delta Live table

Hi all, I'm working on a Databricks Delta Live Tables (DLT) pipeline where we receive daily full snapshot CSV files in Azure cloud storage. These files contain HR data (e.g. an employee file) and I'm using Auto Loader to ingest them into a bronze layer DLT tab...

Latest Reply
nikhilj0421
Databricks Employee
  • 5 kudos

Hi @Ranga_naik1180, there is no need to create an intermediate view in SQL. You can directly read the change data feed from silver into the gold table. You can use code something like the following: CREATE STREAMING LIVE TABLE gold_table AS SELECT * FRO...
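
A hedged Python-pipeline equivalent of the SQL sketched above; the reply's own SQL statement is truncated in this preview, and the table names here are placeholders:

    # Sketch only: a DLT gold table that streams directly from the silver table,
    # as the reply suggests.
    import dlt

    @dlt.table(name="gold_table")
    def gold_table():
        return dlt.read_stream("silver_table")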

6 More Replies
oneill
by New Contributor II
  • 976 Views
  • 3 replies
  • 0 kudos

SQL - Dynamic overwrite + overwrite schema

Hello, let's say we have an empty table S that represents the schema we want to keep:

A | B | C | D | E

We have another table T, partitioned by column A, with a schema that depends on the file we have loaded into it. Say:

A | B  | C  | F
1 | b1 | c1 | f1
2 | b2 | c2 | f2

Now to make T have the same schema...
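
For context, a minimal sketch of the combination being discussed, written against placeholder names (per the reply below, the documentation says dynamic partition overwrite and schema overwrite do not combine):

    # Sketch only: dynamic partition overwrite plus schema overwrite on one write.
    # spark is the notebook-provided session; table and column names are placeholders.
    new_data = spark.createDataFrame(
        [(1, "b1", "c1", "d1", "e1"), (2, "b2", "c2", "d2", "e2")],
        ["A", "B", "C", "D", "E"],
    )
    (new_data.write.format("delta")
        .mode("overwrite")
        .option("partitionOverwriteMode", "dynamic")  # dynamic partition overwrite
        .option("overwriteSchema", "true")            # schema overwrite
        .saveAsTable("T"))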

Latest Reply
oneill
New Contributor II
  • 0 kudos

Hi, thanks for the reply. I've already looked at the documentation on this point, which actually states that dynamic overwrite doesn't work with schema overwrite, while the instructions described above seem to indicate the opposite.

2 More Replies