Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

minhhung0507
by Valued Contributor
  • 3865 Views
  • 3 replies
  • 0 kudos

DeltaFileNotFoundException: [DELTA_TRUNCATED_TRANSACTION_LOG] Error in Streaming Table

I am encountering a recurring issue while working with Delta streaming tables in my system. The error message is as follows: com.databricks.sql.transaction.tahoe.DeltaFileNotFoundException: [DELTA_TRUNCATED_TRANSACTION_LOG] gs://cimb-prod-lakehouse/b...

(two screenshots of the error attached)
Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

The issue you're encountering with the error DeltaFileNotFoundException: [DELTA_TRUNCATED_TRANSACTION_LOG] is related to Delta Lake's retention policy for logs and checkpoints, which manages the lifecycle of transaction log files and checkpoint files...
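The retention window the reply refers to can be lengthened so a lagging stream does not outlive the transaction log. A minimal sketch that builds the ALTER TABLE statement; the table name is a placeholder, while the two property names are standard Delta table properties:

```python
# Hypothetical table name; adjust to your environment.
table = "main.bronze.events"

# Delta retention properties that control how long transaction-log entries
# and tombstoned data files are kept. Longer intervals let a stream that
# fell behind resume without hitting DELTA_TRUNCATED_TRANSACTION_LOG.
properties = {
    "delta.logRetentionDuration": "interval 30 days",
    "delta.deletedFileRetentionDuration": "interval 30 days",
}

set_clause = ", ".join(f"'{k}' = '{v}'" for k, v in properties.items())
stmt = f"ALTER TABLE {table} SET TBLPROPERTIES ({set_clause})"
print(stmt)
# In a Databricks notebook you would then run: spark.sql(stmt)
```

Note the trade-off: longer retention means more log and data files kept around, so storage cost grows with the window.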

2 More Replies
seanstachff
by New Contributor II
  • 3757 Views
  • 2 replies
  • 0 kudos

Databricks SQL error outputting sensitive data to logs

Hi - I am using `from_json` with FAILFAST to correctly format some data using databricks SQL. However, this function can return the error "[MALFORMED_RECORD_IN_PARSING.WITHOUT_SUGGESTION] Malformed records are detected in record parsing" with the res...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

You could use the mode option, which controls how corrupt records are dealt with during parsing (the default is PERMISSIVE). PERMISSIVE: when it meets a corrupted record, it puts the malformed string into a field configured by columnNameOfCorruptRecord, and sets malformed fie...
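The PERMISSIVE semantics can be sketched in pure Python (in Spark the option is passed to `from_json` itself, e.g. `from_json(col, schema, map('mode', 'PERMISSIVE'))` in SQL); the field and column names below are illustrative:

```python
import json

# Pure-Python sketch of PERMISSIVE parsing semantics: a malformed record
# is never raised as an error; instead the data fields are nulled and the
# raw string lands in a "corrupt record" column.
def parse_permissive(raw, fields, corrupt_col="_corrupt_record"):
    try:
        rec = json.loads(raw)
        row = {f: rec.get(f) for f in fields}
        row[corrupt_col] = None
        return row
    except (json.JSONDecodeError, AttributeError):
        # Malformed input: keep the raw string, null the data fields.
        row = {f: None for f in fields}
        row[corrupt_col] = raw
        return row

print(parse_permissive('{"id": 1}', ["id"]))
print(parse_permissive('not json', ["id"]))
```

This also addresses the original concern: with PERMISSIVE no error message containing the record is thrown, so sensitive payloads stay in the corrupt-record column instead of the logs.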

1 More Replies
dbuenosilva
by New Contributor
  • 4122 Views
  • 2 replies
  • 0 kudos

Auto loader from tables in Delta Share

Hello, I am trying to read a Delta table in Delta Shares shared from other environments. The pipeline runs okay; however, as the Delta table is updated in the source (Delta Share in GCP), the code below gets an error unless I reset the checkpoint. I wond...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

The error you are encountering, DeltaUnsupportedOperationException: [DELTA_SOURCE_TABLE_IGNORE_CHANGES], occurs because your streaming job detects updates in the source Delta table, which is not supported for the type of source you have. Streaming tab...
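One common way to tolerate such updates is the skipChangeCommits reader option on the Delta streaming source (available in recent runtimes), which skips commits that rewrite existing files rather than failing. A sketch, with a placeholder table name:

```python
# Hypothetical shared table; adjust to your share/catalog/schema.
source_table = "share_catalog.schema.events"

# Reader options for a Delta streaming source that must tolerate updates
# in the source table. With skipChangeCommits, commits that modify
# existing rows are skipped entirely: updated rows are NOT re-emitted
# downstream, so this fits append-mostly sources.
read_options = {"skipChangeCommits": "true"}

# In a notebook:
# df = (spark.readStream
#            .options(**read_options)
#            .table(source_table))
print(read_options)
```

If you need the updated rows propagated rather than skipped, a full reprocess (new checkpoint) or a CDC-style feed is the usual alternative.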

1 More Replies
pvaz
by New Contributor II
  • 4143 Views
  • 2 replies
  • 1 kudos

Performance issue when using structured streaming

Hi Databricks community! Let me first apologize for the long post. I'm implementing a system in Databricks to read from a Kafka stream into the bronze layer of a Delta table. The idea is to do some operations on the data that is coming from Kafka, mainl...

Latest Reply
NandiniN
Databricks Employee
  • 1 kudos

Have you tried using minPartitions, the minimum number of partitions to read from Kafka? You can configure Spark to use an arbitrary minimum number of partitions to read from Kafka using the minPartitions option. Normally Spark has a 1:1 mapping of Kafka topicPa...
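The option is set on the Kafka source reader. A sketch of the options map, where the broker address, topic, and partition count are placeholders to adapt:

```python
# Kafka source options; bootstrap servers and topic are placeholders.
kafka_options = {
    "kafka.bootstrap.servers": "broker-1:9092",
    "subscribe": "events",
    # Ask Spark for at least 64 input partitions even if the topic has
    # fewer topicPartitions, breaking the default 1:1 mapping and giving
    # more parallelism to downstream transformations.
    "minPartitions": "64",
}

# In a notebook:
# df = (spark.readStream
#            .format("kafka")
#            .options(**kafka_options)
#            .load())
print(kafka_options["minPartitions"])
```

More input partitions help when per-record work dominates; they do not help if the bottleneck is the Delta write itself.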

1 More Replies
Leszek
by Contributor
  • 9732 Views
  • 6 replies
  • 5 kudos

Resolved! Unity Catalog - Azure account console - how to access?

I'm trying to access account console in Azure but I only can see the list of workspaces and access them. I didn't find documentation about account console for Azure. Do you know how to access account console?

Latest Reply
vimalii
New Contributor II
  • 5 kudos

Hello @Leszek​. Please tell me, did it work for you? Did you find the root cause? I still don't understand why I should grant myself some extra permissions if I am already a global administrator, owner of the subscription, and owner of the Databricks workspace, but...

5 More Replies
drii_cavalcanti
by New Contributor III
  • 1011 Views
  • 3 replies
  • 0 kudos

Databricks App with DAB

Hi All, I am trying to deploy a DBX APP via DAB; however, source_code_path does not seem to be parsed correctly into the app configuration.
- dbx_dash/
  - resources/
    - app.yml
  - src/
    - app.yaml
    - app.py
  - databricks.yml
resources/app.yml: resources:apps: m...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

Hi Adriana, have you adjusted the root_path in your databricks.yml? Kindly add /Workspace and the entire path to the root_path. Thanks
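A sketch of what that looks like in databricks.yml; the bundle name and user path are placeholders, and ${bundle.target} is the standard bundle substitution variable:

```yaml
# databricks.yml (sketch; names and paths are placeholders)
bundle:
  name: dbx_dash

workspace:
  # Use the full absolute path, starting with /Workspace, so app
  # source_code_path references resolve correctly after deployment.
  root_path: /Workspace/Users/someone@example.com/.bundle/dbx_dash/${bundle.target}
```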

2 More Replies
p_romm
by New Contributor III
  • 3555 Views
  • 1 replies
  • 0 kudos

INVALID_HANDLE.SESSION_NOT_FOUND

We run several workflows and tasks in parallel using serverless compute. In many different places in the code we started to get errors as below. It looks like when one task fails, every other task running at the same moment fails as well. After retry on on...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

Hi, the error INVALID_HANDLE.SESSION_NOT_FOUND (https://docs.databricks.com/aws/en/error-messages/invalid-handle-error-class#session_not_found) is a handled error, but the gRPC errors are an area where improvements are being pushed in eve...

Sega2
by New Contributor III
  • 3388 Views
  • 1 replies
  • 0 kudos

spark.sql makes debugger freeze

I have just created a simple bundle with Databricks and am using Databricks Connect to debug locally. This is my script:
from pyspark.sql import SparkSession, DataFrame
def get_taxis(spark: SparkSession) -> DataFrame:
    return spark.read.table("samp...

(two screenshots attached)
Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

Ensure that your Databricks Connect is properly set up and is using the correct version compatible with your cluster’s runtime. For VS Code, any mismatches between the installed databricks-connect Python package version and the cluster runtime could ...
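The version check the reply describes can be sketched as a small helper: Databricks Connect's major.minor should match the cluster's Databricks Runtime version. The version strings below are placeholders:

```python
# Sketch: Databricks Connect is compatible with a cluster when the
# major.minor of the databricks-connect package matches the cluster's
# Databricks Runtime version (e.g. 14.3.x <-> DBR 14.3).
def versions_compatible(connect_version: str, runtime_version: str) -> bool:
    return connect_version.split(".")[:2] == runtime_version.split(".")[:2]

print(versions_compatible("14.3.1", "14.3"))   # matching major.minor
print(versions_compatible("13.3.2", "14.3"))   # mismatched: reinstall
```

Locally you can see the installed package version with `pip show databricks-connect` and compare it against the cluster's runtime in the cluster configuration page.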

Vasu_Kumar_T
by New Contributor II
  • 3283 Views
  • 1 replies
  • 0 kudos

ODI 12c to Databricks equivalent

Hello All, we are planning to convert from ODI 12c to Databricks equivalents. What are the steps involved, and what are the limitations in this case? Thanks, Vasu Kumar T

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

Hi, These blogs can help you get an idea on the migration planning and insights. https://www.databricks.com/blog/how-migrate-your-oracle-plsql-code-databricks-lakehouse-platform https://www.databricks.com/blog/databricks-migration-strategy-lessons-le...

alonisser
by Contributor II
  • 1066 Views
  • 2 replies
  • 1 kudos

Very long vacuum on s3

Since we've moved from Azure to AWS, a specific job has extremely long vacuum runs. Is there a specific flag/configuration for the S3 storage that is needed to support faster vacuums? How can I research what's going on? Note: it's not ALL jobs, but a sp...

Latest Reply
NandiniN
Databricks Employee
  • 1 kudos

For faster VACUUM performance:
(1) avoid over-partitioned directories;
(2) avoid concurrent runs while the VACUUM command is executing;
(3) avoid enabling S3 versioning (Delta Lake itself maintains the history);
(4) run the OPTIMIZE command periodically;
(5) en...
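Points (4) and the vacuum itself can be scheduled as a simple maintenance pair: compact small files first, then remove tombstoned files past the retention window. A sketch that composes the statements; the table name is a placeholder:

```python
# Hypothetical table; adjust to your environment.
table = "main.silver.txns"

# Periodic maintenance that keeps VACUUM fast: OPTIMIZE compacts small
# files (fewer objects for VACUUM to list on S3), then VACUUM removes
# files older than the retention window.
maintenance = [
    f"OPTIMIZE {table}",
    f"VACUUM {table} RETAIN 168 HOURS",  # 7 days, the default minimum
]
for stmt in maintenance:
    print(stmt)
# In a notebook: for stmt in maintenance: spark.sql(stmt)
```

Because VACUUM spends most of its time listing and deleting objects, reducing the file count via OPTIMIZE is usually the biggest single win on S3.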

1 More Replies
biafch
by Contributor
  • 1396 Views
  • 1 replies
  • 0 kudos

spark.sql with CTEs (10 minutes) VS pyspark code + spark.sql (without CTE) (3 seconds), why?

Hello, I have two pieces of code with the exact same outcome; one takes 7-10 minutes to load and the other takes exactly 3 seconds, and I'm just trying to understand why. This takes 7-10 minutes: F_IntakeStepsPerDay = spark.sql(""" WITH BASE AS ( SELECT ...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

The two snippets are not an apples-to-apples comparison, and debugging with the Spark UI and the query plan can give a better understanding. But reviewing the code, I can see that in the PySpark implementation you explicitly repartition the DataFrame (repartition("JobAppl...

pradeepvatsvk
by New Contributor III
  • 614 Views
  • 1 replies
  • 0 kudos

Connecting Databricks to react application

Hi team, I want to connect my Unity Catalog tables to a React application. We also need to write some data back to the tables from the React UI; for example, we have some records which will be checked by the business people, and they will approve ...

Latest Reply
NandiniN
Databricks Employee
  • 0 kudos

A React application cannot directly interface with Unity Catalog for data operations. You can use Databricks APIs or JDBC connections to interact with your Unity Catalog tables. You can also check the REST APIs and use them in your application: https:/...
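One hedged sketch of the REST route: a small backend (which the React app calls, keeping the token off the browser) can POST to the SQL Statement Execution API at /api/2.0/sql/statements. The warehouse ID, table, and predicate below are placeholders:

```python
import json

# Payload a backend would POST to https://<workspace-host>/api/2.0/sql/
# statements with an Authorization: Bearer <token> header. The React UI
# talks to the backend, never to Databricks directly.
payload = {
    "warehouse_id": "abc123",  # placeholder SQL warehouse ID
    "statement": "SELECT * FROM main.reviews.pending WHERE approved IS NULL",
    "wait_timeout": "30s",
}
body = json.dumps(payload)
print(body)
# e.g. with requests:
# requests.post(f"https://{host}/api/2.0/sql/statements",
#               headers={"Authorization": f"Bearer {token}"},
#               data=body)
```

Write-backs (the approval updates) follow the same shape with an UPDATE or MERGE statement in the payload, ideally parameterized rather than string-built.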

minhhung0507
by Valued Contributor
  • 1350 Views
  • 1 replies
  • 0 kudos

Handling Streaming Query Hangs & Delta Upsert Failures in Multi-Table Jobs

Hi Databricks Experts,I'm encountering issues with my streaming jobs in Databricks and need some advice. I’ve implemented a custom streaming query listener to capture job status events and upsert them into a Delta table. However, the solution behaves...

Latest Reply
mmayorga
Databricks Employee
  • 0 kudos

Hello Hung, working with streaming tables is always a challenge. Let's remember we are working with unbounded data, so it's important to consider a few points: if you are working with Jobs, you can define a job cluster for each task. Consider the co...

mehalrathod
by New Contributor II
  • 1185 Views
  • 2 replies
  • 0 kudos

Overwrite to a table taking 12+ hours

One of our Databricks notebooks (using Python/PySpark) has been running for 12+ hours, specifically on the overwrite command into a table. This notebook, including the overwrite step, completed within 10 minutes in the past. But suddenly the ov...

Latest Reply
lingareddy_Alva
Honored Contributor III
  • 0 kudos

Hi @mehalrathod, this sort of performance regression in Databricks (especially for overwrite) is usually caused by one or more of the following.
Common causes of overwrite slowness:
1. Delta table history or file explosion: if the target table is a Delta...
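When the table has grown and only a slice actually changes, one mitigation is a partition-scoped overwrite via replaceWhere, so only matching data is rewritten instead of the full table. A sketch; the table, column, and predicate are placeholders:

```python
# Sketch of a partition-scoped overwrite; names are placeholders.
# replaceWhere restricts the overwrite to rows matching the predicate,
# so the rest of the (possibly huge) table is left untouched.
writer_options = {"replaceWhere": "event_date >= '2025-01-01'"}

# In a notebook:
# (df.write.format("delta")
#    .mode("overwrite")
#    .options(**writer_options)
#    .saveAsTable("main.silver.txns"))
print(writer_options["replaceWhere"])
```

If the full table really must be rewritten, checking file counts with DESCRIBE DETAIL and running OPTIMIZE beforehand helps distinguish file explosion from a plan regression.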

1 More Replies
