Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

FabriceDeseyn
by Contributor
  • 715 Views
  • 1 replies
  • 0 kudos

Bug - data profile internal code

Hi, I am not sure how to post a potential bug, but I stumbled upon the following issue on DBR 13.2. The same code 'sometimes' works on DBR 12.2 LTS. But if I do it on a real table, this issue always occurs. 

FabriceDeseyn_0-1690530658137.png
Latest Reply
mathan_pillai
Valued Contributor
  • 0 kudos

Tried reproducing the issue on DBR 13.2, but was unable to; find the screenshot attached. How intermittently is the issue occurring?  

Remit
by New Contributor III
  • 2744 Views
  • 1 replies
  • 0 kudos

Resolved! Merge error in streaming case

I have a streaming case where I stream from 2 sources: source1 and source2. I write two separate streams to pick the data up from the landing area (step 1). Then I write 2 extra streams to apply some transformations in order to give them the same schem...

Data Engineering
MERGE
streaming
Latest Reply
Remit
New Contributor III
  • 0 kudos

Solved the problem by changing the cluster settings. The whole thing works when disabling Photon Acceleration...

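Disabling Photon worked around the failure above; for reference, the usual shape of a streaming upsert is a MERGE run per micro-batch via foreachBatch. A minimal sketch, assuming hypothetical table and column names (`silver.events`, `id`):

```python
# Sketch of a streaming MERGE upsert via foreachBatch.
# Table and column names below are illustrative, not from the post.
TARGET = "silver.events"

def merge_sql(source_view: str) -> str:
    # MERGE statement executed once per micro-batch against the target Delta table
    return f"""
        MERGE INTO {TARGET} AS t
        USING {source_view} AS s
        ON t.id = s.id
        WHEN MATCHED THEN UPDATE SET *
        WHEN NOT MATCHED THEN INSERT *
    """

def upsert_batch(batch_df, batch_id):
    # Register the micro-batch as a view and run the MERGE through its session
    batch_df.createOrReplaceTempView("updates")
    batch_df.sparkSession.sql(merge_sql("updates"))

# Wiring (commented out; requires a streaming DataFrame `df`):
# df.writeStream.foreachBatch(upsert_batch).outputMode("update").start()
```

Running the MERGE in foreachBatch keeps the statement itself ordinary batch SQL, which also makes it easy to test cluster-setting workarounds like the Photon toggle in isolation.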
TimW
by New Contributor
  • 3065 Views
  • 2 replies
  • 1 kudos

Resolved! Help - Can't create table from tutorial. Is my setup wrong?

Trying out Databricks for the first time and followed the Get Started steps. I managed to successfully create a cluster and ran the simple SQL tutorial to query data from a notebook. However, I got the following error. Query: DROP TABLE IF EXISTS diamond...

Latest Reply
Scott_in_Zurich
New Contributor III
  • 1 kudos

Adding 'dbfs:' got me past that error. Now onto debugging a PARSE SYNTAX error....

1 More Replies
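The fix in the reply is prefixing the storage path with the `dbfs:` scheme. A sketch of what the tutorial statements look like with that prefix; the table name and CSV path follow the standard diamonds tutorial and are illustrative here:

```python
# Explicit dbfs: scheme on the data path (the fix described in the reply above)
path = "dbfs:/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv"

drop_stmt = "DROP TABLE IF EXISTS diamonds"
create_stmt = f"""
CREATE TABLE diamonds
USING CSV
OPTIONS (path "{path}", header "true")
"""

# In a notebook these would be executed as:
# spark.sql(drop_stmt)
# spark.sql(create_stmt)
```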
nyck33
by New Contributor II
  • 3447 Views
  • 0 replies
  • 0 kudos

snowflake python connector import error

```
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
File <command-1961894174266859>:1
----> 1 con = snowflake.connector.connect(
      2     user=USER,
      3     password=SNOWSQL_PWD,
      4     account=A...
```

mwoods
by New Contributor III
  • 1862 Views
  • 2 replies
  • 2 kudos

Delta Live Tables error with Kafka SSL

We have a Spark streaming job that consumes data from a Kafka topic and writes out to Delta tables in Unity Catalog. Looking to refactor it to use Delta Live Tables, but it appears that it is not possible at present to have a DLT pipeline that can acc...

Latest Reply
gabriall
New Contributor II
  • 2 kudos

Indeed, it's already patched. You just have to configure your pipeline on the "preview" channel.

1 More Replies
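For context, the Kafka-over-SSL source the post describes is typically configured through reader options. A sketch of such an options map, assuming placeholder broker hosts and keystore paths (none of these values come from the post):

```python
# Hypothetical Kafka SSL source options for a Structured Streaming reader.
# Broker address, topic, and keystore locations are placeholders.
kafka_options = {
    "kafka.bootstrap.servers": "broker1:9093",
    "kafka.security.protocol": "SSL",
    "kafka.ssl.truststore.location": "/dbfs/certs/truststore.jks",
    "kafka.ssl.keystore.location": "/dbfs/certs/keystore.jks",
    "subscribe": "events",
}

# In a pipeline this would be applied as:
# spark.readStream.format("kafka").options(**kafka_options).load()
```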
Juju
by New Contributor II
  • 10918 Views
  • 1 replies
  • 1 kudos

DeltaFileNotFoundException: No file found in the directory (sudden task failure)

Hi all, I am currently running a job that will upsert a table by reading from the Delta change data feed of my silver table. Here is the relevant snippet of code:  rds_changes = spark.read.format("delta") \ .option("readChangeFeed", "true") \ .optio...

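The snippet in the post reads the Delta change data feed in batch mode. A minimal sketch of those reader options, with a placeholder starting version and table name (only `readChangeFeed` is taken from the post):

```python
# Change-data-feed reader options; startingVersion and the table name are placeholders
cdf_options = {
    "readChangeFeed": "true",
    "startingVersion": "0",
}

# Batch read of the change feed, as in the post's snippet:
# rds_changes = spark.read.format("delta").options(**cdf_options).table("silver.rds")
```

A DeltaFileNotFoundException in this pattern often indicates the requested starting version's files have been removed (e.g. by VACUUM), so the chosen `startingVersion` matters.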
Noosphera
by New Contributor III
  • 8889 Views
  • 0 replies
  • 0 kudos

Resolved! How to reinstantiate the Cloudformation template for AWS

Hi Everyone! I am new to Databricks and had chosen to use the CloudFormation template to create my AWS Workspace. I regretfully must admit I felt creative in the process and varied the suggested stack name, and that must have created errors which ended...

Data Engineering
AWS
Cloudformation template
Unity Catalog
Erik
by Valued Contributor II
  • 1832 Views
  • 0 replies
  • 0 kudos

Why not enable "decommissioning" in spark?

You can enable "decommissioning" in Spark, which causes it to remove work from a worker when it gets a notification from the cloud that the instance is going away (e.g. spot instances). This is disabled by default, but it seems like such a no-brainer to...

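The feature the post refers to is controlled by a family of Spark configs. A sketch of the relevant keys (these are real Spark configuration names; the values and the suggestion to enable all of them are illustrative):

```python
# Spark decommissioning configs: migrate cached RDD and shuffle blocks off a
# worker when the cloud signals the instance is going away (e.g. spot reclaim).
decommission_conf = {
    "spark.decommission.enabled": "true",
    "spark.storage.decommission.enabled": "true",
    "spark.storage.decommission.rddBlocks.enabled": "true",
    "spark.storage.decommission.shuffleBlocks.enabled": "true",
}

# On Databricks these can be entered as cluster Spark confs, or applied via:
# spark.conf / SparkSession.builder.config(...) at session build time.
```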
jimbo
by New Contributor II
  • 7359 Views
  • 0 replies
  • 0 kudos

Pyspark datatype missing microsecond precision last three SSS: h:mm:ss:SSSSSS - datetype

Hi all, We are having issues with the datetype data type in Spark when ingesting files. Effectively the source data has 6 microseconds' worth of precision, but the most we can extract from the data type is three. For example 12:03:23.123, but what is requ...

Data Engineering
pyspark datetype precision missing
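The three-digit ceiling usually comes from the format pattern rather than the type itself: a `SSS` pattern keeps only milliseconds, while six `S`s keep the full microsecond field. A sketch with plain-Python parsing to show the precision, and the assumed Spark-side pattern in a comment:

```python
from datetime import datetime

# %f parses the full 6-digit fractional-seconds field
ts = datetime.strptime("12:03:23.123456", "%H:%M:%S.%f")
print(ts.microsecond)  # 123456

# Hypothetical Spark-side equivalent -- the pattern string is the key point:
# df.withColumn("ts", to_timestamp(col("raw"), "H:mm:ss.SSSSSS"))
# With "SSS" instead, the last three digits would be dropped on parse.
```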
Sangram
by New Contributor III
  • 1793 Views
  • 0 replies
  • 0 kudos

Unable to mount ADLS gen2 to databricks file system

I am unable to mount the ADLS Gen2 storage path into the Databricks file system. It is throwing an error: unsupported azure scheme: abfss. May I know the reason? Below are the steps that I followed:
1. Create a service principal.
2. Store the service principal's s...

Sangram_0-1700274947304.png
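For reference, mounting an `abfss://` path normally goes through the Hadoop-ABFS OAuth config keys below. A sketch assuming placeholder IDs, secret, container, and account names (everything in angle brackets is a stand-in, not from the post):

```python
# Standard OAuth config keys for mounting ADLS Gen2 over abfss.
# All <...> values are placeholders to be filled from your service principal.
configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type":
        "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": "<application-id>",
    "fs.azure.account.oauth2.client.secret": "<service-credential>",
    "fs.azure.account.oauth2.client.endpoint":
        "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
}
source = "abfss://<container>@<account>.dfs.core.windows.net/"

# In a notebook the mount itself would be:
# dbutils.fs.mount(source=source, mount_point="/mnt/data", extra_configs=configs)
```

An "unsupported azure scheme: abfss" error is often a sign the mount was attempted without these OAuth extra_configs (or with a wasbs-style key scheme instead).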
Rdipak
by New Contributor II
  • 1062 Views
  • 2 replies
  • 0 kudos

Delta live table blocks pipeline autoloader rate limit

I have created an ETL pipeline with DLT. My first step is to ingest into a raw Delta table using Auto Loader file notifications. When I have 20k notifications, the pipeline runs well across all stages. But when we have a surge in the number of messages, the pipeline waits...

Latest Reply
kulkpd
Contributor
  • 0 kudos

Did you try the following options: .option('cloudFiles.maxFilesPerTrigger', 10000) or maxBytesPerTrigger?

1 More Replies
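The reply's rate-limit suggestion as a full options map; `cloudFiles.maxFilesPerTrigger` and `cloudFiles.maxBytesPerTrigger` are real Auto Loader option names, while the values and the landing path are illustrative:

```python
# Auto Loader options with per-micro-batch rate limits (values illustrative)
autoloader_options = {
    "cloudFiles.format": "json",
    "cloudFiles.useNotifications": "true",
    "cloudFiles.maxFilesPerTrigger": "10000",  # cap files consumed per micro-batch
    "cloudFiles.maxBytesPerTrigger": "10g",    # or cap by total bytes instead
}

# Applied in a stream as:
# spark.readStream.format("cloudFiles").options(**autoloader_options).load("s3://<bucket>/landing")
```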
kulkpd
by Contributor
  • 1797 Views
  • 2 replies
  • 2 kudos

Resolved! Autoloader with filenotification

I am using DLT with file notification, and the DLT job is just fetching 1 notification from the SQS queue at a time. My pipeline is expected to process 500K notifications per day but it is running hours behind. Any recommendations? spark.readStream.format("cloudFi...

Latest Reply
Rdipak
New Contributor II
  • 2 kudos

Can you set this value to a higher number and try cloudFiles.fetchParallelism? It's 1 by default.

1 More Replies
AndrewSilver
by New Contributor II
  • 863 Views
  • 1 replies
  • 1 kudos

Uncertainty on Databricks job variables: {{run_id}}, {{parent_run_id}}.

In Azure Databricks jobs, {{run_id}} and {{parent_run_id}} serve as variables. In jobs with multiple tasks, {{run_id}} aligns with task_run_id, while {{parent_run_id}} matches job_run_id. In single-task jobs, {{parent_run_id}} aligns with task_run_...

Latest Reply
kulkpd
Contributor
  • 1 kudos

I am using a job with a single task and multiple retries. Upon job retry the run_id gets changed; I tried using {{parent_run_id}} but it never worked, so I switched to: val parentRunId = dbutils.notebook.getContext.tags("jobRunOriginalAttempt")

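The fallback in the reply can be sketched as a small function; the `jobRunOriginalAttempt` tag name is quoted from the reply, and the dict below stands in for the notebook context's tags (in Python, `dbutils.notebook.entry_point.getDbutils().notebook().getContext()` exposes them):

```python
# Sketch: prefer the original attempt's run id so all retries share one id.
# The tag names mirror the reply; the dict is a stand-in for the context tags.
def stable_run_id(tags: dict) -> str:
    return tags.get("jobRunOriginalAttempt") or tags.get("runId", "")

print(stable_run_id({"jobRunOriginalAttempt": "123", "runId": "456"}))  # 123
print(stable_run_id({"runId": "456"}))                                  # 456
```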
Shawn_Eary
by Contributor
  • 1226 Views
  • 0 replies
  • 0 kudos

Streaming Delta Live Tables Cluster Management

If I use code like this (-- 8:56 -- https://youtu.be/PIFL7W3DmaY?si=MWDSiC_bftoCh4sH&t=536):

CREATE STREAMING LIVE TABLE report
AS SELECT * FROM cloud_files("/mydata", "json")

to create a STREAMING Delta Live Table through the Workflows section of...

