Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Data + AI Summit 2024 - Data Engineering & Streaming

Forum Posts

Sudic29
by Visitor
  • 6 Views
  • 0 replies
  • 0 kudos

Bookmark in pdf

I am creating a PDF using PySpark and trying to add a bookmark for each table on the pages. All the bookmarks end up pointing to the first table on the first page. Please help me out here.

ameet9257
by New Contributor II
  • 11 Views
  • 1 replies
  • 0 kudos

Cloning of Workflow from One env to different env using Job API

Hi Team, One of my team members recently shared a requirement: he wants to migrate 10 Workflows from the sandbox to the dev environment to run his model in the dev environment. I wanted to move all these workflows in an automated way and one of the solutions...

Latest Reply
Stefan-Koch
Contributor II
  • 0 kudos

Hi Amet, Databricks Asset Bundles are designed precisely for the requirements you have: https://docs.databricks.com/en/dev-tools/bundles/index.html You can also transfer existing jobs that were created manually into a bundle. This works like this, for ex...

  • 0 kudos
jorperort
by New Contributor III
  • 4630 Views
  • 6 replies
  • 1 kudos

Resolved! [Databricks Assets Bundles] no deployment state

Good morning, I'm trying to run: databricks bundle run --debug -t dev integration_tests_job My bundle looks like this: bundle: name: x include: - ./resources/*.yml targets: dev: mode: development default: true workspace: host: x r...

Data Engineering
Databricks Assets Bundles
Deployment Error
pid=265687
Latest Reply
jtberman
New Contributor
  • 1 kudos

Hello, Reopening this ticket in hopes that either of you had some luck in resolving your bug.  I am currently facing the same issue where I can deploy an asset bundle via the local CLI without issue (by deploy I mean the bundle code is written to my ...

  • 1 kudos
5 More Replies
Andrewcon
by New Contributor II
  • 1841 Views
  • 1 replies
  • 1 kudos

Delta tables and YOLO computer vision tasks

Hi all, I would really appreciate it if someone could help me out. I feel it's both a data engineering and ML question. One thing we use at work is YOLO for object detection. I've managed to run YOLO by loading data from the blob storage, but I've seen tha...

Data Engineering
computer vision
Delta table
YOLO
Latest Reply
jnap
Visitor
  • 1 kudos

I am also looking for an answer to this question. Did you manage to find a solution @Andrewcon ?

  • 1 kudos
NehaR
by New Contributor III
  • 13 Views
  • 2 replies
  • 1 kudos

Is there any option in databricks to estimate cost of a query before execution

Hi Team, I want to check whether there is any option in Databricks that can help estimate the cost of a query before execution. I mean, calculate DBUs before actual query execution based on the logical plan? Regards

Latest Reply
NehaR
New Contributor III
  • 1 kudos

Is there any way to track the progress or ETA? Do we have access to the ideas portal? Where can we search for this reference number, DB-I-5730?

  • 1 kudos
1 More Replies
184754
by New Contributor
  • 25 Views
  • 1 replies
  • 0 kudos

Table Trigger - Too many logfiles

Hi, we have implemented a job that runs on a table update trigger. The job worked perfectly until the source table accumulated too many log files, and now the job isn't running anymore. We only get the error message below: Storage location /abcd/_d...

Latest Reply
radothede
Contributor II
  • 0 kudos

Hi @184754, interesting topic. As the docs say: "Log files are deleted automatically and asynchronously after checkpoint operations and are not governed by VACUUM. While the default retention period of log files is 30 days, running VACUUM on a table r...

  • 0 kudos
DataGeek_JT
by New Contributor II
  • 1032 Views
  • 2 replies
  • 0 kudos

Is it possible to use Liquid Clustering on Delta Live Tables / Materialised Views?

Is it possible to use Liquid Clustering on Delta Live Tables? If it is available what is the Python syntax for adding liquid clustering to a Delta Live Table / Materialised view please? 

Latest Reply
kerem
New Contributor II
  • 0 kudos

Hi @amr, materialised views are not tables, they are views. Liquid clustering is not supported on views so it will throw [EXPECT_TABLE_NOT_VIEW.NO_ALTERNATIVE] error. Unfortunately it will be the same case for the "optimize" command as well. 

  • 0 kudos
1 More Replies
FabianGutierrez
by New Contributor III
  • 143 Views
  • 9 replies
  • 1 kudos

My DABS CLI Deploy call not generating a .tfstate file

Hi Community, I'm running into an issue: when executing the Databricks CLI bundle deploy, I don't get the Terraform state file (.tfstate). I know that I should get one, but even when defining the state_path in my DABs YAML (.yml) file I still do not get it. D...

Latest Reply
FabianGutierrez
New Contributor III
  • 1 kudos

I forgot to also share this screenshot of the last section in the logs. Somehow the state file keeps getting ignored (not found), so I wonder how the deployment can still take place.

  • 1 kudos
8 More Replies
BAZA
by New Contributor II
  • 7459 Views
  • 12 replies
  • 0 kudos

Invisible empty spaces when reading .csv files

When importing a .csv file with leading and/or trailing empty spaces around the separators, the output results in strings that appear to be trimmed in the output table or when using .display(), but are not actually trimmed. It is possible to identify t...
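As a rough illustration of the symptom described in this post, the sketch below uses only the Python standard library rather than Spark. The sample data and the helper names (`read_rows`, `find_padded`, `trim_rows`) are hypothetical and not taken from the thread. It shows how padded values can be detected by comparing each value with its stripped form, and then trimmed explicitly. In Spark itself, the CSV reader options `ignoreLeadingWhiteSpace` and `ignoreTrailingWhiteSpace` may be worth trying for the same purpose.

```python
import csv
import io

# Hypothetical sample: spaces surround some of the comma separators.
RAW = "id , name ,city\n1 , Alice , Paris\n2,Bob,Rome \n"


def read_rows(text):
    """Parse CSV text without trimming, so padding is preserved."""
    return list(csv.reader(io.StringIO(text)))


def find_padded(rows):
    """Return (row_index, col_index, value) for values with leading or trailing whitespace."""
    return [
        (r, c, value)
        for r, row in enumerate(rows)
        for c, value in enumerate(row)
        if value != value.strip()
    ]


def trim_rows(rows):
    """Return a copy of the rows with every value stripped."""
    return [[value.strip() for value in row] for row in rows]


rows = read_rows(RAW)
padded = find_padded(rows)   # values that only look trimmed when displayed
clean = trim_rows(rows)      # explicitly trimmed copy

for r, c, value in padded:
    # repr() makes the otherwise invisible spaces visible
    print(f"row {r}, column {c}: {value!r}")
```

Printing with `repr()` is the key step here: a plain display hides the padding, while the quoted form exposes it.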

Latest Reply
sallytomato
  • 0 kudos

I’ve found that investing in high-quality print services like GoPrint really makes a difference in ensuring your materials match perfectly. Also, it's good practice to always test with smaller prints first, like business cards or brochures, before go...

  • 0 kudos
11 More Replies
joeyslaptop
by New Contributor II
  • 51 Views
  • 1 replies
  • 0 kudos

How do I use a Databricks SQL query to convert a field value % back into a wildcard?

Hi. If I've posted to the wrong area, please let me know. I am using SQL to join two tables. One table has the wildcard '%' stored as text/string/varchar. I need to join the value of TableA.column1 to TableB.column1 based on the wildcard in the str...

Latest Reply
JAHNAVI
Databricks Employee
  • 0 kudos

Hi, Could you please try the query below and let me know if it meets your requirements? SELECT * FROM TableA A LEFT JOIN TableB B ON A.Column1 LIKE REPLACE(B.Column1, '%', '%%') REPLACE helps us in treating the '%' stored in TableB.Column1 as a wildcar...
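To make the idea in this reply concrete, here is a minimal sketch of joining on a pattern stored in a column. It runs against an in-memory SQLite database from the Python standard library as a stand-in for Databricks SQL, so behaviour may differ in detail (for example, SQLite's LIKE is case-insensitive for ASCII by default). The sample rows are hypothetical, and the sketch uses TableB.Column1 directly as the LIKE pattern, without the REPLACE call from the reply.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Databricks SQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (Column1 TEXT);
    CREATE TABLE TableB (Column1 TEXT);
    -- Hypothetical sample rows; TableB stores '%' as plain text.
    INSERT INTO TableA VALUES ('ABC123'), ('ABD999'), ('XYZ000');
    INSERT INTO TableB VALUES ('ABC%'), ('XYZ000');
""")

# The stored value is used as the LIKE pattern, so its '%' acts as a wildcard.
matches = conn.execute("""
    SELECT A.Column1, B.Column1
    FROM TableA A
    LEFT JOIN TableB B
      ON A.Column1 LIKE B.Column1
    ORDER BY A.Column1
""").fetchall()

for a_value, b_pattern in matches:
    print(a_value, "->", b_pattern)
```

Because this is a LEFT JOIN, rows in TableA with no matching pattern are still returned, with a NULL on the TableB side.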

  • 0 kudos
swetha
by New Contributor III
  • 2885 Views
  • 4 replies
  • 1 kudos

Error: no streaming listener attached to the spark app is the error we are observing post accessing streaming statistics API. Please help us with this issue ASAP. Thanks.

Issue: Spark Structured Streaming application. After adding the listener jar file in the cluster init script, the listener is working (from what I see in the stdout/log4j logs). But when I try to hit the 'Content-Type: application/json' http://host:port/...

Latest Reply
INJUSTIC
Visitor
  • 1 kudos

Have you found the solution? Thanks

  • 1 kudos
3 More Replies
swetha
by New Contributor III
  • 2459 Views
  • 3 replies
  • 1 kudos

I am unable to attach a streaming listener to a spark streaming job. Error: no streaming listener attached to the spark application is the error we are observing post accessing streaming statistics API. Please help us with this issue ASAP. Thanks.

Issue: After adding the listener jar file in the cluster init script, the listener is working (from what I see in the stdout/log4j logs). But when I try to hit the 'Content-Type: application/json' http://host:port/api/v1/applications/app-id/streaming/st...

Latest Reply
INJUSTIC
Visitor
  • 1 kudos

Have you found the solution? Thanks

  • 1 kudos
2 More Replies
jeremy98
by Visitor
  • 40 Views
  • 1 replies
  • 0 kudos

Ways to write fast millions of rows inside a new delta table

Hello everyone, I am facing an issue with writing 100–500 million rows (partitioned by a column) into a newly created Delta table. I have set up a cluster with 256 GB of memory and 64 cores. However, the following code takes a considerable amount of t...

Latest Reply
jeremy98
Visitor
  • 0 kudos

Can someone help me?

  • 0 kudos
