Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

minhhung0507
by Contributor III
  • 136 Views
  • 10 replies
  • 2 kudos

API for Restarting Individual Failed Tasks within a Job?

Hi everyone, I'm exploring ways to streamline my workflow in Databricks and could really use some expert advice. In my current setup, I have a job (named job_silver) with multiple tasks (e.g., task 1, task 2, task 3). When one of these tasks fails—say...

Latest Reply
RiyazAli
Valued Contributor II
  • 2 kudos

Hey @minhhung0507 - quick question - what is the cluster type you're using to run your workflow? I'm using a shared, interactive cluster, so I'm passing the parameter {'existing_cluster_id': task['existing_cluster_id']} in the payload. This parameter ...
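For rerunning only the failed tasks of a run, the Jobs API exposes a "repair run" endpoint (POST /api/2.1/jobs/runs/repair). A minimal sketch of building its request body, assuming you already know the run ID and the keys of the failed tasks (the IDs and task keys below are made up):

```python
# Hedged sketch: construct the JSON body for the Jobs "repair run" API
# (POST /api/2.1/jobs/runs/repair), which reruns only the named tasks
# inside an existing job run instead of restarting the whole job.
# run_id and task keys here are illustrative placeholders.

def build_repair_payload(run_id, failed_task_keys):
    """Return the request body that reruns only the given task keys."""
    return {
        "run_id": run_id,
        "rerun_tasks": list(failed_task_keys),
    }

# e.g. rerun just task_2 of run 123456; POST this dict as JSON to
# https://<workspace-url>/api/2.1/jobs/runs/repair with a PAT bearer token.
payload = build_repair_payload(123456, ["task_2"])
print(payload)
```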

  • 2 kudos
9 More Replies
Vasu_Kumar_T
by New Contributor II
  • 51 Views
  • 2 replies
  • 1 kudos

Data Migration using Bladebridge

Hi, we are planning to migrate from Teradata to Databricks using Bladebridge. Going through various portals, I am not able to determine which component facilitates data movement between Teradata and Databricks. Please clarify the end-to-end tool and acti...

Latest Reply
Vasu_Kumar_T
New Contributor II
  • 1 kudos

Agreed, I am exploring TPT and other options that are agreeable to the customer. Meanwhile, my question is: do we have a Bladebridge component that facilitates data movement between Teradata and Databricks? If it's available, which component of Bladebrid...

1 More Replies
yashojha1995
by New Contributor
  • 99 Views
  • 1 reply
  • 0 kudos

Error while running update statement using delta lake linked service through ADF

Hi All, I am getting the below error while running an update query in a lookup activity using the Delta Lake linked service: ErrorCode=AzureDatabricksCommandError, Hit an error when running the command in Azure Databricks. Error details: <span class='a...

Latest Reply
RiyazAli
Valued Contributor II
  • 0 kudos

Hi @yashojha1995, 'EOL while scanning string literal' hints that there might be a syntax error in the update query. Could you share your update query here, and any other info such as how you are creating a linked service to your Delta Lake? Does it mean ...

Y2KEngineer
by Visitor
  • 22 Views
  • 1 reply
  • 0 kudos

Query limiting to only 10000 rows

Hi, I am querying my Azure Databricks table using a VB script with the Simba Spark ODBC driver. While querying the DB (let's say 'Select * from table_1') it is not returning any data. However, while querying with a limit (let's say 'Select TOP 10000 ID from table_1'), i...

Labels: Data Engineering, community, limitation in databricks
Latest Reply
John93Burgess
  • 0 kudos

Hello! The issue where a full table query in Azure Databricks via VB script and the Simba Spark ODBC driver returns no data, but TOP works, likely stems from the ODBC driver's inability to handle the large result set at once. Investigate and increase the driver'...
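One common way to avoid pulling a huge result set in a single fetch is to read it in fixed-size batches via the DB-API cursor's fetchmany() loop. A minimal sketch, using an in-memory stand-in cursor so it runs without an ODBC driver (with the Simba driver you would get a real cursor from pyodbc.connect(...).cursor()):

```python
# Hedged sketch: batch-wise fetching of a large result set via the
# DB-API fetchmany() pattern. FakeCursor is a stand-in for a real
# pyodbc/ODBC cursor so the example is runnable anywhere.

def fetch_in_batches(cursor, batch_size=10_000):
    """Yield rows from a DB-API-style cursor in fixed-size batches."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:  # empty list signals end of the result set
            break
        yield from rows

class FakeCursor:
    """In-memory stand-in implementing just fetchmany()."""
    def __init__(self, rows):
        self._rows = rows
    def fetchmany(self, n):
        batch, self._rows = self._rows[:n], self._rows[n:]
        return batch

cur = FakeCursor([(i,) for i in range(25_000)])
total = sum(1 for _ in fetch_in_batches(cur, 10_000))
print(total)  # 25000
```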

Dharinip
by New Contributor III
  • 2081 Views
  • 5 replies
  • 3 kudos

Resolved! How to decide on creating views vs Tables in Gold layer?

We have the following use case: we receive a raw form of data from an application, and it is ingested into the Iron layer. The raw data is in JSON format. The Bronze layer will be the first level of transformation; the flattening of the JSON file happens ...

Latest Reply
artus2050189155
  • 3 kudos

The whole medallion architecture is unnecessarily complex: Bronze, Silver, Gold. In some places I have seen people do Raw, Trusted Raw, Silver, Trusted Silver, Gold.

4 More Replies
guest0
by New Contributor II
  • 457 Views
  • 4 replies
  • 1 kudos

Spark UI Simulator Not Accessible

Hello, the Spark UI Simulator has not been accessible for the last few days. I was able to refer to it last week, at https://www.databricks.training/spark-ui-simulator/index.html. I already have access to the Partner Academy (if that is relevant). <Error...

Labels: Data Engineering, simulator, spark-ui
Latest Reply
RiyazAli
Valued Contributor II
  • 1 kudos

Looks like the Spark UI Simulator is no longer available; moreover, the Apache Spark optimization course has also vanished from the Databricks Academy. I've seen that there's a new cohort named Databricks optimization, but it lacks the depth and experienc...

3 More Replies
jonhieb
by New Contributor III
  • 611 Views
  • 4 replies
  • 0 kudos

Resolved! [Databricks Asset Bundles] Triggering Delta Live Tables

I would like to know how to schedule a DLT pipeline using DABs. I'm trying to trigger a Delta Live Tables pipeline using Databricks Asset Bundles. Below is my YAML code: resources:  pipelines:    data_quality_pipelines:      name: data_quality_pipeline...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

As of now, Databricks Asset Bundles do not support direct scheduling of DLT pipelines using cron expressions within the bundle configuration. Instead, you can achieve scheduling by creating a Databricks job that triggers the DLT pipeline and then sch...
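The job-wrapper approach described in the reply can be sketched in bundle YAML roughly as follows; the job name, task key, and cron expression are illustrative, and it assumes the pipeline is defined as resources.pipelines.data_quality_pipelines in the same bundle:

```yaml
# Hypothetical sketch: a DAB job that triggers the DLT pipeline on a cron
# schedule, since pipelines themselves don't take a schedule in the bundle.
resources:
  jobs:
    data_quality_refresh_job:
      name: data_quality_refresh_job
      schedule:
        quartz_cron_expression: "0 0 6 * * ?"   # daily at 06:00
        timezone_id: "UTC"
      tasks:
        - task_key: refresh_dlt
          pipeline_task:
            pipeline_id: ${resources.pipelines.data_quality_pipelines.id}
```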

3 More Replies
dc-rnc
by New Contributor II
  • 791 Views
  • 1 reply
  • 0 kudos

DAB | Set tag based on job parameter

Hi Community. Since I wasn't able to find a way to set a job tag dynamically at runtime based on a parameter that is passed to the job, I was wondering if it is possible, or if there is an equivalent way to do it. Thank you. Regards.

Latest Reply
BigRoux
Databricks Employee
  • 0 kudos

Based on the provided context, it appears that there isn't a direct way within Databricks to dynamically set job tags at runtime based on a parameter passed to the job. However, there are alternative approaches you can consider to work around this li...
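One workaround along these lines is to have a task inside the job call the Jobs API "update" endpoint (POST /api/2.1/jobs/update) and write the parameter value into the job's tags. A minimal sketch of the request body; the job ID and tag names are made up, and note this updates the job definition, not just the current run:

```python
# Hedged sketch: build the body for POST /api/2.1/jobs/update to set
# (or overwrite) a single tag on a job at runtime. Values illustrative.

def build_tag_update_payload(job_id, tag_name, tag_value):
    """Return the partial-update body that merges one tag into the job."""
    return {
        "job_id": job_id,
        "new_settings": {"tags": {tag_name: tag_value}},
    }

# e.g. tag the job with the 'environment' parameter the run received
payload = build_tag_update_payload(987, "environment", "staging")
print(payload)
```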

HoussemBL
by New Contributor III
  • 149 Views
  • 4 replies
  • 1 kudos

DLT Pipeline & Automatic Liquid Clustering Syntax

Hi everyone, I noticed Databricks recently released the automatic liquid clustering feature, which looks very promising. I'm currently implementing a DLT pipeline and would like to leverage this new functionality. However, I'm having trouble figuring o...

Latest Reply
RiyazAli
Valued Contributor II
  • 1 kudos

Hey @HoussemBL, you're correct about DLT not supporting automatic liquid clustering. You can assign any columns in cluster_by, but if you set it to auto, it will throw an error complaining about auto not being present in the list of columns. Maybe altering the table to ...

3 More Replies
manish_tanwar
by New Contributor
  • 274 Views
  • 5 replies
  • 3 kudos

Databricks streamlit app for data ingestion in a table

I am using this code in a notebook to save a data row to a table, and it is working perfectly. Now I am using the same function to save data from a chatbot in a Streamlit chatbot application on Databricks, and I am getting an error: ERROR ##############...

Latest Reply
pradeepvatsvk
New Contributor III
  • 3 kudos

Hi @manish_tanwar, how can we work with Streamlit apps in Databricks? I have a use case where I want to ingest data from different CSV files into Delta tables.

4 More Replies
harman
by New Contributor II
  • 181 Views
  • 3 replies
  • 0 kudos

Serverless Compute

Hi Team, we are using Azure Databricks serverless compute to execute workflows and notebooks. My question is: does serverless compute support Maven library installations? I appreciate any insights or suggestions you might have. Thanks in advance for yo...

Latest Reply
BigRoux
Databricks Employee
  • 0 kudos

So, it appears that there is conflicting documentation on this topic. I checked our internal documentation, and what I found was that you CANNOT install JDBC or ODBC drivers on serverless. See the limitations here: https://docs.databricks.com/aws...

2 More Replies
annagriv
by New Contributor II
  • 3359 Views
  • 6 replies
  • 5 kudos

Resolved! How to get git commit ID of the repository the script runs on?

I have a script in a repository on Databricks. The script should log the current git commit ID of the repository. How can that be implemented? I tried various commands, for example: result = subprocess.run(['git', 'rev-parse', 'HEAD'], stdout=subproce...

Latest Reply
bestekov
New Contributor II
  • 5 kudos

Here is a version of @vr's solution that can be run from any folder within the repo. It uses regex to extract the root from the path in the form of /Repos/<username>/<some-repo>: import os import re from databricks.sdk import WorkspaceClient w = Worksp...
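The regex step the reply describes can be sketched as below: extract the repo root (/Repos/<username>/<repo>) from a notebook's workspace path, so that git commands can then be run from the repo directory. The sample path is made up:

```python
# Hedged sketch of the path-to-repo-root extraction described above.
import re

def repo_root(path):
    """Return the /Repos/<username>/<repo> prefix of a workspace path,
    or None if the path is not under /Repos."""
    m = re.match(r"^(/Repos/[^/]+/[^/]+)", path)
    return m.group(1) if m else None

print(repo_root("/Repos/jane.doe@example.com/my-repo/notebooks/etl"))
# /Repos/jane.doe@example.com/my-repo
```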

5 More Replies
Vasu_Kumar_T
by New Contributor II
  • 77 Views
  • 3 replies
  • 0 kudos

Default Code generated by Bladebridge converter

Hello all, 1. What is the default code generated by the Bladebridge converter? For example: when we migrate Teradata or Oracle to Databricks using Bladebridge, what's the default code base? 2. If the generated code is PySpark, do I have any control over the generate...

Latest Reply
RiyazAli
Valued Contributor II
  • 0 kudos

Hello @Vasu_Kumar_T - We've used Bladebridge to convert from Redshift to Databricks. Bladebridge can definitely convert to Spark SQL; not sure about Scala Spark, though.

2 More Replies
