Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

ChrisHunt
by Visitor
  • 71 Views
  • 7 replies
  • 0 kudos

Databricks external table lagging behind source files

I have a Databricks external table pointed at an S3 bucket that contains an ever-growing number of parquet files (currently around 2000 of them). Each row in the file is timestamped to indicate when it was written. A new parquet file is add...
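A minimal sketch of one common fix for this symptom, assuming the lag comes from Spark's cached file listing; the table name my_catalog.my_schema.events is hypothetical:

# Force Spark to drop its cached file listing so newly arrived
# parquet files in the S3 prefix become visible to queries.
spark.sql("REFRESH TABLE my_catalog.my_schema.events")

# If the table is partitioned and new files land in new partition
# directories, the partitions also need registering in the metastore.
spark.sql("MSCK REPAIR TABLE my_catalog.my_schema.events")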

Latest Reply
Prajapathy_NKR
New Contributor III
  • 0 kudos

@Raman_Unifeye and @Coffee77 there was a situation where a parquet file was deleted, because it was obsolete, when a VACUUM was executed. After that, the job started to fail, saying it was unable to find the parquet file, even though it was reading an...

6 More Replies
dkhodyriev1208
by Visitor
  • 30 Views
  • 3 replies
  • 2 kudos

Spark SQL INITCAP not capitalizing letters after periods in abbreviations

When using SELECT INITCAP("text (e.g., text, text, etc.)"), abbreviations with periods like "e.g." are not fully capitalized.
Current behavior:
Input: "text (e.g., text, text, etc.)"
Output: "Text (e.g., Text, Text, Etc.)"
Expected behavior:
Output: "Text ...

Latest Reply
Coffee77
Contributor III
  • 2 kudos

My solution is indeed a workaround; INITCAP is behaving as you describe. You can include another regular expression at the beginning to remove non-original "spaces", but I agree that makes it a little complex. However, no other solution so far I'm awar...
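A minimal sketch of a post-processing workaround along these lines; since the expected output in the post is truncated, it assumes the goal is to uppercase any letter that directly follows a period:

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, initcap, udf
from pyspark.sql.types import StringType
import re

spark = SparkSession.builder.getOrCreate()

# Apply INITCAP first, then re-uppercase letters that follow a period,
# which INITCAP leaves lowercase inside abbreviations like "e.g.".
@udf(StringType())
def capitalize_after_periods(s):
    if s is None:
        return None
    return re.sub(r"\.([a-z])", lambda m: "." + m.group(1).upper(), s)

df = spark.createDataFrame([("text (e.g., text, text, etc.)",)], ["t"])
df.select(capitalize_after_periods(initcap(col("t"))).alias("fixed")).show(truncate=False)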

2 More Replies
Suheb
by New Contributor III
  • 17 Views
  • 1 reply
  • 0 kudos

What are common pitfalls when migrating large on-premise ETL workflows to Databricks, and how did you fix them?

When moving your big data pipelines from local servers to Databricks, what problems usually happen, and how did you fix them?

Latest Reply
Raman_Unifeye
Contributor III
  • 0 kudos

Very broad question; it depends on several factors. There were a few community discussions in the past, see if any are useful for you:
https://community.databricks.com/t5/technical-blog/6-migration-mistakes-you-don-t-want-to-make-part-1/ba-p/89199
https://co...

Dimitry
by Contributor III
  • 50 Views
  • 4 replies
  • 0 kudos

Dataframe from SQL query glitches when grouping - what is going on !?!

I have a query with some grouping, and I'm using spark.sql to run it:
skus = spark.sql('with cte as (select... group by all) select *, .. from cte group by all')
It displays as the expected table. I want to split this table into batches for processing, ...

Latest Reply
Coffee77
Contributor III
  • 0 kudos

Try this code, customized the way you need: instead of using the monotonically_increasing_id function directly, use row_number over the previous result. This ensures sequential "small" numbers. This was indeed the exact solution I used to sol...
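A minimal sketch of that approach; the skus data and batch size are hypothetical stand-ins for the post's grouped query result:

from pyspark.sql import SparkSession
from pyspark.sql.functions import row_number
from pyspark.sql.window import Window

spark = SparkSession.builder.getOrCreate()

# Stand-in for the grouped query result from the post.
skus = spark.createDataFrame([("A",), ("B",), ("C",), ("D",)], ["sku"])

# monotonically_increasing_id() yields large, non-contiguous ids, so
# batch arithmetic on it misbehaves; row_number() gives sequential ones.
# (A window with no partitionBy pulls data to one partition, which is
# acceptable for assigning batch ids to a modest result set.)
w = Window.orderBy("sku")
numbered = skus.withColumn("rn", row_number().over(w))

batch_size = 2  # hypothetical
batches = numbered.withColumn("batch", ((numbered.rn - 1) / batch_size).cast("int"))
batches.show()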

3 More Replies
Aviraldb
by New Contributor
  • 44 Views
  • 3 replies
  • 0 kudos

Moving files from Volume to Workspace

Hello Team,
I am trying to move some files from a volume to the workspace:
%sh
databricks fs cp dbfs:/Volumes/workspace/default/delc/generated_scripts/*.py Workspace/Shared/Delc_Project/scripts/
I have tried every way I can think of. Please help me move them. @DataBricks @Louis_Frolio ...

Latest Reply
Prajapathy_NKR
New Contributor III
  • 0 kudos

@Aviraldb please try the below way:
%sh
cp /dbfs/Volumes/workspace/default/delc/generated_scripts/*.py /Workspace/Shared/Delc_Project/scripts/
Hope it helps.
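A Python alternative sketch, since volumes and workspace files are both exposed as local paths on the driver; the source and destination paths are copied from the thread (note volumes usually mount at /Volumes rather than /dbfs/Volumes):

import glob
import os
import shutil

src = "/Volumes/workspace/default/delc/generated_scripts"  # from the post
dst = "/Workspace/Shared/Delc_Project/scripts"             # from the post

os.makedirs(dst, exist_ok=True)
for path in glob.glob(os.path.join(src, "*.py")):
    shutil.copy(path, dst)  # copy each generated script into the workspace
    print("copied", path)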

2 More Replies
Suheb
by New Contributor III
  • 68 Views
  • 2 replies
  • 4 kudos

What strategies have you found most effective for optimizing ETL pipelines built on the Databricks Lakehouse?

If you are building data pipelines in Databricks (where data is Extracted, Transformed, and Loaded), what tips, methods, or best practices do you use to make those pipelines run faster, cheaper, and more efficiently?

Latest Reply
bianca_unifeye
New Contributor III
  • 4 kudos

When I think about optimising ETL on the Databricks Lakehouse, I split it into four layers: data layout, Spark/SQL design, platform configuration, and operational excellence. And above all: you are not building pipelines for yourself, you are building...

1 More Replies
vr
by Contributor III
  • 16 Views
  • 0 replies
  • 0 kudos

remote_query() is not working

I am trying to experiment with the remote_query() function according to the documentation. The feature is in public preview, so I assume it should be available to everyone now.
select * from remote_query(
  'my_connection',
  database => 'mydb',
  dbtable...
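For reference, a guess at the completed call shape; the dbtable value is hypothetical, and since the feature is in public preview it may also need a supported runtime and a working Lakehouse Federation connection named my_connection:

# Hypothetical completion of the truncated query from the post.
df = spark.sql("""
    SELECT *
    FROM remote_query(
        'my_connection',
        database => 'mydb',
        dbtable  => 'mytable'  -- hypothetical table name
    )
""")
df.show()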

Naveenkumar1811
by New Contributor II
  • 131 Views
  • 4 replies
  • 2 kudos

SkipChangeCommit to True Scenario on Data Loss Possibility

Hi Team,
I have the below scenario: I have a Spark Streaming job with a processing-time trigger of 3 secs, running continuously 365 days. We are performing a weekly delete job on the source of this streaming job, based on a custom retention policy. It is a D...
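A minimal sketch of a stream with skipChangeCommits enabled, assuming the source and target are Delta tables (table names and checkpoint path are hypothetical); the option makes the reader ignore commits that only delete or update existing rows, such as the weekly retention delete:

# Read the Delta source, skipping delete/update-only commits
# (e.g. the weekly retention DELETE) instead of failing on them.
stream = (
    spark.readStream
    .option("skipChangeCommits", "true")
    .table("src.events")  # hypothetical source table
)

query = (
    stream.writeStream
    .option("checkpointLocation", "/Volumes/main/chk/events")  # hypothetical
    .trigger(processingTime="3 seconds")  # matches the post's trigger
    .toTable("tgt.events")  # hypothetical target table
)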

Latest Reply
Naveenkumar1811
New Contributor II
  • 2 kudos

Hi szymon/Raman,
My question was about the commit performed by the insert/append via my streaming job versus the delete operation by the weekly maintenance job... Is there a way that both transactions could fall into the same commit? I need to understand that por...

3 More Replies
Shivaprasad
by Contributor
  • 83 Views
  • 1 reply
  • 0 kudos

How can I retrieve Genie parameters and use them in a Databricks custom app

I have created a Databricks custom app and it is working. I need to pass parameters from Genie to the custom app. Can someone suggest how I can achieve this?

Latest Reply
stbjelcevic
Databricks Employee
  • 0 kudos

You can pass values between a Genie space and your Databricks App using the Genie Conversation API and by adding the Genie space as an app resource: https://docs.databricks.com/aws/en/dev-tools/databricks-apps/genie
Do you want the parameters to orig...
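A rough sketch of calling the Genie Conversation API from an app; the endpoint path, space id, and prompt below are assumptions based on the linked docs, so verify them there:

import os
import requests

host = os.environ["DATABRICKS_HOST"]   # e.g. https://<workspace>.cloud.databricks.com
token = os.environ["DATABRICKS_TOKEN"]
space_id = "your-genie-space-id"      # hypothetical

# Start a conversation in the Genie space (endpoint shape assumed
# from the Genie Conversation API docs linked above).
resp = requests.post(
    f"{host}/api/2.0/genie/spaces/{space_id}/start-conversation",
    headers={"Authorization": f"Bearer {token}"},
    json={"content": "total sales last quarter"},  # hypothetical prompt
)
resp.raise_for_status()
print(resp.json())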

dbernstein_tp
by New Contributor III
  • 77 Views
  • 2 replies
  • 1 kudos

Lakeflow Connect CDC error, broken links

I get this error regarding database validation when setting up a Lakeflow Connect CDC pipeline (see screenshot). The two links mentioned in the message are broken; they give me a "404 - Content Not Found" when I try to open them.

Latest Reply
Advika
Databricks Employee
  • 1 kudos

Sharing a likely doc that should help: https://learn.microsoft.com/en-us/azure/databricks/ingestion/lakeflow-connect/sql-server-utility

  • 1 kudos
1 More Replies
hobrob
by New Contributor
  • 60 Views
  • 2 replies
  • 0 kudos

UDFs for working with date ranges

Hi bricklayers,
Originally from a Teradata background and relatively new to Databricks, I needed to brush up on my Python and GitHub CI/CD skills, so I've spun up a repo for a project I'm calling Terabricks. The aim is to provide a space for mak...
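As one illustration of the kind of date-range helper such a project might collect (not taken from the repo itself), expanding a start/end pair into one row per day, similar in spirit to Teradata's EXPAND ON:

from pyspark.sql import SparkSession
from pyspark.sql.functions import explode, sequence, to_date

spark = SparkSession.builder.getOrCreate()

# Expand each start/end date pair into one row per covered day.
df = spark.createDataFrame([("2024-01-01", "2024-01-05")], ["start_dt", "end_dt"])
days = df.select(
    explode(sequence(to_date("start_dt"), to_date("end_dt"))).alias("day")
)
days.show()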

Latest Reply
Raman_Unifeye
Contributor III
  • 0 kudos

Fantastic initiative @hobrob. I used Teradata for a good 5+ years, though pre-2014/15, so I will be following this closely and am very happy to contribute to it. Thanks.

1 More Replies
oye
by New Contributor II
  • 120 Views
  • 4 replies
  • 3 kudos

Resolved! Using a cluster of type SINGLE_USER to run parallel python tasks in one job

Hi, I have set up a job of multiple Spark Python tasks running in parallel. I have only set up one job cluster: single node, data security mode SINGLE_USER, using Databricks Runtime version 14.3.x-scala2.12. These parallel Spark Python tasks share so...

Latest Reply
Raman_Unifeye
Contributor III
  • 3 kudos

@oye - Variable scope is local to the individual task and does not interfere with other tasks, even if the underlying cluster is the same. In fact, the issue is normally the other way round: if we have to share a variable across tasks, then the solu...
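For that sharing case, a minimal sketch using Databricks job task values (dbutils is the built-in notebook utility; the task key and names are hypothetical, and note that task values are read by downstream tasks, not by tasks running in parallel with the producer):

# In the producing task (task key "prepare"), publish a value:
dbutils.jobs.taskValues.set(key="run_date", value="2024-01-01")

# In a downstream task, read it back by the producing task's key:
run_date = dbutils.jobs.taskValues.get(
    taskKey="prepare",
    key="run_date",
    default=None,
    debugValue="2024-01-01",  # used only when running outside a job
)
print(run_date)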

3 More Replies
