Data Engineering

Forum Posts

by JensH • New Contributor III

01-11-2024 3:28:39 AM

2666 Views
3 replies
2 kudos

Resolved! How to pass parameters to a "Job as Task" from code?

Hi, I would like to use the new "Job as Task" feature, but I'm having trouble passing values. Scenario: I have a workflow job which contains 2 tasks. Task_A (type "Notebook"): Read data from a table and, based on the contents, decide whether the workflow in ...

Data Engineering

job

parameters

workflow

Latest Reply

Walter_C
Valued Contributor II

01-27-2024 6:31:33 AM

2 kudos

I found the following information: value is the value for this task value’s key. This command must be able to represent the value internally in JSON format. The size of the JSON representation of the value cannot exceed 48 KiB. You can refer to https...
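
The 48 KiB limit quoted in this reply can be checked before a task value is set. The following is a minimal sketch, not Databricks' own validation: it assumes the limit is measured on the UTF-8 bytes of the compact JSON form, and the helper names are hypothetical.

```python
import json

# Limit quoted in the reply: the JSON representation must not exceed 48 KiB.
MAX_TASK_VALUE_BYTES = 48 * 1024


def task_value_json_size(value) -> int:
    """Return the size in bytes of the UTF-8 encoded, compact JSON form of `value`.

    Raises TypeError if `value` cannot be represented as JSON.
    Assumption: the limit applies to this byte count.
    """
    return len(json.dumps(value, separators=(",", ":")).encode("utf-8"))


def fits_task_value_limit(value) -> bool:
    """True if `value` is JSON-serializable and within the 48 KiB limit."""
    try:
        return task_value_json_size(value) <= MAX_TASK_VALUE_BYTES
    except (TypeError, ValueError):
        # Not representable as JSON (for example a set or a circular structure).
        return False
```

In a notebook task, a check like `fits_task_value_limit(payload)` could guard the call to `dbutils.jobs.taskValues.set(key="my_key", value=payload)`, where `my_key` and `payload` are placeholder names.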

2 More Replies

by Alessandro • New Contributor

01-25-2024 1:01:54 AM

405 Views
1 replies
0 kudos

Update jobs parameter, when running, from API

Hi, when a job is running, I would like to change its parameters with an API call. I know that I can set parameter values from the API when I start a job from the API, or that I can update the default value if the job isn't running, but I didn't find an API c...

Data Engineering

Latest Reply

Walter_C
Valued Contributor II

01-27-2024 6:26:29 AM

0 kudos

No, there is currently no option to change parameters while the job is running. You can modify them from the UI, but that won't affect the current run; the change applies to the new job runs you trigger.
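
Since new parameter values only take effect on new runs, one option is to trigger a fresh run with the values you want. The sketch below only builds the HTTP request for the Jobs API `run-now` endpoint and does not send it. The host, token, job ID and parameter names are placeholders, and it assumes the job uses job-level parameters passed in the `job_parameters` field.

```python
import json
import urllib.request


def build_run_now_request(host: str, token: str, job_id: int,
                          job_parameters: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that starts a new job run.

    Assumption: the job defines job-level parameters, which are overridden
    for this run only through the `job_parameters` field.
    """
    body = json.dumps({"job_id": job_id, "job_parameters": job_parameters})
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.1/jobs/run-now",
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would submit the run; the run that is already in progress keeps the parameters it started with.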

by User16826992185 • New Contributor II

06-15-2021 5:59:02 AM

5713 Views
2 replies
3 kudos

Databricks Auto-Loader vs. Delta Live Tables

What is the difference between Databricks Auto-Loader and Delta Live Tables? Both seem to manage ETL for you but I'm confused on where to use one vs. the other.

Data Engineering

Latest Reply

SteveL
New Contributor II

01-26-2024 2:27:25 PM

3 kudos

You say "...__would__ be a piece..." and "...DLT __would__ pick up...". Is DLT built upon AL?

1 More Replies

by Shivam_Pawar • New Contributor III

10-10-2022 3:26:17 AM

8138 Views
11 replies
4 kudos

Databricks Lakehouse Fundamentals Badge

I have successfully passed the test after completion of the course with 95%. But I haven't received any badge from your side as promised. I have been provided with a certificate which looks fake by itself. I need to post my credentials on LinkedIn wi...

Data Engineering

Latest Reply

Shruti_Prajapat
New Contributor II

01-26-2024 10:49:36 AM

4 kudos

I'm facing a similar issue too. I completed the training and the quiz successfully and was able to download a course completion certificate. The certificate doesn't have any ID and looks very generic and fake. I have signed up for the https://credentials.d...

10 More Replies

by Maxi1693 • New Contributor II

01-26-2024 6:19:27 AM

1034 Views
1 replies
0 kudos

Resolved! Error java.lang.NullPointerException using Autoloader

Hi! I am pulling data from Blob storage into Databricks using Auto Loader. This process is working well for almost 10 resources, but for a specific one I am getting this error: java.lang.NullPointerException. Looks like this issue in when I connect to th...

Data Engineering

Latest Reply

shan_chandra
Honored Contributor III

01-26-2024 9:32:51 AM

0 kudos

@Maxi1693 - The value for the schemaEvolutionMode should be a string. Could you please try changing the below from .option("cloudFiles.schemaEvolutionMode", None) to .option("cloudFiles.schemaEvolutionMode", "none") and let us know. Refe...
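
The fix suggested here, passing the string "none" instead of Python's `None`, can be enforced before the stream is built. This is a minimal sketch under stated assumptions: the helper names are hypothetical, the set of accepted modes is the one listed in the Auto Loader documentation as far as I know, and `spark` is the session a Databricks notebook already provides.

```python
# Assumed set of modes accepted by cloudFiles.schemaEvolutionMode.
VALID_SCHEMA_EVOLUTION_MODES = {"addNewColumns", "rescue", "failOnNewColumns", "none"}


def autoloader_options(file_format: str, schema_location: str,
                       schema_evolution_mode: str = "none") -> dict:
    """Return Auto Loader options, rejecting non-string evolution modes early."""
    if not isinstance(schema_evolution_mode, str):
        # Catches the mistake from the thread: None instead of the string "none".
        raise TypeError("schema_evolution_mode must be a string such as 'none'")
    if schema_evolution_mode not in VALID_SCHEMA_EVOLUTION_MODES:
        raise ValueError(f"unknown schema evolution mode: {schema_evolution_mode!r}")
    return {
        "cloudFiles.format": file_format,
        "cloudFiles.schemaLocation": schema_location,
        "cloudFiles.schemaEvolutionMode": schema_evolution_mode,
    }


def read_with_autoloader(spark, source_path: str, options: dict):
    """Build the streaming reader; `spark` is an existing SparkSession."""
    return spark.readStream.format("cloudFiles").options(**options).load(source_path)
```

Failing in plain Python with a clear message is easier to diagnose than a NullPointerException raised later inside the stream.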

by FurqanAmin • New Contributor II

01-22-2024 3:07:28 AM

913 Views
5 replies
1 kudos

Logs not coming up in the UI - while being written to DBFS

I have a few spark-submit jobs that are being run via Databricks workflows. I have configured logging in DBFS and specified a location in my GCS bucket. The logs are present in that GCS bucket for the latest run but whenever I try to view them from th...

Data Engineering

logging

LOGS

Latest Reply

Lakshay
Esteemed Contributor

01-25-2024 9:53:28 AM

1 kudos

Yes, I meant to set it to None. Is the issue specific to any particular cluster? Or do you see the issue with all the clusters in your workspace?

4 More Replies

by Noman_Q • New Contributor II

01-23-2024 4:46:28 PM

589 Views
2 replies
1 kudos

Error Running Delta Live Pipeline.

Hi guys, I am new to Delta Live Tables pipelines. I have created a pipeline, and now when I try to run it I get the error message "PERMISSION_DENIED: You are not authorized to create clusters. Please contact your administrator" even though I can crea...

Data Engineering

Latest Reply

Noman_Q
New Contributor II

01-26-2024 1:22:29 AM

1 kudos

Thank you for responding, @Palash01, and for pointing me in the right direction. To get around it, I had to get the "unrestricted cluster creation" permission.

1 More Replies

by rt-slowth • Contributor

01-15-2024 12:07:53 AM

556 Views
3 replies
0 kudos

Why is the userIdentity anonymous?

Do you know why the userIdentity is anonymous in the AWS CloudTrail logs even though I have specified an instance profile?

Data Engineering

Latest Reply

Kaniz
Community Manager

01-18-2024 1:27:52 AM

0 kudos

Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question? This...

2 More Replies

by joeyslaptop • New Contributor II

01-17-2024 8:27:50 PM

2021 Views
5 replies
2 kudos

How to add a column to a new table containing the original source filenames in Databricks.

If this isn't the right spot to post this, please move it or refer me to the right area. I recently learned about the "_metadata.file_name". It's not quite what I need. I'm creating a new table in Databricks and want to add a USR_File_Name column cont...

Data Engineering

Databricks

filename

import

SharePoint

Upload

Latest Reply

Debayan
Esteemed Contributor III

01-18-2024 11:15:35 AM

2 kudos

Hi, could you please elaborate on the expectation here?

4 More Replies

by William_Scardua • Valued Contributor

01-25-2024 9:07:46 AM

242 Views
1 replies
0 kudos

Cluster types pricing

Hi guys, how can I get the pricing of cluster types (standard_D*, standard_E*, standard_F*, etc.)? I'm doing a study to reduce the cost of my current cluster. Any ideas? Thank you.

Data Engineering

Latest Reply

Lakshay
Esteemed Contributor

01-25-2024 9:49:29 AM

0 kudos

Hey, you can use the pricing calculator here: https://www.databricks.com/product/pricing/product-pricing/instance-types

by JJ_LVS1 • New Contributor III

03-22-2023 1:17:37 PM

1306 Views
4 replies
1 kudos

FiscalYear Start Period Is not Correct

Hi, I'm trying to create a calendar dimension including a fiscal year with a fiscal start of April 1. I'm using the fiscalyear library and am setting the start to month 4 but it insists on setting April to month 7. Runtime 12.1. My code snippet is: start_...

Data Engineering

Latest Reply

DataEnginner
New Contributor II

01-25-2024 7:53:11 AM

1 kudos

import fiscalyear
import datetime

def get_fiscal_date(year, month, day):
    fiscalyear.setup_fiscal_calendar(start_month=4)
    v_fiscal_month = fiscalyear.FiscalDateTime(year, month, day).fiscal_month  # To get the fiscal month
    v_fiscal_quarter=fiscalyea...
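
To cross-check what the fiscalyear library returns, the fiscal month and quarter can also be computed with the standard library alone. This is a hedged sketch, not the library's implementation: the function name is hypothetical, the fiscal year is assumed to start on day 1 of `start_month`, and a fiscal year is assumed to be labelled by the calendar year in which it ends.

```python
from datetime import date


def fiscal_position(d: date, start_month: int = 4) -> tuple:
    """Return (fiscal_year, fiscal_quarter, fiscal_month) for date `d`.

    Assumptions: the fiscal year starts on day 1 of `start_month` and is
    labelled by the calendar year in which it ends.
    """
    if not 1 <= start_month <= 12:
        raise ValueError("start_month must be between 1 and 12")
    # Months elapsed since the fiscal start, wrapped around the year end.
    fiscal_month = (d.month - start_month) % 12 + 1
    fiscal_quarter = (fiscal_month - 1) // 3 + 1
    if start_month == 1 or d.month < start_month:
        fiscal_year = d.year
    else:
        fiscal_year = d.year + 1
    return fiscal_year, fiscal_quarter, fiscal_month
```

With `start_month=4`, April comes out as fiscal month 1. With `start_month=10`, April comes out as fiscal month 7, which matches the symptom described in the question and suggests the calendar was still using an October start when the value was computed.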

3 More Replies

by harlemmuniz • New Contributor II

01-24-2024 12:33:22 PM

455 Views
2 replies
1 kudos

Issue with Job Versioning with “Run Job” tasks and Deployments between environments

Hello, I am writing to bring to your attention an issue that we have encountered while working with Databricks, and to seek your assistance in resolving it. When running a Workflow job with the task "Run Job" and clicking on "View YAML/JSON," we have ob...

Data Engineering

Latest Reply

harlemmuniz
New Contributor II

01-25-2024 5:11:00 AM

1 kudos

Hi @Kaniz, thank you for your fast response. However, the versioned JSON or YAML (via Databricks Asset Bundle) in the Job UI should also include the job_name, or we have to change it manually by replacing the job_id with the job_name. For this reason,...

1 More Replies

by 442027 • New Contributor II

07-06-2023 1:43:57 PM

443 Views
1 replies
0 kudos

Default delta log retention interval is different than in documentation?

It notes in the documentation here that the default delta log retention interval is 30 days - however when I create checkpoints in the delta log to trigger the cleanup - historical records from 30 days aren't removed; i.e. current day checkpoint is a...

Data Engineering

Latest Reply

jose_gonzalez
Moderator

01-24-2024 3:50:23 PM

0 kudos

You need to set the table property with SET TBLPROPERTIES ('delta.checkpointRetentionDuration' = '30 days').
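
The property from this reply is applied with an ALTER TABLE statement. The sketch below only builds the SQL text; the helper name and the table name are hypothetical, values are assumed to contain no single quotes, and the property name is taken from the reply as written.

```python
def set_tblproperties_sql(table_name: str, properties: dict) -> str:
    """Build an ALTER TABLE ... SET TBLPROPERTIES statement.

    Assumption: keys and values contain no single quotes, so no escaping is done.
    """
    if not properties:
        raise ValueError("at least one property is required")
    assignments = ", ".join(
        f"'{key}' = '{value}'" for key, value in properties.items()
    )
    # No trailing comma after the last assignment.
    return f"ALTER TABLE {table_name} SET TBLPROPERTIES ({assignments})"
```

In a notebook, the returned string could be passed to `spark.sql(...)`, for example with `set_tblproperties_sql("my_schema.my_table", {"delta.checkpointRetentionDuration": "30 days"})`.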

by Mrk • New Contributor II

07-17-2023 9:06:15 AM

3753 Views
4 replies
3 kudos

Resolved! Insert or merge into a table with GENERATED IDENTITY

Hi, when I create an identity column using the GENERATED ALWAYS AS IDENTITY statement and I try to INSERT or MERGE data into that table, I keep getting the following error message: Cannot write to 'table', not enough data columns; target table has x col...

Data Engineering

Latest Reply

Aboladebaba
New Contributor II

01-24-2024 2:55:00 PM

3 kudos

You can run the INSERT by passing the subset of columns you want to provide values for. For example, your insert statement would be something like: INSERT INTO target_table_with_identity_col (<list-of-cols-names-without-the-identity-column>) SELECT (<lis...
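
The approach in this reply, listing every column except the identity column, can be generated from a table's column list. This is an illustrative sketch only: the function, table and column names are hypothetical, and it assumes the source table exposes the same column names as the target.

```python
def insert_without_identity_sql(target_table: str, source_table: str,
                                columns: list, identity_column: str) -> str:
    """Build an INSERT that names every column except the identity column.

    Assumption: the source table has columns with the same names as the target.
    """
    value_columns = [c for c in columns if c != identity_column]
    if not value_columns:
        raise ValueError("no columns left to insert after removing the identity column")
    column_list = ", ".join(value_columns)
    return (
        f"INSERT INTO {target_table} ({column_list}) "
        f"SELECT {column_list} FROM {source_table}"
    )
```

Because the identity column is left out of both lists, the table generates its value and the column counts on both sides of the statement match.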

3 More Replies

by ilarsen • Contributor

11-21-2023 3:05:41 PM

947 Views
3 replies
1 kudos

Structured Streaming Auto Loader UnknownFieldsException and Workflow Retries

Hi. I am using structured streaming and auto loader to read json files, and it is automated by Workflow. I am having difficulties with the job failing as schema changes are detected, but not retrying. Hopefully someone can point me in the right dir...

Data Engineering

Latest Reply

ilarsen
Contributor

01-24-2024 12:25:50 PM

1 kudos

Another point I have realised is that the task and the parent notebook (which then calls the child notebook that runs the auto loader part) do not fail if the schema-changed failure occurs during the auto loader process. It's the child notebook a...
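
If Workflow retries do not fire because the failure stays inside the child notebook, one alternative is to restart the stream from code. The following is a rough sketch, not a confirmed fix for this thread: `run_stream` is a hypothetical callable that starts the Auto Loader stream and waits for it, and a schema-change failure is assumed to be recognisable by the text "UnknownFieldsException" in the exception type or message.

```python
import time


def run_with_schema_retry(run_stream, max_restarts: int = 3,
                          wait_seconds: float = 0.0):
    """Call `run_stream`, restarting it when a schema-change failure occurs.

    Assumption: such failures mention "UnknownFieldsException" in the
    exception type name or message. Any other error is re-raised at once.
    """
    restarts = 0
    while True:
        try:
            return run_stream()
        except Exception as exc:
            description = f"{type(exc).__name__}: {exc}"
            is_schema_change = "UnknownFieldsException" in description
            if not is_schema_change or restarts >= max_restarts:
                raise
            restarts += 1
            time.sleep(wait_seconds)
```

Re-raising once the restart budget is used up lets the notebook, and therefore the task, fail visibly so that the Workflow's own retry policy can still apply.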

2 More Replies
