Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Rajdeepak
by New Contributor
  • 261 Views
  • 1 reply
  • 0 kudos

How to restart a failed Spark stream job from the failure point

I am setting up an ETL process using PySpark. My input is a Kafka stream and I am writing output to multiple sinks (one into Kafka and another into cloud storage). I am writing checkpoints on the cloud storage. The issue I am facing is that, whenever m...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Rajdeepak, To address data redundancy issues caused by reprocessing during application restarts, consider these strategies: Ensure proper checkpointing by configuring and protecting your checkpoint directory; manage Kafka offsets correctly by set...

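For readers looking for a concrete starting point, here is a minimal PySpark sketch of the checkpointing advice above, assuming a Databricks notebook where `spark` is predefined; the broker, topic, and paths are placeholders. As long as each sink keeps its own stable checkpoint location, a restarted query resumes from the last committed offsets rather than reprocessing the stream.

```python
# Illustrative sketch only - broker, topic, and paths are hypothetical.
df = (spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")   # placeholder
      .option("subscribe", "input-topic")                  # placeholder
      .load())

# One writeStream per sink, each with its own fixed checkpointLocation.
(df.writeStream
   .format("delta")
   .option("checkpointLocation", "s3://my-bucket/checkpoints/storage_sink")  # keep stable across restarts
   .start("s3://my-bucket/output/events"))
```
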
reachrishav
by New Contributor II
  • 312 Views
  • 1 reply
  • 0 kudos

What is the equivalent of "if exists()" in databricks sql?

What is the equivalent of the below SQL Server syntax in Databricks SQL? There are cases where I need to execute a block of SQL code on certain conditions. I know this can be achieved with spark.sql, but the problem with spark.sql() is it does not p...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @reachrishav, In Databricks SQL, you can replicate SQL Server's conditional logic using `CASE` statements and `MERGE` operations. Since Databricks SQL doesn't support `IF EXISTS` directly, you can create a temporary view to check your condition an...

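As a hedged illustration of the pattern described above (driving the conditional logic from Python with spark.sql, since Databricks SQL has no direct IF EXISTS), the table and predicate below are hypothetical:

```python
# Probe for a matching row, then run the conditional statement only if one exists.
row_exists = spark.sql("""
    SELECT 1 FROM my_catalog.my_schema.orders
    WHERE status = 'pending' LIMIT 1
""").count() > 0

if row_exists:
    spark.sql("""
        UPDATE my_catalog.my_schema.orders
        SET status = 'processed'
        WHERE status = 'pending'
    """)
```
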
ADB0513
by New Contributor III
  • 420 Views
  • 1 reply
  • 0 kudos

Pass variable from one notebook to another

I have a main notebook where I am setting a Python variable to the name of the catalog I want to work in. I then call another notebook, using %run, which runs an INSERT INTO using a SQL command where I want to specify the catalog using the catalog v...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @ADB0513, To pass variables between notebooks in Databricks, you can use three main methods: **Widgets**, where you create and retrieve parameters using `dbutils.widgets` in both notebooks; **spark.conf**, where you set and get configuration param...

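A minimal sketch of the spark.conf approach mentioned in the reply; the configuration key, catalog, and table names are hypothetical:

```python
# In the main notebook: stash the catalog name in the shared Spark conf.
spark.conf.set("my.pipeline.catalog", "dev_catalog")   # hypothetical key/value

# In the notebook invoked via %run: read it back and use it in SQL.
catalog = spark.conf.get("my.pipeline.catalog")
spark.sql(f"INSERT INTO {catalog}.my_schema.target_table "
          f"SELECT * FROM {catalog}.my_schema.staging_table")
```
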
Prashanth24
by New Contributor III
  • 329 Views
  • 1 reply
  • 0 kudos

Error connecting Databricks Notebook using managed identity from Azure Data Factory

I am trying to connect to a Databricks notebook using the managed identity authentication type from Azure Data Factory. Below are the settings used. The error message is appended at the bottom of this message. With the same settings but with a different authenticat...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Prashanth24, To resolve this, ensure the resource URL is correctly set, grant the Data Factory Managed Identity Contributor role in the Databricks workspace, verify the Databricks workspace is registered in the correct Azure AD tenant, confirm th...

semsim
by Contributor
  • 269 Views
  • 1 reply
  • 0 kudos

List and iterate over files in Databricks workspace

Hi DE Community, I need to be able to list/iterate over a set of files in a specific directory within the Databricks workspace. For example: "/Workspace/SharedFiles/path/to/file_1" ... "/Workspace/SharedFiles/path/to/file_n". Thanks for your direction and ...

Latest Reply
szymon_dybczak
Contributor
  • 0 kudos

Hi @semsim, you can use the file system utility (dbutils.fs):
Databricks Utilities (dbutils) reference | Databricks on AWS
Work with files on Databricks | Databricks on AWS
dbutils.fs.ls("file:/Workspace/Users/<user-folder>/")

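Building on the reply, a short sketch with placeholder paths; on recent runtimes the workspace tree is also visible on the driver's local filesystem, so plain Python works too:

```python
import os

# Option 1: dbutils.fs with the file:/ scheme, as in the reply.
for entry in dbutils.fs.ls("file:/Workspace/SharedFiles/path/to/"):
    print(entry.path)

# Option 2: plain Python over the same (placeholder) directory.
for name in os.listdir("/Workspace/SharedFiles/path/to/"):
    print(name)
```
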
Zeruno
by New Contributor
  • 215 Views
  • 1 reply
  • 0 kudos

DLT - Get pipeline_id and update_id

I need to insert pipeline_id and update_id in my Delta Live Table (DLT), the point being to know which pipeline created which row. How can I obtain this information? I know you can get job_id and run_id from widgets but I don't know if these are the s...

Latest Reply
szymon_dybczak
Contributor
  • 0 kudos

Hi @Zeruno, those values are rather static. Maybe you can design a process that, as a first step, extracts the information from the List Pipelines API and saves it in a Delta table.
List pipelines | Pipelines API | REST API reference | Databricks on AWS
Then in...

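A hedged sketch of the approach in the reply: call the List Pipelines REST endpoint and persist the result to a Delta table for later joins. The host, secret scope, and target table are hypothetical, and the response is parsed defensively since its exact shape may vary.

```python
import requests

host = "https://<workspace-host>"                               # placeholder
token = dbutils.secrets.get("my-scope", "databricks-pat")       # hypothetical secret scope/key

resp = requests.get(f"{host}/api/2.0/pipelines",
                    headers={"Authorization": f"Bearer {token}"})
resp.raise_for_status()
pipelines = resp.json().get("statuses", [])

if pipelines:
    # Keep a snapshot of pipeline metadata for joining against DLT output later.
    spark.createDataFrame(pipelines).write.mode("overwrite").saveAsTable("ops.dlt_pipelines")
```
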
vadi
by New Contributor
  • 175 Views
  • 2 replies
  • 0 kudos

CSV file processing

What's the best possible solution to process CSV files in Databricks? Please consider scalability, optimization, and QA, and give me the best solution...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @vadi, Thanks for reaching out! Please review the response and let us know if it answers your question. Your feedback is valuable to us and the community. If the response resolves your issue, kindly mark it as the accepted solution. This will help...

1 More Reply
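
Since the question above asks for a scalable way to process CSV files, here is a hedged starting point; the schema, path, and table name are placeholders. Declaring the schema explicitly avoids a full inferSchema scan, and landing the result in a Delta table keeps downstream QA and optimization simple.

```python
from pyspark.sql.types import StructType, StructField, IntegerType, StringType, DateType

schema = StructType([
    StructField("id", IntegerType()),
    StructField("name", StringType()),
    StructField("order_date", DateType()),
])

df = (spark.read
      .format("csv")
      .option("header", "true")
      .schema(schema)                      # explicit schema instead of inferSchema
      .load("s3://my-bucket/raw/orders/")) # placeholder path

df.write.mode("append").saveAsTable("main.bronze.orders")
```
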
Shazaamzaa
by New Contributor III
  • 381 Views
  • 2 replies
  • 0 kudos

Resolved! Setup dbt-core with Azure Entra ID

Hey team, I'm trying to standardize the development environment setup in our team. I've written up a shell script that I want our devs to run in WSL2 after setup. The shell script does the following: 1. Set up Azure CLI - install and authenticate. 2. Ins...

Latest Reply
Shazaamzaa
New Contributor III
  • 0 kudos

Hey @Kaniz_Fatma thanks for the response. I persisted a little more with the logs and the issue appears to be related to WSL2 not having a backend credential manager to handle management of tokens supplied by the OAuth process. To be honest, this is ...

1 More Reply
ckwan48
by New Contributor III
  • 13672 Views
  • 6 replies
  • 3 kudos

Resolved! How to prevent my cluster to shut down after inactivity

Currently, I am running a cluster that is set to terminate after 60 minutes of inactivity. However, in one of my notebooks, one of the cells is still running. How can I prevent this from happening, if I want my notebook to run overnight without monito...

Latest Reply
AmanSehgal
Honored Contributor III
  • 3 kudos

If a cell is already running (I assume it's a streaming operation), then I think it doesn't mean that the cluster is inactive. The cluster should be running if a cell is running on it. On the other hand, if you want to keep running your clusters for ...

5 More Replies
acj1459
by New Contributor
  • 153 Views
  • 0 replies
  • 0 kudos

Azure Databricks Data Load

Hi All, I have 10 tables present in an on-prem MS SQL DB and want to load the data from those 10 tables incrementally into Bronze Delta tables as append-only. From Bronze to Silver, using a merge query, I want to load the latest records into the Silver Delta tables. Whatever latest...

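Since this post has no replies yet, here is a hedged sketch of the Bronze-to-Silver step it describes: keep only the latest record per key and MERGE it into the Silver table. Table and column names are placeholders.

```python
# Dedupe Bronze to the newest record per business key, then upsert into Silver.
spark.sql("""
  MERGE INTO silver.customer AS s
  USING (
    SELECT customer_id, name, load_ts
    FROM (
      SELECT *,
             ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY load_ts DESC) AS rn
      FROM bronze.customer
    ) ranked
    WHERE rn = 1
  ) AS b
  ON s.customer_id = b.customer_id
  WHEN MATCHED THEN UPDATE SET *
  WHEN NOT MATCHED THEN INSERT *
""")
```
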
MRTN
by New Contributor III
  • 3904 Views
  • 3 replies
  • 2 kudos

Resolved! Configure multiple source paths for auto loader

I am currently using two streams to monitor data in two different containers on an Azure storage account. Is there any way to configure an autoloader to read from two different locations? The schemas of the files are identical.

Latest Reply
Anonymous
Not applicable
  • 2 kudos

@Morten Stakkeland: Yes, it's possible to configure an autoloader to read from multiple locations. You can define multiple CloudFiles sources for the autoloader, each pointing to a different container in the same storage account. In your case, since ...

2 More Replies
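
A hedged sketch of the reply's suggestion: one Auto Loader stream per container, both writing to the same target table but each with its own checkpoint and schema location. Storage paths, file format, and table name are placeholders.

```python
def start_autoloader(source_path: str, checkpoint_path: str):
    # One cloudFiles source per container; separate checkpoints keep the streams independent.
    return (spark.readStream
            .format("cloudFiles")
            .option("cloudFiles.format", "json")
            .option("cloudFiles.schemaLocation", f"{checkpoint_path}/schema")
            .load(source_path)
            .writeStream
            .option("checkpointLocation", checkpoint_path)
            .trigger(availableNow=True)
            .toTable("main.bronze.events"))

start_autoloader("abfss://container-a@mystorage.dfs.core.windows.net/data",
                 "abfss://checkpoints@mystorage.dfs.core.windows.net/events_a")
start_autoloader("abfss://container-b@mystorage.dfs.core.windows.net/data",
                 "abfss://checkpoints@mystorage.dfs.core.windows.net/events_b")
```
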
jfvizoso
by New Contributor II
  • 7288 Views
  • 5 replies
  • 0 kudos

Can I pass parameters to a Delta Live Table pipeline at running time?

I need to execute a DLT pipeline from a Job, and I would like to know if there is any way of passing a parameter. I know you can have settings in the pipeline that you use in the DLT notebook, but it seems you can only assign values to them when crea...

Latest Reply
lprevost
Contributor
  • 0 kudos

This seems to be the key to this question: parameterize for dlt. My understanding of this is that you can add the parameter either in the DLT settings UI via the Advanced Config / Add Configuration key-value dialog, or via the corresponding pipeline set...

4 More Replies
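
A hedged sketch of the configuration approach in the reply: a key added under the pipeline's Configuration (Advanced settings, or the equivalent pipeline settings JSON) can be read inside the DLT notebook with spark.conf. The key name, default value, and tables are hypothetical.

```python
import dlt

# Read the pipeline configuration value, with a fallback default.
start_date = spark.conf.get("mypipeline.start_date", "2024-01-01")  # hypothetical key

@dlt.table
def filtered_events():
    # Use the parameter to drive the transformation.
    return spark.read.table("bronze.events").where(f"event_date >= '{start_date}'")
```
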
rushi29
by New Contributor II
  • 356 Views
  • 2 replies
  • 0 kudos

sparkContext in Runtime 15.3

Hello All, our Azure Databricks cluster is running under the "Legacy Shared Compute" policy with the 15.3 runtime. One of the Python notebooks is used to connect to an Azure SQL database to read/insert data. The following snippet of code is responsible for r...

Latest Reply
rushi29
New Contributor II
  • 0 kudos

Thanks @Kaniz_Fatma for your response. Since I also need to call stored procedures in the Azure SQL databases from Azure Databricks, I don't think the DataFrames solution would work. When using py4j, how would I create a connection object in Azure D...

1 More Reply
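
For the stored-procedure part of this thread, a hedged sketch of going through the JVM's JDBC DriverManager via py4j is shown below; note that the JVM bridge is generally unavailable on shared access mode clusters, so this typically requires single-user (dedicated) compute. Server, database, credentials, and the procedure name are placeholders.

```python
# Placeholder connection string - prefer secrets over hard-coded credentials.
jdbc_url = ("jdbc:sqlserver://<server>.database.windows.net:1433;"
            "database=<db>;user=<user>;password=<password>")

driver_manager = spark.sparkContext._jvm.java.sql.DriverManager
conn = driver_manager.getConnection(jdbc_url)
try:
    stmt = conn.prepareCall("{call dbo.my_stored_proc(?)}")  # hypothetical procedure
    stmt.setInt(1, 42)
    stmt.execute()
finally:
    conn.close()
```
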
N_M
by Contributor
  • 2133 Views
  • 8 replies
  • 3 kudos

Resolved! use job parameters in scripts

Hi Community, I did some research, but I wasn't lucky, and I'm a bit surprised I can't find anything about it. So, I would simply like to access the job parameters when using Python scripts (not notebooks). My flow doesn't use notebooks, but I still need to dri...

Latest Reply
N_M
Contributor
  • 3 kudos

The only working workaround I found has been provided in another thread: Re: Retrieve job-level parameters in Python - Databricks Community - 44720. I will repost it here (thanks @julio_resende). You need to push down your parameters to a task level. E.g.: C...

7 More Replies
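
A hedged sketch of the workaround described above: the Python script task's parameters pass the job parameter down (e.g. ["--run-date", "{{job.parameters.run_date}}"], where the parameter name is hypothetical), and the script reads it with argparse.

```python
import argparse

# The value arrives as an ordinary command-line argument supplied by the task.
parser = argparse.ArgumentParser()
parser.add_argument("--run-date", dest="run_date")
args = parser.parse_args()

print(f"Running for {args.run_date}")
```
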
Shiva3
by New Contributor III
  • 231 Views
  • 2 replies
  • 0 kudos

How to know the actual size of Delta and non-Delta tables, and the number of files that actually exist on S3

I have a set of Delta and non-Delta tables whose data is on AWS S3. I want to know the actual total size of my Delta and non-Delta tables, excluding files that belong to operations such as DELETE, VACUUM, etc. I also need to know how many files each Delta versi...

Latest Reply
Kaniz_Fatma
Community Manager
  • 0 kudos

Hi @Shiva3, To manage the size of Delta and non-Delta tables on AWS S3, excluding irrelevant files, start by using `DESCRIBE HISTORY` to monitor Delta table metrics and `VACUUM` to clean up old files, setting a retention period as needed. For non-Del...

1 More Reply
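
To complement the reply, a short sketch using commands that report current-snapshot size: DESCRIBE DETAIL returns sizeInBytes and numFiles for the live version of a Delta table (excluding files retained only for old versions), and DESCRIBE HISTORY exposes per-version operation metrics. The table name is a placeholder.

```python
# Size and file count of the current Delta snapshot (placeholder table name).
spark.sql("DESCRIBE DETAIL main.sales.orders") \
     .select("sizeInBytes", "numFiles") \
     .show()

# Per-version operations and their metrics (e.g. numAddedFiles, numRemovedFiles).
spark.sql("DESCRIBE HISTORY main.sales.orders") \
     .select("version", "operation", "operationMetrics") \
     .show(truncate=False)
```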
