Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

pshuk
by New Contributor III
  • 6269 Views
  • 2 replies
  • 0 kudos

Copying files from dev environment to prod environment

Hi, is there a quick and easy way to copy files between different environments? I have copied a large number of files in my dev environment (Unity Catalog) and want to copy them over to the production environment. Instead of doing it from scratch, can I j...

Latest Reply
Hubert-Dudek
Databricks MVP
  • 0 kudos

If you want to copy files in Azure, ADF is usually the fastest option (for example, terabytes of CSV or Parquet files). If you want to copy tables, just use CLONE. If they are code files, just use Repos and branches.

1 More Replies
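For the table-copy case mentioned in the reply, here is a minimal sketch of what a Delta CLONE statement can look like. The catalog, schema, and table names below are hypothetical, not from the thread:

```python
# Hypothetical sketch: DEEP CLONE copies a Delta table's data and metadata
# to a new target, which suits promoting a dev table to prod.
# All three-level names below are made up.
clone_sql = (
    "CREATE OR REPLACE TABLE prod_catalog.sales.orders "
    "DEEP CLONE dev_catalog.sales.orders"
)

# On a Databricks cluster you would execute it with:
# spark.sql(clone_sql)
```

Note that CLONE applies to tables; for loose files, a copy tool such as ADF (as the reply suggests) is the usual route.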
aseufert
by New Contributor III
  • 11787 Views
  • 2 replies
  • 3 kudos

Git Stash

I looked through some previous posts and documentation and couldn't find anything related to the use of git stash in Databricks Repos. Perhaps I missed it. I also don't see an option in the UI. Does anyone know if there's a way to stash changes either in th...

Latest Reply
javierbg
New Contributor III
  • 3 kudos

This is actually a big hurdle when trying to switch between working in two different branches; it would be a welcome addition to the Databricks IDE.

1 More Replies
test_123
by New Contributor
  • 6599 Views
  • 0 replies
  • 0 kudos

Schema evolution is not working for XML file

I have used .option("cloudFiles.schemaEvolutionMode", "addNewColumns") for a newly added property in an XML file, but Auto Loader has not detected the changes. As per the documented "addNewColumns" behavior, it has failed at first t...

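For context on the expected "addNewColumns" behavior: Auto Loader fails the stream once when it first sees a new field, records the updated schema in the schema location, and then succeeds on restart. A plain-Python sketch of that restart pattern, where `start_stream` and `RuntimeError` are stand-ins for the real stream start and the UnknownFieldException (not Databricks APIs):

```python
# Sketch of the restart pattern that "addNewColumns" implies: the first
# failure records the new schema, so a bounded retry lets the next
# start succeed. start_stream is a hypothetical callable.
def run_with_restart(start_stream, max_restarts=3):
    for attempt in range(max_restarts + 1):
        try:
            return start_stream()
        except RuntimeError:  # stands in for UnknownFieldException
            if attempt == max_restarts:
                raise
```

In a real job this retry is typically handled by the job scheduler's retry policy rather than an in-notebook loop.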
JohanS
by New Contributor III
  • 2808 Views
  • 1 replies
  • 0 kudos

Resolved! Container Service Docker images fail when a pip package is installed

I'm building my own Docker images to use for a cluster. The problem is that the only image I seem to be able to run is the official base image "databricksruntime/python:13.3-LTS". If I install a pip package, I get the following on standard error: /dat...

Data Engineering
container service
Docker
pip
python
Latest Reply
JohanS
New Contributor III
  • 0 kudos

I found the culprit: --ignore-installed upgraded matplotlib too far and broke it.

Arun2151
by New Contributor II
  • 2563 Views
  • 1 replies
  • 2 kudos

spark.sql query is executing from the except block even though the try block succeeded

I have developed an Azure Databricks notebook where data is copied from the landing zone to a STG Delta table. I used try and except blocks in the code to catch errors; if there is an error, the except block catches the error message. In the except...

Latest Reply
Arun2151
New Contributor II
  • 2 kudos

Below is my code:

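Without seeing the posted code, one common cause of this symptom (a guess, not necessarily the poster's actual bug) is failure-handling code placed after the try/except at the same indentation, so it runs on both paths. A plain-Python sketch of the structure that avoids it, using try/except/else:

```python
# Hypothetical sketch: keep the failure path only inside `except` and
# the success path inside `else`. A statement written after the whole
# try/except block would run regardless of whether `try` succeeded,
# which produces the symptom described in the post.
def copy_with_logging(copy_step, log_error, log_success):
    try:
        result = copy_step()
    except Exception as exc:
        log_error(str(exc))   # runs only when copy_step raises
        return None
    else:
        log_success()         # runs only when copy_step succeeds
        return result
```

On Databricks the `log_error`/`log_success` calls would typically be `spark.sql(...)` inserts into an audit table; the control flow is the same.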
Hubert-Dudek
by Databricks MVP
  • 1707 Views
  • 1 replies
  • 1 kudos

R2 as external location

R2 (egress-free) can now be quickly registered as an external location. You can use it not only for Delta Sharing! #databricks

Latest Reply
jose_gonzalez
Databricks Employee
  • 1 kudos

Thank you for sharing this @Hubert-Dudek!!!

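For reference, a sketch of what registering an R2 bucket as a Unity Catalog external location can look like. The bucket, account ID, and credential name below are all hypothetical:

```python
# Hypothetical sketch: Unity Catalog external locations address
# Cloudflare R2 via the r2:// scheme. All names below are made up.
location_sql = (
    "CREATE EXTERNAL LOCATION IF NOT EXISTS r2_landing "
    "URL 'r2://my-bucket@my-account-id.r2.cloudflarestorage.com/' "
    "WITH (STORAGE CREDENTIAL r2_cred)"
)

# On Databricks: spark.sql(location_sql)
```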
dmart
by New Contributor III
  • 8580 Views
  • 12 replies
  • 0 kudos

Can't delete 50 TB of overpartitioned data from DBFS

I need to delete 50 TB of data out of DBFS storage. It is overpartitioned and dbutils does not work. Also, limiting partition size and iterating over the data to delete it doesn't work. Azure locks access to the storage through the resource group permissions and ...

Latest Reply
dmart
New Contributor III
  • 0 kudos

For anyone else with this issue, there is no solution other than deleting the whole Databricks workspace, which then deletes all the resources locked up in the managed resource group. The data could not be deleted in any other way, not even by Microso...

11 More Replies
demost11
by New Contributor II
  • 1681 Views
  • 0 replies
  • 0 kudos

Databricks Connect Passthrough

I'm using the Databricks Connect VS Code plugin. It's cool how it figures out what needs to be run on the cluster vs. run locally. However, is it possible to force it to run specific Python statements remotely instead of locally? For context, th...

IshaBudhiraja
by New Contributor II
  • 1735 Views
  • 0 replies
  • 0 kudos

Installation of external libraries (wheel file) in Databricks through Synapse using a new job cluster

Aim: installation of external libraries (wheel file) in Databricks through Synapse using a new job cluster.
Solution: I have followed the steps below. I have created a pipeline in Synapse that consists of a notebook activity that is using a new job cluster...

Dikshant
by New Contributor
  • 2473 Views
  • 0 replies
  • 0 kudos

SchemaEvolutionMode exception in Databricks 14.2

I am unable to display the below stream after reading it.

df = spark.readStream.format("cloudFiles") \
    .option("cloudFiles.format", "csv") \
    .option("header", "true") \
    .option("delimiter", "\t") \
    .option("inferSchema", "true") \
    .option("cloudFiles.connectionS...

Data Engineering
schemaEvolutionMode
MBV3
by Contributor
  • 16271 Views
  • 5 replies
  • 7 kudos

Resolved! External table from parquet partition

Hi, I have data in Parquet format in GCS buckets, partitioned by name, e.g. gs://mybucket/name=ABCD/. I am trying to create a table in Databricks as follows:

DROP TABLE IF EXISTS name_test;
CREATE TABLE name_test
USING parquet
LOCATION "gs://mybucket/name=*/...

Latest Reply
Pat
Esteemed Contributor
  • 7 kudos

Hi @M Baig, the error doesn't tell me much, but you could try:

CREATE TABLE name_test
USING parquet
PARTITIONED BY (name STRING)
LOCATION "gs://mybucket/";

4 More Replies
ac0
by Contributor
  • 2508 Views
  • 0 replies
  • 0 kudos

Get size of metastore specifically

Currently my Databricks metastore is in the same location as the data for my production catalog. We are moving the data to a separate storage account. In advance of this, I'm curious if there is a way to determine the size of the metastore itself...

DylanS
by New Contributor II
  • 6277 Views
  • 7 replies
  • 6 kudos

FileNotFoundError: [Errno 2] No such file or directory: 'pylsp'

We are intermittently experiencing the below issue when running mundane code in our Databricks notebook environment on the 13.3 LTS runtime, with a compute pool of r6id.large on-demand instances using local storage. We first noticed this late last w...

Latest Reply
engixcmt
New Contributor II
  • 6 kudos

Hello @Navya_R, we are facing a similar issue when using 14.3 LTS with DCS. For us, certain global init scripts are not getting applied. Is there a patch we can use for 14.3 LTS as well?

6 More Replies