Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

pshuk
by New Contributor III
  • 6269 Views
  • 2 replies
  • 0 kudos

Copying files from dev environment to prod environment

Hi, is there a quick and easy way to copy files between different environments? I have copied a large number of files in my dev environment (Unity Catalog) and want to copy them over to the production environment. Instead of doing it from scratch, can I j...

Latest Reply
Hubert-Dudek
Databricks MVP
  • 0 kudos

If you want to copy files in Azure, ADF is usually the fastest option (for example, terabytes of CSV or Parquet files). If you want to copy tables, just use CLONE. If they are code files, just use Repos and branches.

1 More Replies
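For the table-copy case mentioned in the reply, here is a minimal sketch of what a Delta CLONE statement can look like. The catalog, schema, and table names below are hypothetical, not from the thread:

```python
# Hypothetical sketch: DEEP CLONE copies a Delta table's data and metadata
# to a new target, which suits promoting a dev table to prod.
# All three-level names below are made up.
clone_sql = (
    "CREATE OR REPLACE TABLE prod_catalog.sales.orders "
    "DEEP CLONE dev_catalog.sales.orders"
)

# On a Databricks cluster you would execute it with:
# spark.sql(clone_sql)
```

Note that CLONE applies to tables; for loose files, a copy tool such as ADF (as the reply suggests) is the usual route.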
aseufert
by New Contributor III
  • 11787 Views
  • 2 replies
  • 3 kudos

Git Stash

I looked through some previous posts and documentation and couldn't find anything related to the use of git stash in Databricks Repos. Perhaps I missed it. I also don't see an option in the UI. Does anyone know if there's a way to stash changes either in th...

Latest Reply
javierbg
New Contributor III
  • 3 kudos

This is actually a big hurdle when trying to switch between working in two different branches; it would be a welcome addition to the Databricks IDE.

1 More Replies
test_123
by New Contributor
  • 6599 Views
  • 0 replies
  • 0 kudos

Schema evolution is not working for XML file

I have used .option("cloudFiles.schemaEvolutionMode", "addNewColumns") for a newly added property in an XML file, but Auto Loader has not detected the changes. As per the documented "addNewColumns" behavior, it has failed at first t...

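For context on the expected "addNewColumns" behavior: Auto Loader fails the stream once when it first sees a new field, records the updated schema in the schema location, and then succeeds on restart. A plain-Python sketch of that restart pattern, where `start_stream` and `RuntimeError` are stand-ins for the real stream start and the UnknownFieldException (not Databricks APIs):

```python
# Sketch of the restart pattern that "addNewColumns" implies: the first
# failure records the new schema, so a bounded retry lets the next
# start succeed. start_stream is a hypothetical callable.
def run_with_restart(start_stream, max_restarts=3):
    for attempt in range(max_restarts + 1):
        try:
            return start_stream()
        except RuntimeError:  # stands in for UnknownFieldException
            if attempt == max_restarts:
                raise
```

In a real job this retry is typically handled by the job scheduler's retry policy rather than an in-notebook loop.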
JohanS
by New Contributor III
  • 2808 Views
  • 1 replies
  • 0 kudos

Resolved! Container Service Docker images fail when a pip package is installed

I'm building my own Docker images to use for a cluster. The problem is that the only image I seem to be able to run is the official base image "databricksruntime/python:13.3-LTS". If I install a pip package, I get the following on standard error: /dat...

Data Engineering
container service
Docker
pip
python
Latest Reply
JohanS
New Contributor III
  • 0 kudos

I found the culprit: --ignore-installed upgraded matplotlib too far and broke it.

Arun2151
by New Contributor II
  • 2563 Views
  • 1 replies
  • 2 kudos

spark.sql query is executing from the except block even though the try block succeeded

I have developed an Azure Databricks notebook where data is copied from the landing zone to a STG Delta table. I used try and except blocks in the code to catch errors; if there is an error, the except block catches the error message. In the except...

Latest Reply
Arun2151
New Contributor II
  • 2 kudos

Below is my code:

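Without seeing the posted code, one common cause of this symptom (a guess, not necessarily the poster's actual bug) is failure-handling code placed after the try/except at the same indentation, so it runs on both paths. A plain-Python sketch of the structure that avoids it, using try/except/else:

```python
# Hypothetical sketch: keep the failure path only inside `except` and
# the success path inside `else`. A statement written after the whole
# try/except block would run regardless of whether `try` succeeded,
# which produces the symptom described in the post.
def copy_with_logging(copy_step, log_error, log_success):
    try:
        result = copy_step()
    except Exception as exc:
        log_error(str(exc))   # runs only when copy_step raises
        return None
    else:
        log_success()         # runs only when copy_step succeeds
        return result
```

On Databricks the `log_error`/`log_success` calls would typically be `spark.sql(...)` inserts into an audit table; the control flow is the same.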
Hubert-Dudek
by Databricks MVP
  • 1707 Views
  • 1 replies
  • 1 kudos

R2 as external location

R2 (egress-free) can now be quickly registered as an external location. You can use it not only for Delta Sharing! #databricks

Latest Reply
jose_gonzalez
Databricks Employee
  • 1 kudos

Thank you for sharing this @Hubert-Dudek!!!

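For reference, a sketch of what registering an R2 bucket as a Unity Catalog external location can look like. The bucket, account ID, and credential name below are all hypothetical:

```python
# Hypothetical sketch: Unity Catalog external locations address
# Cloudflare R2 via the r2:// scheme. All names below are made up.
location_sql = (
    "CREATE EXTERNAL LOCATION IF NOT EXISTS r2_landing "
    "URL 'r2://my-bucket@my-account-id.r2.cloudflarestorage.com/' "
    "WITH (STORAGE CREDENTIAL r2_cred)"
)

# On Databricks: spark.sql(location_sql)
```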
dmart
by New Contributor III
  • 8580 Views
  • 12 replies
  • 0 kudos

Can't delete 50 TB of overpartitioned data from DBFS

I need to delete 50 TB of data out of DBFS storage. It is overpartitioned and dbutils does not work. Also, limiting partition size and iterating over the data to delete it doesn't work. Azure locks access to the storage through the resource group permissions and ...

Latest Reply
dmart
New Contributor III
  • 0 kudos

For anyone else with this issue, there is no solution other than deleting the whole Databricks workspace, which then deletes all the resources locked up in the managed resource group. The data could not be deleted in any other way, not even by Microso...

11 More Replies
demost11
by New Contributor II
  • 1681 Views
  • 0 replies
  • 0 kudos

Databricks Connect Passthrough

I'm using the Databricks Connect VS Code plugin. It's cool how it figures out what needs to be run on the cluster vs. run locally. However, is it possible to force it to run specific Python statements remotely instead of locally? For context, th...

IshaBudhiraja
by New Contributor II
  • 1735 Views
  • 0 replies
  • 0 kudos

Installation of external libraries (wheel file) in Databricks through Synapse using a new job cluster

Aim: installation of external libraries (wheel file) in Databricks through Synapse using a new job cluster.
Solution: I have followed the steps below. I have created a pipeline in Synapse that consists of a notebook activity that is using a new job cluster...

Dikshant
by New Contributor
  • 2473 Views
  • 0 replies
  • 0 kudos

SchemaEvolutionMode exception in Databricks 14.2

I am unable to display the below stream after reading it.

df = spark.readStream.format("cloudFiles") \
    .option("cloudFiles.format", "csv") \
    .option("header", "true") \
    .option("delimiter", "\t") \
    .option("inferSchema", "true") \
    .option("cloudFiles.connectionS...

Data Engineering
schemaEvolutionMode
MBV3
by Contributor
  • 16271 Views
  • 5 replies
  • 7 kudos

Resolved! External table from parquet partition

Hi, I have data in Parquet format in GCS buckets, partitioned by name, e.g. gs://mybucket/name=ABCD/. I am trying to create a table in Databricks as follows:

DROP TABLE IF EXISTS name_test;
CREATE TABLE name_test
USING parquet
LOCATION "gs://mybucket/name=*/...

Latest Reply
Pat
Esteemed Contributor
  • 7 kudos

Hi @M Baig, the error doesn't tell me much, but you could try:

CREATE TABLE name_test
USING parquet
PARTITIONED BY (name STRING)
LOCATION "gs://mybucket/";

4 More Replies
ac0
by Contributor
  • 2508 Views
  • 0 replies
  • 0 kudos

Get size of metastore specifically

Currently my Databricks metastore is in the same location as the data for my production catalog. We are moving the data to a separate storage account. In advance of this, I'm curious if there is a way to determine the size of the metastore itself...

DylanS
by New Contributor II
  • 6277 Views
  • 7 replies
  • 6 kudos

FileNotFoundError: [Errno 2] No such file or directory: 'pylsp'

We are intermittently experiencing the below issue when running mundane code in our Databricks notebook environment on the 13.3 LTS runtime, with a compute pool of r6id.large on-demand instances using local storage. We first noticed this late last w...

Latest Reply
engixcmt
New Contributor II
  • 6 kudos

Hello @Navya_R, we are facing a similar issue when using 14.3 LTS with DCS. For us, certain global init scripts are not getting applied. Is there a patch we can use for 14.3 LTS as well?

6 More Replies