Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

dmart
by New Contributor III
  • 9066 Views
  • 12 replies
  • 0 kudos

can't delete 50TB of overpartitioned data from dbfs

I need to delete 50TB of data from DBFS storage. It is overpartitioned, and dbutils does not work. Limiting partition size and iterating over the data to delete it doesn't work either. Azure locks access from storage from the resource group permissions and ...

Latest Reply
dmart
New Contributor III
  • 0 kudos

For anyone else with this issue, there is no solution other than deleting the whole Databricks workspace, which then deletes all the resources locked up in the managed resource group. The data could not be deleted in any other way, not even by Microso...

11 More Replies
demost11
by New Contributor II
  • 1762 Views
  • 0 replies
  • 0 kudos

Databricks Connect Passthrough

I'm using the Databricks Connect VS Code plugin. It's cool how it figures out what needs to run on the cluster vs. run locally. However, is it possible to force it to run specific Python statements remotely instead of locally? For context, th...

IshaBudhiraja
by New Contributor II
  • 1794 Views
  • 0 replies
  • 0 kudos

Installation of external libraries (wheel file) in Databricks through Synapse using a new job cluster

Aim: Installation of external libraries (wheel file) in Databricks through Synapse using a new job cluster.
Solution: I have followed the below steps:
I have created a pipeline in Synapse that consists of a notebook activity that is using a new job cluster...

Dikshant
by New Contributor
  • 2600 Views
  • 0 replies
  • 0 kudos

SchemaEvolutionMode exception in Databricks 14.2

I am unable to display the below stream after reading it.
df = spark.readStream.format("cloudFiles") \
    .option("cloudFiles.format", "csv") \
    .option("header", "true") \
    .option("delimiter", "\t") \
    .option("inferSchema", "true") \
    .option("cloudFiles.connectionS...

Data Engineering
schemaEvolutionMode
MBV3
by Contributor
  • 16887 Views
  • 5 replies
  • 7 kudos

Resolved! External table from parquet partition

Hi, I have data in parquet format in GCS buckets partitioned by name, e.g. gs://mybucket/name=ABCD/. I am trying to create a table in Databricks as follows:
DROP TABLE IF EXISTS name_test;
CREATE TABLE name_test
USING parquet
LOCATION "gs://mybucket/name=*/...

Latest Reply
Pat
Esteemed Contributor
  • 7 kudos

Hi @M Baig, the error doesn't tell me much, but you could try:
CREATE TABLE name_test USING parquet PARTITIONED BY (name STRING) LOCATION "gs://mybucket/";
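To illustrate why pointing LOCATION at the bucket root and declaring name as a partition column can work: in a Hive-style layout, the partition value is encoded in the directory name itself (name=ABCD), so a wildcard in the path is not needed. The sketch below is plain Python, not Spark code, and only mimics that convention; the helper name parse_partition_path and the sample file name part-0000.parquet are hypothetical.

```python
from urllib.parse import unquote


def parse_partition_path(path: str, root: str) -> dict:
    """Return the key=value directory segments found under root as a dict.

    Hypothetical helper for illustration only; Spark performs its own
    partition discovery and does not use this function.
    """
    # Strip the table root so only the partition directories and file remain.
    relative = path[len(root):] if path.startswith(root) else path
    partitions = {}
    for segment in relative.strip("/").split("/"):
        # Only directory names of the form key=value carry partition values.
        if "=" in segment:
            key, _, value = segment.partition("=")
            partitions[key] = unquote(value)
    return partitions


# Sample path modelled on the bucket layout described in the question.
example = parse_partition_path(
    "gs://mybucket/name=ABCD/part-0000.parquet", "gs://mybucket/"
)
print(example)
```

Note that for a partitioned table created over existing non-Delta data, the partitions may also need to be registered (for example with MSCK REPAIR TABLE name_test) before queries return rows.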

4 More Replies
ac0
by Contributor
  • 2582 Views
  • 0 replies
  • 0 kudos

Get size of metastore specifically

Currently my Databricks metastore is in the same location as the data for my production catalog. We are moving the data to a separate storage account. In advance of this, I'm curious if there is a way to determine the size of the metastore itself...

DylanS
by New Contributor II
  • 6567 Views
  • 7 replies
  • 6 kudos

FileNotFoundError: [Errno 2] No such file or directory: 'pylsp'

We are intermittently experiencing the below issue when running mundane code in our Databricks notebook environment using the 13.3 LTS runtime, with a compute pool of r6id.large on-demand instances, using local storage. We first noticed this late last w...

Latest Reply
engixcmt
New Contributor II
  • 6 kudos

Hello @Navya_R, we are facing a similar issue when using 14.3 LTS with DCS. For us, certain Global Inits are not getting applied. Is there a patch we can use for 14.3 LTS as well?

6 More Replies
SandeepG
by New Contributor
  • 4126 Views
  • 1 replies
  • 0 kudos

Not able to create temporary tables in Unity Catalog

We are using a Unity Catalog environment, and when trying to create a temporary table, the statement errored out.

Latest Reply
Sampath_Kumar
New Contributor II
  • 0 kudos

Hi @SandeepG, could you please share the need for a temporary table? Here are the possible ways:
Tables:
External Tables: Tables can be created on top of files which are externally located.
Managed Tables: The usual tables which will be stored and managed a...

Gilg
by Contributor II
  • 2792 Views
  • 1 replies
  • 0 kudos

Autoloader - File Notification mode

Hi All, I have set up a DLT pipeline that uses Autoloader in file notification mode. Everything runs smoothly the first time. However, it seems the next micro-batch did not trigger, as I can see some events coming into the queue. But if I lo...

cpd
by New Contributor II
  • 4375 Views
  • 1 replies
  • 0 kudos

Ingesting geospatial data into a table

I'm just getting started with Databricks and wondering if it is possible to ingest a GeoJSON or GeoParquet file into a new table without writing code? My goal here is to load vector data into a table and perform H3 polyfill operations on all the vect...

Latest Reply
cpd
New Contributor II
  • 0 kudos

Thank you @Retired_mod - much appreciated!

ashish577
by New Contributor III
  • 2687 Views
  • 0 replies
  • 0 kudos

How do we pass parameters which have a "," with bundle run?

I have a query "select col1, col2 from table" that I need to pass as a parameter to a Databricks job that I am triggering through the bundle run command. The issue is that when I pass this via --params="query=select col1, col2 from table" it splits it bas...
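A small sketch of why a comma inside a value is a problem for key=value lists in general. This is plain Python and does not reproduce the Databricks CLI's actual parser; the function names naive_split and csv_split are hypothetical, and whether bundle run accepts CSV-style quoting is an assumption that would need to be checked against the CLI itself.

```python
import csv
import io


def _to_pairs(items) -> dict:
    """Split each item on its first '=' into a key and a value."""
    result = {}
    for item in items:
        key, _, value = item.partition("=")
        result[key.strip()] = value.strip()
    return result


def naive_split(params: str) -> dict:
    """Split on every comma first, which breaks values that contain commas."""
    return _to_pairs(params.split(","))


def csv_split(params: str) -> dict:
    """Parse the string as one CSV record, so a double-quoted item keeps its commas."""
    items = next(csv.reader(io.StringIO(params), skipinitialspace=True))
    return _to_pairs(items)


raw = "query=select col1, col2 from table"
quoted = '"query=select col1, col2 from table"'

print(naive_split(raw))    # the query is cut at the first comma
print(csv_split(quoted))   # the quoted item survives as a single value
```

If the parameter parser in use does follow CSV rules, wrapping the whole key=value pair in double quotes would be the thing to try; if it does not, passing the query through a job parameter file or another channel may be needed.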

Brichj
by New Contributor II
  • 2974 Views
  • 2 replies
  • 0 kudos

%run ../Includes/Classroom-Setup-02.1

I ran the code in the cell as it was given in the presentation, but it failed. Can someone please help? The presentation is the second lesson in the second module of the Data Engineering Associate exam prep.

Latest Reply
Brichj
New Contributor II
  • 0 kudos

Thanks Ajay-Pandey! This is the error that I keep getting when I run the following: %run ./Includes/Classroom-Setup-02.3L
I have run dbutils.library.restartPython(), but it did not help.
Note: you may need to restart the kernel using dbutils.library.restart...

1 More Replies
MikeGo
by Valued Contributor
  • 3309 Views
  • 3 replies
  • 0 kudos

Inconsistent behavior when displaying chart in notebook

Hi, I'm trying to create some 3D charts. With the same code and the same cluster, the chart sometimes displays and sometimes does not. Previously it could not display, but last week I opened a notebook with a failed run and found the result can be shown by itself (as ...

Latest Reply
MikeGo
Valued Contributor
  • 0 kudos

Also, with the same code and the same browser but different workspaces, one works and the other does not. In the notebook with "script error", if I "Export cell", get its iframe HTML, and use displayHTML to display it, it works, so this means the JS and HTML inside is o...

2 More Replies