Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Darian
by New Contributor II
  • 1495 Views
  • 2 replies
  • 0 kudos

Delta Live table getting error of garbage collection after running few days

Hi, I am using Delta Live Tables in continuous mode for a real-time streaming data pipeline. After the pipeline has been running for 2-3 days, I get this garbage collection error: Driver/10.15.0.73 paused the JVM process 68 seconds during the past 120 se...

Latest Reply
Darian
New Contributor II
  • 0 kudos

Here are the metrics: The size/type: Thanks!

1 More Replies
al_joe
by Contributor
  • 11594 Views
  • 5 replies
  • 3 kudos

Resolved! Split a code cell at cursor position? Add a cell above/below?

In JupyterLab notebooks: in edit mode, you can press Ctrl+Shift+Minus to split the current cell into two at the cursor position; in command mode, you can press A or B to add a cell above or below the current cell. Are there equivalent shortcuts...

Latest Reply
DavidKxx
Contributor
  • 3 kudos

What's the status of the ctrl-alt-minus shortcut for splitting a cell?  That keyboard combination does absolutely nothing in my interface (running Databricks via Chrome on GCP).

4 More Replies
Lazloo
by New Contributor III
  • 16836 Views
  • 6 replies
  • 4 kudos

databricks-connect version 13: spark-class2.cmd not found

I installed the newest version, "databricks-connect==13.0.0", and now get this error:   Command C:\Users\Y\AppData\Local\pypoetry\Cache\virtualenvs\X-py3.9\Lib\site-packages\pyspark\bin\spark-class2.cmd"" not found   (could not be found).   Traceback...

Latest Reply
Susumu_Asaga
New Contributor II
  • 4 kudos

Use this code:
from databricks.connect import DatabricksSession
spark = DatabricksSession.builder.getOrCreate()

5 More Replies
Phani1
by Valued Contributor II
  • 1168 Views
  • 0 replies
  • 0 kudos

Databricks cell-level code parallel execution through the Python threading library

Hi Team, we are currently planning to implement Databricks cell-level parallel code execution through the Python threading library. We would like to understand how resources are consumed and allocated on the cluster. Are there any pot...
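
As a rough illustration of the pattern being asked about, the sketch below runs several independent tasks concurrently from a single cell using the standard library's `concurrent.futures.ThreadPoolExecutor`. The function `run_step`, the helper `run_in_parallel`, and the step names are hypothetical placeholders that do not come from the post; `time.sleep` stands in for whatever work each task would really do. All threads started this way live in the same Python process, so they share that process's memory and CPU.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import threading
import time


def run_step(name: str, seconds: float) -> str:
    """Hypothetical stand-in for one unit of work (e.g. one table load)."""
    time.sleep(seconds)  # simulates waiting on I/O or a remote job
    return f"{name} finished on {threading.current_thread().name}"


def run_in_parallel(steps: dict[str, float], max_workers: int = 4) -> dict[str, str]:
    """Run every step on a thread pool and collect results by step name."""
    results: dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(run_step, n, s): n for n, s in steps.items()}
        for future in as_completed(futures):
            # .result() re-raises any exception raised inside the thread
            results[futures[future]] = future.result()
    return results


if __name__ == "__main__":
    for name, message in run_in_parallel({"step_a": 0.2, "step_b": 0.2}).items():
        print(name, "->", message)
```

Capping `max_workers` is one simple way to limit how many tasks compete for resources at once; the right value depends on the workload and is not something this sketch can determine.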

jitesh
by New Contributor
  • 945 Views
  • 0 replies
  • 0 kudos

Code reusability for silver table transformations

How should Databricks notebooks be organized, and how many should be created, to populate multiple silver Delta tables that all have different and complex transformations? What's the best practice?
1. Create one notebook per silver table?
2. Push SQL transformation logic ...

Ruby8376
by Valued Contributor
  • 1373 Views
  • 1 replies
  • 0 kudos

Databricks sql warehouse has Serverless compute as a public preview.

Infosec has flagged a risk because processing happens in the control plane shared with other Azure clients. Is there any control to mitigate the risk?

Latest Reply
PL_db
Databricks Employee
  • 0 kudos

You can find more information on that topic here. "With Databricks, your serverless workloads are protected by multiple layers of security. These security layers form the foundation of Databricks’ commitment to providing a secure and reliable environ...

astrobil
by New Contributor II
  • 1019 Views
  • 1 replies
  • 0 kudos

Tab Stops Indenting in SQL Editor

I am utilizing Databricks via Azure, and I've been consistently experiencing an issue with the SQL Editor. The tab button, instead of indenting, redirects my cursor to seemingly random parts of the page. This problem has persisted since I began using...

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Which DBR version are you using? Which web browser are you using?

kartikmnc
by New Contributor
  • 1133 Views
  • 1 replies
  • 1 kudos

Regarding Exam got Suspended at middle without any reason.

Hi Team, my Databricks Certified Data Engineer Associate exam was suspended on 17th December and is still in an in-progress state. I was continuously in front of the camera when an alert suddenly appeared, and the support person asked me to show the desk and ...

Latest Reply
jose_gonzalez
Databricks Employee
  • 1 kudos

Adding @Retired_mod for visibility on this request

Hubert-Dudek
by Esteemed Contributor III
  • 996 Views
  • 1 replies
  • 1 kudos

How much USD are you spending on Databricks?

Join two system tables and get exactly how much USD you are spending. The short version of the query: SELECT u.usage_date, u.sku_name, SUM(u.usage_quantity * p.pricing.default) AS total_spent, p.currency_code FROM system.billing....
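
To make the join logic concrete, here is a small in-memory Python sketch of the same calculation: each usage row is matched to a price for its SKU, and quantity times price is summed per date and SKU. The rows, SKU names, and prices are invented for illustration, and the flat `price` field is an assumption standing in for `p.pricing.default`. This is not the author's full query, which is truncated above, and a real query would likely also need to pick the price that was in effect when the usage occurred.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical usage rows (stand-in for the usage side of the join).
usage = [
    {"usage_date": "2024-01-01", "sku_name": "SKU_A", "usage_quantity": Decimal("10")},
    {"usage_date": "2024-01-01", "sku_name": "SKU_A", "usage_quantity": Decimal("5")},
    {"usage_date": "2024-01-01", "sku_name": "SKU_B", "usage_quantity": Decimal("2")},
]

# Hypothetical price list keyed by SKU (stand-in for the pricing side).
prices = {
    "SKU_A": {"price": Decimal("0.50"), "currency_code": "USD"},
    "SKU_B": {"price": Decimal("1.25"), "currency_code": "USD"},
}


def total_spent(usage_rows, price_by_sku):
    """Inner-join usage to prices on sku_name, then SUM(quantity * price)."""
    totals = defaultdict(Decimal)
    for row in usage_rows:
        price = price_by_sku.get(row["sku_name"])
        if price is None:
            continue  # inner join: usage without a matching price is dropped
        key = (row["usage_date"], row["sku_name"], price["currency_code"])
        totals[key] += row["usage_quantity"] * price["price"]
    return dict(totals)


if __name__ == "__main__":
    for (day, sku, currency), amount in sorted(total_spent(usage, prices).items()):
        print(day, sku, amount, currency)
```

`Decimal` is used instead of `float` so that currency amounts do not pick up binary rounding error.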

Latest Reply
jose_gonzalez
Databricks Employee
  • 1 kudos

Thank you for sharing this information @Hubert-Dudek 

Fresher
by New Contributor II
  • 873 Views
  • 1 replies
  • 0 kudos

Query is taking too long to run

I have two clusters: cluster A (a Spark cluster) and cluster B (a SQL warehouse). Whenever I run a particular query on cluster B, it works fine, but when I run the same query on cluster A, it takes a long time and never shows the output.

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Check the physical query plan of the query you are running. Also, check the Spark UI to identify where the time is being spent and why.

shanebo425
by New Contributor III
  • 1145 Views
  • 1 replies
  • 0 kudos

Databricks OutOfMemory error on code that previously worked without issue

I have a notebook in Azure Databricks that does some transformations on a bronze tier table and inserts the transformed data into a silver tier table. This notebook is used to do an initial load of the data from our existing system into our new datal...

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Please compare the Spark UI for the old job execution with the new job execution. You might need to check whether the data volume has increased, which could be the reason for the OOM.

PrashantAghara
by New Contributor II
  • 1267 Views
  • 1 replies
  • 0 kudos

org.apache.spark.SparkException: Job aborted due to stage failure when writing to Cosmos

I am writing data to Cosmos DB using Python and Spark on Databricks. I am getting the error below: org.apache.spark.SparkException: Job aborted due to stage failure: Authorized committer (attemptNumber=0, stage=192, partition=105) failed; but task commit suc...

Latest Reply
PrashantAghara
New Contributor II
  • 0 kudos

The cluster configs are: Worker type and driver type: Standard_D16ads_v5. RUs for Cosmos: 1.5L

DC3
by New Contributor II
  • 2297 Views
  • 2 replies
  • 0 kudos

Unable to access unity catalog volume via /Volumes in notebook

I have set up a volume in Unity Catalog in the format catalog/schema/volume, and granted all permissions to all users on the catalog, schema, and volume. From the notebook I can see the /Volumes directory in the root of the file system but am unable to...

Latest Reply
DC3
New Contributor II
  • 0 kudos

Thanks for your comments. The problem turned out to be that the compute resource did not have Unity Catalog enabled.

1 More Replies
Sagas
by New Contributor II
  • 1010 Views
  • 1 replies
  • 0 kudos

SparkR or sparklyr not showing history

Hi, for some reason Azure Databricks doesn't show History if the data is saved with SparkR (2 in the figure below) or sparklyr (3), but it does show it with Data Ingestion (0) or with PySpark (1). Is this a known bug or am I doing something wrong? Is ...

Data Engineering
sparklyr
SparkR
patrickw
by New Contributor II
  • 7828 Views
  • 2 replies
  • 0 kudos

connect timed out error - Connecting to SQL Server from Databricks

I am getting a "connect timed out" error when attempting to access a SQL Server instance. I can successfully ping the server from Databricks. I have used the JDBC connection and the included sqlserver driver, and both result in the same error. I have also attemp...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Can you run the following command in a notebook, using the same cluster you are using to connect: %sh nc -vz <hostname> <port> This test will confirm whether we can communicate with the SQL Server over the port you are specifying for the connection. If...
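
If `nc` is not available, a similar reachability check can be sketched in Python with the standard `socket` module. This only approximates `nc -vz`: it attempts a TCP connection and reports whether it succeeded within a timeout. The function name `port_is_open` and the placeholder host and port are illustrative and do not come from the reply above.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


if __name__ == "__main__":
    # Placeholder values: replace with the SQL Server hostname and port.
    host, port = "localhost", 1433
    print(f"{host}:{port} reachable: {port_is_open(host, port)}")
```

A successful ping does not guarantee that this check passes, because ping uses ICMP while the database connection needs the specific TCP port to be reachable.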

1 More Replies
