cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Forum Posts

Phani1
by Valued Contributor II
  • 963 Views
  • 0 replies
  • 0 kudos

Databricks cell-level code parallel execution through the Python threading library

Hi Team,We are currently planning to  implement Databricks cell-level code parallel execution through the Python threading library. We are interested in comprehending the resource consumption and allocation process from the cluster. Are there any pot...

  • 963 Views
  • 0 replies
  • 0 kudos
jitesh
by New Contributor
  • 736 Views
  • 0 replies
  • 0 kudos

Code reusability for silver table transformations

How/how many databricks notebooks should be created to populate multiple silver delta tables, all having different and complex transformations ? What's the best practice -1. create a notebook each for a silver table ?2. push SQL transformation logic ...

  • 736 Views
  • 0 replies
  • 0 kudos
Ruby8376
by Valued Contributor
  • 1023 Views
  • 1 replies
  • 0 kudos

Databricks sql warehouse has Serverless compute as a public preview.

There is a risk form infosec as it is processed in the control plane shared with other azure clients. s there any control to mitigate the risk?

  • 1023 Views
  • 1 replies
  • 0 kudos
Latest Reply
PL_db
Databricks Employee
  • 0 kudos

You can find more information on that topic here. "With Databricks, your serverless workloads are protected by multiple layers of security. These security layers form the foundation of Databricks’ commitment to providing a secure and reliable environ...

  • 0 kudos
astrobil
by New Contributor II
  • 855 Views
  • 1 replies
  • 0 kudos

Tab Stops Indenting in SQL Editor

I am utilizing Databricks via Azure, and I've been consistently experiencing an issue with the SQL Editor. The tab button, instead of indenting, redirects my cursor to seemingly random parts of the page. This problem has persisted since I began using...

  • 855 Views
  • 1 replies
  • 0 kudos
Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

which DBR version are you using? which web browser are you using?

  • 0 kudos
kartikmnc
by New Contributor
  • 994 Views
  • 1 replies
  • 1 kudos

Regarding Exam got Suspended at middle without any reason.

Hi Team,My Databricks Certified Data Engineer Associate exam got suspended on 17th December and it is in progress state.I was there continuously in front of the camera and suddenly the alert appeared, and support person asked me to show the desk and ...

  • 994 Views
  • 1 replies
  • 1 kudos
Latest Reply
jose_gonzalez
Databricks Employee
  • 1 kudos

Adding @Retired_mod for visibility on this request

  • 1 kudos
Hubert-Dudek
by Esteemed Contributor III
  • 851 Views
  • 1 replies
  • 1 kudos

How much USD are you spending on Databricks?

Join two system tables and get exactly how much USD you are spending.The short version of the query: SELECT u.usage_date, u.sku_name, SUM(u.usage_quantity * p.pricing.default) AS total_spent, p.currency_code FROM system.billing....

system_pig.png
  • 851 Views
  • 1 replies
  • 1 kudos
Latest Reply
jose_gonzalez
Databricks Employee
  • 1 kudos

Thank you for sharing this information @Hubert-Dudek 

  • 1 kudos
Fresher
by New Contributor II
  • 721 Views
  • 1 replies
  • 0 kudos

Query is taking too long to run

I have two clusters. Cluster A(spark cluster) and cluster B(SQL warehouse). whenever I try to run a particular query using cluster B, it works fine but whenever I try to run same query using cluster A. It's taking time and never show the output

  • 721 Views
  • 1 replies
  • 0 kudos
Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Check the physical query plan of the query you are running. Also, check the Spark UI to identify where is taking time and why.

  • 0 kudos
shanebo425
by New Contributor III
  • 836 Views
  • 1 replies
  • 0 kudos

Databricks OutOfMemory error on code that previously worked without issue

I have a notebook in Azure Databricks that does some transformations on a bronze tier table and inserts the transformed data into a silver tier table. This notebook is used to do an initial load of the data from our existing system into our new datal...

  • 836 Views
  • 1 replies
  • 0 kudos
Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Please review your Spark UI from the old job execution versus the new job execution. You might need to check if the data volume has increase and that could be the reason of the OOM

  • 0 kudos
PrashantAghara
by New Contributor II
  • 1067 Views
  • 1 replies
  • 0 kudos

org.apache.spark.SparkException: Job aborted due to stage failure when writing to Cosmos

I am writing data to cosmos DB using Python & Spark on DatabricksI am getting below error :org.apache.spark.SparkException: Job aborted due to stage failure: Authorized committer (attemptNumber=0, stage=192, partition=105) failed; but task commit suc...

  • 1067 Views
  • 1 replies
  • 0 kudos
Latest Reply
PrashantAghara
New Contributor II
  • 0 kudos

The configs are for cluster:Worker Type & Driver type : Standard_D16ads_v5RUs for Cosmos : 1.5L

  • 0 kudos
DC3
by New Contributor II
  • 1706 Views
  • 2 replies
  • 0 kudos

Unable to access unity catalog volume via /Volumes in notebook

I have set up a volume in unity catalog in the format catalog/schema/volume, and granted all permissions to all users on the catalog, schema and volume.From the notebook I can see the /Volumes directory in the root of the file system but am unable to...

  • 1706 Views
  • 2 replies
  • 0 kudos
Latest Reply
DC3
New Contributor II
  • 0 kudos

Thanks for your comments. The problem turned out to be the compute resource not having unity catalog enabled.

  • 0 kudos
1 More Replies
Sagas
by New Contributor II
  • 847 Views
  • 1 replies
  • 0 kudos

SparkR or sparklyr not showing history

Hi,for some reason Azure Databricks doesn't show History if the data is saved with SparkR (2 in the figure below) or Sparklyr (3), but it does show it with Data Ingestion (0) or with PySpark (1). Is this a known bug or am I doing something wrong? Is ...

Databricks_history.PNG SparkR.PNG Sparklyr.PNG
Data Engineering
sparklyr
SparkR
  • 847 Views
  • 1 replies
  • 0 kudos
patrickw
by New Contributor II
  • 6129 Views
  • 2 replies
  • 0 kudos

connect timed out error - Connecting to SQL Server from Databricks

I am getting a connect timed out error when attempting to access a sql server. I can successfully ping the server from Databricks. I have used the jdbc connection and the sqlserver included driver and both result in the same error. I have also attemp...

  • 6129 Views
  • 2 replies
  • 0 kudos
Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Can you run the following command in a notebook using the same cluster you are using to connect:%sh nc -vz <hostname> <port> This test will confirm us if we are able to communicate with the SQL server by using the port you are defining to connect. If...

  • 0 kudos
1 More Replies
madrhr
by New Contributor III
  • 2631 Views
  • 3 replies
  • 3 kudos

Resolved! SparkContext lost when running %sh script.py

I need to execute a .py file in Databricks from a notebook (with arguments which for simplicity i exclude here). For this i am using:%sh script.pyscript.py:from pyspark import SparkContext def main(): sc = SparkContext.getOrCreate() print(sc...

Data Engineering
%sh
.py
bash shell
SparkContext
SparkShell
  • 2631 Views
  • 3 replies
  • 3 kudos
Latest Reply
madrhr
New Contributor III
  • 3 kudos

I got it eventually working with a combination of:from databricks.sdk.runtime import *spark.sparkContext.addPyFile("/path/to/your/file")sys.path.append("path/to/your")   

  • 3 kudos
2 More Replies
NOOR_BASHASHAIK
by Contributor
  • 2176 Views
  • 2 replies
  • 1 kudos

Machine Type for VACUUM operation

Dear allI have a workflow with 2 tasks : one that does OPTIMIZE, followed by one that does VACUUM. I used a cluster with F32s driver and F64s - 8 workers (auto-scaling enabled). All 8 workers are launched by Databricks as soon as OPTIMIZE starts. As ...

NOOR_BASHASHAIK_0-1710268182562.png
Data Engineering
best practice
F series
optimize
vacuum
  • 2176 Views
  • 2 replies
  • 1 kudos
Latest Reply
ArturOA
New Contributor III
  • 1 kudos

Hi,were you able to get any useful help on this?

  • 1 kudos
1 More Replies

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group
Labels