Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

Phani1
by Valued Contributor II
  • 1151 Views
  • 0 replies
  • 0 kudos

Databricks cell-level code parallel execution through the Python threading library

Hi Team, we are currently planning to implement Databricks cell-level code parallel execution through the Python threading library. We would like to understand how resources are consumed and allocated from the cluster. Are there any pot...
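For readers following this thread, here is a minimal sketch of the general pattern the question describes: running several independent steps concurrently from one Python process with the standard library. The names `run_step` and `run_in_parallel` are hypothetical, and `time.sleep` stands in for whatever blocking work a cell would do. It is not taken from the post, and it says nothing definitive about how a cluster allocates resources; all threads here live in the single Python process that starts them.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_step(name: str, seconds: float) -> str:
    """Hypothetical stand-in for the work done by one notebook cell."""
    time.sleep(seconds)  # placeholder for a blocking call (assumption, not from the post)
    return f"{name} finished on {threading.current_thread().name}"


def run_in_parallel(steps: dict[str, float], max_workers: int = 4) -> dict[str, str]:
    """Run every step in its own worker thread and collect results by step name."""
    results: dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each submitted future back to the step name it belongs to.
        futures = {pool.submit(run_step, name, secs): name for name, secs in steps.items()}
        for future in as_completed(futures):
            # future.result() re-raises any exception thrown inside the thread.
            results[futures[future]] = future.result()
    return results


if __name__ == "__main__":
    for name, message in run_in_parallel({"cell_a": 0.2, "cell_b": 0.2, "cell_c": 0.2}).items():
        print(name, "->", message)
```

Capping `max_workers` is the main design choice in this sketch: it bounds how many steps are in flight at once, which is one simple way to keep concurrent work from competing for the same resources.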

jitesh
by New Contributor
  • 935 Views
  • 0 replies
  • 0 kudos

Code reusability for silver table transformations

How should Databricks notebooks be organized, and how many should be created, to populate multiple silver Delta tables that all have different and complex transformations? What's the best practice: 1. Create a notebook for each silver table? 2. Push SQL transformation logic ...

Ruby8376
by Valued Contributor
  • 1331 Views
  • 1 replies
  • 0 kudos

Databricks sql warehouse has Serverless compute as a public preview.

There is a risk from infosec, as it is processed in the control plane shared with other Azure clients. Is there any control to mitigate the risk?

Latest Reply
PL_db
Databricks Employee
  • 0 kudos

You can find more information on that topic here. "With Databricks, your serverless workloads are protected by multiple layers of security. These security layers form the foundation of Databricks’ commitment to providing a secure and reliable environ...

astrobil
by New Contributor II
  • 1002 Views
  • 1 replies
  • 0 kudos

Tab Stops Indenting in SQL Editor

I am utilizing Databricks via Azure, and I've been consistently experiencing an issue with the SQL Editor. The tab button, instead of indenting, redirects my cursor to seemingly random parts of the page. This problem has persisted since I began using...

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Which DBR version are you using? Which web browser are you using?

kartikmnc
by New Contributor
  • 1119 Views
  • 1 replies
  • 1 kudos

Regarding Exam got Suspended at middle without any reason.

Hi Team, my Databricks Certified Data Engineer Associate exam was suspended on 17th December and is still in an in-progress state. I was continuously in front of the camera when an alert suddenly appeared, and the support person asked me to show the desk and ...

Latest Reply
jose_gonzalez
Databricks Employee
  • 1 kudos

Adding @Retired_mod for visibility on this request

Hubert-Dudek
by Esteemed Contributor III
  • 986 Views
  • 1 replies
  • 1 kudos

How much USD are you spending on Databricks?

Join two system tables and get exactly how much USD you are spending. The short version of the query: SELECT u.usage_date, u.sku_name, SUM(u.usage_quantity * p.pricing.default) AS total_spent, p.currency_code FROM system.billing....
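The visible part of the query multiplies a usage quantity by a unit price and sums the result per usage date and SKU. The sketch below reproduces only that arithmetic in plain Python. The rows, SKU names, and prices are made up for illustration, `total_spent` is a hypothetical helper, and nothing here models the part of the query that is cut off in the excerpt.

```python
from collections import defaultdict
from decimal import Decimal

# Made-up rows for illustration only: (usage_date, sku_name, usage_quantity).
usage_rows = [
    ("2024-01-01", "SKU_A", Decimal("10")),
    ("2024-01-01", "SKU_A", Decimal("5")),
    ("2024-01-02", "SKU_B", Decimal("2")),
]

# Made-up unit prices per SKU, standing in for the price column in the query.
price_per_unit = {"SKU_A": Decimal("0.50"), "SKU_B": Decimal("1.25")}


def total_spent(rows, prices):
    """Sum quantity * unit price for each (usage_date, sku_name) pair."""
    totals = defaultdict(Decimal)  # Decimal() is 0, so sums start from zero
    for usage_date, sku_name, quantity in rows:
        totals[(usage_date, sku_name)] += quantity * prices[sku_name]
    return dict(totals)


if __name__ == "__main__":
    for (usage_date, sku_name), amount in sorted(total_spent(usage_rows, price_per_unit).items()):
        print(usage_date, sku_name, amount)
```

`Decimal` is used instead of `float` so that currency amounts are not affected by binary rounding.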

Latest Reply
jose_gonzalez
Databricks Employee
  • 1 kudos

Thank you for sharing this information @Hubert-Dudek 

Fresher
by New Contributor II
  • 857 Views
  • 1 replies
  • 0 kudos

Query is taking too long to run

I have two clusters: cluster A (a Spark cluster) and cluster B (a SQL warehouse). Whenever I run a particular query on cluster B, it works fine, but whenever I run the same query on cluster A, it takes a long time and never shows the output.

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Check the physical query plan of the query you are running. Also, check the Spark UI to identify where it is taking time and why.

shanebo425
by New Contributor III
  • 1129 Views
  • 1 replies
  • 0 kudos

Databricks OutOfMemory error on code that previously worked without issue

I have a notebook in Azure Databricks that does some transformations on a bronze tier table and inserts the transformed data into a silver tier table. This notebook is used to do an initial load of the data from our existing system into our new datal...

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

Please compare the Spark UI from the old job execution with the new job execution. You might need to check whether the data volume has increased, as that could be the reason for the OOM.

PrashantAghara
by New Contributor II
  • 1243 Views
  • 1 replies
  • 0 kudos

org.apache.spark.SparkException: Job aborted due to stage failure when writing to Cosmos

I am writing data to Cosmos DB using Python and Spark on Databricks. I am getting the error below: org.apache.spark.SparkException: Job aborted due to stage failure: Authorized committer (attemptNumber=0, stage=192, partition=105) failed; but task commit suc...

Latest Reply
PrashantAghara
New Contributor II
  • 0 kudos

The cluster configs are:
Worker type and driver type: Standard_D16ads_v5
RUs for Cosmos: 1.5L

DC3
by New Contributor II
  • 2247 Views
  • 2 replies
  • 0 kudos

Unable to access unity catalog volume via /Volumes in notebook

I have set up a volume in Unity Catalog in the format catalog/schema/volume, and granted all permissions to all users on the catalog, schema and volume. From the notebook I can see the /Volumes directory in the root of the file system but am unable to...

Latest Reply
DC3
New Contributor II
  • 0 kudos

Thanks for your comments. The problem turned out to be that the compute resource did not have Unity Catalog enabled.

Sagas
by New Contributor II
  • 987 Views
  • 1 replies
  • 0 kudos

SparkR or sparklyr not showing history

Hi, for some reason Azure Databricks doesn't show History if the data is saved with SparkR (2 in the figure below) or sparklyr (3), but it does show it with Data Ingestion (0) or with PySpark (1). Is this a known bug or am I doing something wrong? Is ...

Databricks_history.PNG SparkR.PNG Sparklyr.PNG
Data Engineering
sparklyr
SparkR
patrickw
by New Contributor II
  • 7756 Views
  • 2 replies
  • 0 kudos

connect timed out error - Connecting to SQL Server from Databricks

I am getting a connect timed out error when attempting to access a SQL Server. I can successfully ping the server from Databricks. I have used the JDBC connection and the included sqlserver driver, and both result in the same error. I have also attemp...

Latest Reply
Walter_C
Databricks Employee
  • 0 kudos

Can you run the following command in a notebook using the same cluster you are using to connect:
%sh nc -vz <hostname> <port>
This test will confirm whether we are able to communicate with the SQL Server using the port you are defining to connect. If...
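If `nc` is not available on the cluster, a roughly equivalent check can be written with Python's standard `socket` module. This is a minimal sketch, not part of the reply above: `can_connect` is a hypothetical helper, and the host name and port in the usage example are placeholders to replace with your own values. It only tests whether a TCP connection opens; it does not validate credentials or the JDBC driver.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host name and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


if __name__ == "__main__":
    # Placeholder values (assumptions): substitute your server's host name and port.
    host, port = "sqlserver.example.com", 1433
    status = "reachable" if can_connect(host, port) else "not reachable"
    print(f"{host}:{port} is {status}")
```

A successful ping with a failing port check usually points at the port itself (firewall rule, network security group, or the server not listening there) rather than at name resolution or general reachability.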

NOOR_BASHASHAIK
by Contributor
  • 2527 Views
  • 2 replies
  • 1 kudos

Machine Type for VACUUM operation

Dear all, I have a workflow with 2 tasks: one that does OPTIMIZE, followed by one that does VACUUM. I used a cluster with an F32s driver and 8 F64s workers (auto-scaling enabled). All 8 workers are launched by Databricks as soon as OPTIMIZE starts. As ...

NOOR_BASHASHAIK_0-1710268182562.png
Data Engineering
best practice
F series
optimize
vacuum
Latest Reply
ArturOA
New Contributor III
  • 1 kudos

Hi, were you able to get any useful help on this?

PrebenOlsen
by New Contributor III
  • 1336 Views
  • 2 replies
  • 0 kudos

How to migrate Git repos with DLT configurations

Hi! I want to migrate all my Databricks-related code from one GitHub repo to another. I knew this wouldn't be straightforward. When I copy my code for one DLT, I get the error org.apache.spark.sql.catalyst.ExtendedAnalysisException: Table 'vessel_batt...

Latest Reply
PrebenOlsen
New Contributor III
  • 0 kudos

Does cloning take considerably less time than recreating the tables? Can I resume append operations to a cloned table?

