Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

cgrant
by Databricks Employee
  • 12176 Views
  • 2 replies
  • 3 kudos

What is the difference between OPTIMIZE and Auto Optimize?

I see that Delta Lake has an OPTIMIZE command and also table properties for Auto Optimize. What are the differences between these and when should I use one over the other?

Latest Reply
brickster_2018
Databricks Employee
  • 3 kudos

From my Data+AI talk on Operating and Supporting Delta lake in production

1 More Replies
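To illustrate the distinction, here is a minimal sketch (the table name `events` and the Z-order column are placeholders): OPTIMIZE is a manual compaction command you run on demand, while Auto Optimize is enabled per table through properties and acts during writes.

```sql
-- Manual compaction, run on demand (optionally with Z-ordering):
OPTIMIZE events ZORDER BY (event_date);

-- Auto Optimize, enabled per table via table properties:
ALTER TABLE events SET TBLPROPERTIES (
  delta.autoOptimize.optimizeWrite = true,  -- write fewer, larger files
  delta.autoOptimize.autoCompact   = true   -- compact small files after writes
);
```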
brickster_2018
by Databricks Employee
  • 3801 Views
  • 1 reply
  • 0 kudos

Resolved! Unable to overwrite the schema of a Delta table

As per the docs, I can overwrite the schema of a Delta table using the "overwriteSchema" option, but I am unable to overwrite the schema for a Delta table.

Latest Reply
brickster_2018
Databricks Employee
  • 0 kudos

When Table ACLs are enabled, a write cannot change the schema of a table: a write requires only MODIFY permissions, while schema changes require OWN permissions. Hence overwriting the schema is not supported when Table ACLs are enabled for the D...

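For reference, the option from the docs is used like this; a minimal PySpark sketch, assuming a Databricks notebook with an existing SparkSession and DataFrame, with a placeholder table path. It only succeeds when the caller holds the required permissions (OWN when Table ACLs are enabled).

```python
# Sketch: assumes an existing SparkSession and a DataFrame `df` on Databricks;
# the table path is a placeholder.
# overwriteSchema replaces the table's schema along with its data,
# which requires OWN permissions when Table ACLs are enabled.
(df.write
   .format("delta")
   .mode("overwrite")
   .option("overwriteSchema", "true")
   .save("/mnt/delta/my_table"))
```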
brickster_2018
by Databricks Employee
  • 6221 Views
  • 1 reply
  • 0 kudos
Latest Reply
brickster_2018
Databricks Employee
  • 0 kudos

The below code can be used to get the number of records in a Delta table without querying it:
%scala
import com.databricks.sql.transaction.tahoe.DeltaLog
import org.apache.hadoop.fs.Path
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql...

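The idea behind the Scala snippet above is that the Delta transaction log already records per-file row counts, so no table scan is needed. As a rough, self-contained illustration of that idea (not the DeltaLog API itself), the JSON commit files under `_delta_log` can be summed directly; this simplified sketch ignores checkpoints and `remove` actions, which a real implementation must handle:

```python
import json
from pathlib import Path

def delta_row_count(delta_log_dir: str) -> int:
    """Sum `numRecords` from the `add` actions in Delta commit files.

    Simplified sketch: reads only the JSON commit files and ignores
    checkpoints and `remove` actions.
    """
    total = 0
    for commit in sorted(Path(delta_log_dir).glob("*.json")):
        for line in commit.read_text().splitlines():
            action = json.loads(line)
            add = action.get("add")
            if add and "stats" in add:
                # `stats` is itself a JSON string containing numRecords.
                total += json.loads(add["stats"]).get("numRecords", 0)
    return total
```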
brickster_2018
by Databricks Employee
  • 1825 Views
  • 1 reply
  • 1 kudos

Resolved! Cluster logs missing

On the Databricks cluster UI, when I click on the Driver logs, sometimes I see historic logs and sometimes I see logs for the last few hours. Why do we see this inconsistency?

Latest Reply
brickster_2018
Databricks Employee
  • 1 kudos

This is working as designed and is the expected behavior. When the cluster is in a terminated state, the logs are served by the Spark History Server hosted on the Databricks control plane. When the cluster is up and running, the logs are served by ...

User16790091296
by Contributor II
  • 2536 Views
  • 2 replies
  • 1 kudos

Database within a Database in Databricks

Is it possible to have a folder or database within a database in Azure Databricks? I know you can use "create database if not exists xxx" to get a database, but I want to have folders within that database where I can put tables.

Latest Reply
brickster_2018
Databricks Employee
  • 1 kudos

The default location of a database will be /user/hive/warehouse/<databasename.db>. Irrespective of the location of the database, the tables in the database can have different locations, which can be specified at the time of creation. Databas...

1 More Replies
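A sketch of the pattern described above (database name, table name, and paths are placeholders): the database gets one location, and individual tables can still point elsewhere.

```sql
-- Database with an explicit location (otherwise /user/hive/warehouse/<name>.db):
CREATE DATABASE IF NOT EXISTS mydb LOCATION '/mnt/data/mydb';

-- A table in that database can live under a different path:
CREATE TABLE mydb.events (id INT, ts TIMESTAMP)
USING DELTA
LOCATION '/mnt/other/events';
```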
User16790091296
by Contributor II
  • 961 Views
  • 1 reply
  • 0 kudos

How do we get logs on read queries from delta lake in Databricks?

I've tried with:
df.write.mode("overwrite").format("com.databricks.spark.csv").option("header","true").csv(dstPath)
and
df.write.format("csv").mode("overwrite").save(dstPath)
but now I have 10 CSV files, and I need a single file with a specific name.

Latest Reply
Ryan_Chynoweth
Esteemed Contributor
  • 0 kudos

The header question seems different than your body question. I am assuming that you are asking how to get only a single CSV file when writing? To do so you should use coalesce:
df.coalesce(1).write.format("csv").mode("overwrite").save(dstPath)
This...

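To also control the output file name (Spark always writes a directory of part files), one common workaround is to coalesce to a single partition and then move the part file; a sketch assuming a Databricks notebook where `dbutils` is available, with placeholder paths:

```python
# Sketch: assumes a Databricks notebook (spark, dbutils); paths are placeholders.
dst_dir = "/mnt/output/tmp_csv"
df.coalesce(1).write.format("csv").mode("overwrite").option("header", "true").save(dst_dir)

# Find the single part file Spark produced and move it to the desired name.
part = [f.path for f in dbutils.fs.ls(dst_dir) if f.name.startswith("part-")][0]
dbutils.fs.mv(part, "/mnt/output/result.csv")
```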
brickster_2018
by Databricks Employee
  • 2090 Views
  • 1 reply
  • 0 kudos

Resolved! Is it recommended to turn on Spark speculative execution permanently

I had a job where the last step would get stuck forever. Turning on Spark speculative execution did magic and resolved the issue. Is it safe to turn on Spark speculative execution permanently?

Latest Reply
brickster_2018
Databricks Employee
  • 0 kudos

It's not recommended to turn on Spark speculative execution permanently. For jobs where tasks are running slow or are stuck because of transient network or storage issues, speculative execution can be very handy. However, it suppresses the actual problem...

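If speculative execution is needed for a specific job, it can be scoped to that job's cluster configuration rather than enabled globally; a sketch of the relevant Spark configuration keys (the tuning values shown are Spark's defaults):

```
spark.speculation true
spark.speculation.interval 100ms    # how often to check for slow tasks
spark.speculation.multiplier 1.5    # how much slower than the median before speculating
spark.speculation.quantile 0.75     # fraction of tasks that must finish first
```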
User16790091296
by Contributor II
  • 2089 Views
  • 1 reply
  • 0 kudos

If possible, how can I update R Version on Azure Databricks?

Azure Databricks currently runs R version 3.4.4 (2018-03-15), which is unacceptable in my opinion since the latest R version on CRAN is 3.5.2 (2018-12-20). My question is: Is it possible for me to upgrade and install R version 3.5.2 on Azure Databrick...

Latest Reply
User16752239289
Databricks Employee
  • 0 kudos

You can change the R version by following this document: https://docs.microsoft.com/en-us/azure/databricks/kb/r/change-r-version
The R version that comes with each DBR (Databricks Runtime) can be found in the release notes: https://docs.microsoft.com/e...

brickster_2018
by Databricks Employee
  • 1469 Views
  • 1 reply
  • 0 kudos
Latest Reply
brickster_2018
Databricks Employee
  • 0 kudos

It's the use case that decides between a Shallow Clone and a Deep Clone. Data is physically copied to the clone table in the case of a Deep Clone. A deep clone is very useful for copying the data and keeping a backup of the data in another region/env. The typ...

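The two variants in SQL (table names are placeholders): SHALLOW CLONE copies only metadata and keeps referencing the source table's data files, while DEEP CLONE also copies the data into an independent table.

```sql
-- Metadata-only copy; data files still belong to the source table:
CREATE TABLE dev.events_test SHALLOW CLONE prod.events;

-- Full, independent copy of metadata and data (e.g. for a backup):
CREATE TABLE backup.events DEEP CLONE prod.events;
```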
User16790091296
by Contributor II
  • 1056 Views
  • 0 replies
  • 0 kudos

SAS Files in Databricks (Stack Overflow)

I am trying to convert SAS files to CSV in Azure Databricks. The SAS files are in Azure Blob. I am able to mount the Azure Blob in Databricks successfully, but when I read from it, it shows no files even though there are files on Blob. Has anyone done this...

  • 1056 Views
  • 0 replies
  • 0 kudos
User16790091296
by Contributor II
  • 1133 Views
  • 1 reply
  • 1 kudos
Latest Reply
Ryan_Chynoweth
Esteemed Contributor
  • 1 kudos

The open source Spark connector for Snowflake is available by default in the Databricks runtime. To connect you can use the following code:
# Use secrets DBUtil to get Snowflake credentials.
user = dbutils.secrets.get("<scope>", "<secret key>")
passw...

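For reference, a sketch of how the connector is typically configured once the credentials are read from secrets; it assumes a Databricks notebook (`spark`, `dbutils`), and all bracketed values are placeholders for your Snowflake account:

```python
# Sketch: assumes a Databricks notebook (spark, dbutils); values are placeholders.
user = dbutils.secrets.get("<scope>", "<user-key>")
password = dbutils.secrets.get("<scope>", "<password-key>")

options = {
    "sfUrl": "<account>.snowflakecomputing.com",
    "sfUser": user,
    "sfPassword": password,
    "sfDatabase": "<database>",
    "sfSchema": "<schema>",
    "sfWarehouse": "<warehouse>",
}

df = (spark.read
      .format("snowflake")
      .options(**options)
      .option("dbtable", "<table>")
      .load())
```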
User16790091296
by Contributor II
  • 1528 Views
  • 1 reply
  • 0 kudos

How to Prevent Duplicate Entries to enter to delta lake of Azure Storage?

I have a DataFrame stored in Delta format in ADLS. When I try to append new updated rows to that Delta table, is there any way to delete the old existing record in Delta and add the new updated record? There is a uni...

Latest Reply
Ryan_Chynoweth
Esteemed Contributor
  • 0 kudos

To achieve this you should use a merge command that updates existing rows matched on the unique ID. This will update the rows that already exist and insert the rows that do not. If you want to do it manually, you could delete rows using the DE...

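A sketch of the merge described above, assuming the unique column is called `id` and the tables are named `target` and `updates` (all placeholders):

```sql
MERGE INTO target t
USING updates u
ON t.id = u.id                  -- the unique key
WHEN MATCHED THEN UPDATE SET *  -- overwrite the existing row
WHEN NOT MATCHED THEN INSERT *  -- add genuinely new rows
```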
User16790091296
by Contributor II
  • 929 Views
  • 0 replies
  • 1 kudos

How to get access to Databricks SQL analytics?

I am trying to do this tutorial about Databricks SQL analytics (https://docs.microsoft.com/en-us/azure/databricks/sql/get-started/admin-quickstart) but when I create my Databricks workspace I do not have the icon at the bottom of the sidebar to access ...
