- 453 Views
- 1 replies
- 0 kudos
I am trying to create a Delta table, and it seems that Delta requires additional permissions on the parent folder of the table. The command fails with permission errors. Creating a Parquet table works fine.
Latest Reply
Delta is not a Hive-compatible format, so the client must also have permissions on the path to the database's location in order to create a new temporary "directory" there. This comes from Spark SQL's handling of external tabl...
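The permission requirement can be reproduced with an explicit external location. A minimal sketch of building the statement (the `events` table name and `/mnt/datalake/events` path are hypothetical, and the schema is illustrative):

```python
def create_delta_table_sql(table, location):
    # Build a CREATE TABLE statement for an external Delta table.
    # Note: the parent folder of `location` is where Spark SQL may create
    # temporary directories, so the cluster's credentials need write
    # access there as well, not just on the table path itself.
    return (
        f"CREATE TABLE {table} (id BIGINT, value STRING) "
        f"USING DELTA LOCATION '{location}'"
    )

stmt = create_delta_table_sql("events", "/mnt/datalake/events")
# On a cluster you would run: spark.sql(stmt)
```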
- 918 Views
- 1 replies
- 0 kudos
We are using the internal metastore implementation, i.e., the metastore is hosted on the Databricks side. However, we believe the metastore instance made available to my workspace is not adequate to handle the load. How can I monitor the number of...
Latest Reply
Use the code snippet below from a notebook:

%scala
import java.sql.Connection
import java.sql.DriverManager
import java.sql.ResultSet
import java.sql.SQLException

/**
 * For details on what this query means, check out https://dev.mysql.com/doc/refma...
- 730 Views
- 1 replies
- 1 kudos
My company uses Immuta for data governance. Will Databricks be able to fit into our existing security patterns?
Latest Reply
Yes, check out the Immuta web page on the Databricks integration: https://www.immuta.com/integrations/databricks
- 2060 Views
- 1 replies
- 0 kudos
Is there a way to compute the cost associated with every SQL Analytics query?
Latest Reply
Right now, we do not have an option to measure the compute cost at a query level.
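A rough workaround is to approximate per-query cost from the endpoint's DBU consumption rate and the query's wall-clock time. A sketch, with placeholder rates (the DBU rate and price below are hypothetical, not actual Databricks prices):

```python
def estimate_query_cost(duration_seconds, dbu_per_hour, usd_per_dbu):
    # Pro-rate the endpoint's hourly DBU consumption over the query's runtime.
    hours = duration_seconds / 3600.0
    return hours * dbu_per_hour * usd_per_dbu

# Example: a 90-second query on an endpoint consuming 12 DBU/hour
# at an assumed $0.22 per DBU.
cost = estimate_query_cost(90, 12, 0.22)
```

This is only a coarse upper bound: a warehouse typically serves many concurrent queries, so attributing its full hourly rate to one query overstates that query's share.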
- 224 Views
- 0 replies
- 0 kudos
Best practices: Cluster configuration | Databricks on AWS
Learn best practices when creating and configuring Databricks clusters.
https://docs.databricks.com/clusters/cluster-config-best-practices.html
- 242 Views
- 0 replies
- 0 kudos
Best practices | Databricks on Google Cloud
Learn best practices when using or administering Databricks.
https://docs.gcp.databricks.com/best-practices-index.html
- 220 Views
- 0 replies
- 0 kudos
Best practices - Azure Databricks
Learn best practices when using or administering Azure Databricks.
https://docs.microsoft.com/en-us/azure/databricks/best-practices-index
- 274 Views
- 0 replies
- 0 kudos
Best practices | Databricks on AWS
Learn best practices when using or administering Databricks.
https://docs.databricks.com/best-practices-index.html
- 1886 Views
- 1 replies
- 0 kudos
I am seeing that with new commits the old checkpoints are getting removed, and I can time travel only to the last 10 versions. Is there any way I can prevent the Delta checkpoints from being removed? I'm using Azure Databricks 7.3 LTS ML.
Latest Reply
If you want to keep your checkpoints for X days, you can set delta.checkpointRetentionDuration to X days this way:

spark.sql(f"""
ALTER TABLE delta.`path`
SET TBLPROPERTIES (
  delta.checkpointRetentionDuration = 'X days'
)
""")
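For time travel specifically, delta.logRetentionDuration (how long commit history is kept) matters alongside the checkpoint retention. A sketch that builds the ALTER TABLE statement setting both properties (the `/mnt/tables/sales` path is hypothetical):

```python
def retention_sql(path, days):
    # delta.logRetentionDuration governs how long commit history (and thus
    # time travel) is kept; delta.checkpointRetentionDuration governs the
    # checkpoint files themselves.
    return (
        f"ALTER TABLE delta.`{path}` SET TBLPROPERTIES ("
        f"delta.logRetentionDuration = 'interval {days} days', "
        f"delta.checkpointRetentionDuration = 'interval {days} days')"
    )

stmt = retention_sql("/mnt/tables/sales", 30)
# On a cluster: spark.sql(stmt)
```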
- 716 Views
- 1 replies
- 0 kudos
My VACUUM command is stuck. I am not sure if it's deleting any files.
Latest Reply
There is no direct way to track the progress of the VACUUM command. One easy workaround is to run a DRY RUN from another notebook, which gives a rough estimate of the files to be deleted at that point in time...
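The DRY RUN workaround can be sketched as a small helper that builds the statement (the table path is hypothetical; 168 hours is the default 7-day retention):

```python
def vacuum_dry_run_sql(path, retain_hours=168):
    # DRY RUN lists the files VACUUM would delete without removing them,
    # giving a rough estimate of how much work a running VACUUM has left.
    return f"VACUUM delta.`{path}` RETAIN {retain_hours} HOURS DRY RUN"

stmt = vacuum_dry_run_sql("/mnt/tables/sales")
# On a cluster: spark.sql(stmt).show(truncate=False)
```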
- 1489 Views
- 1 replies
- 0 kudos
I have a directory where I get files with the same name multiple times. Will Auto Loader process all the files, or will it process the first and ignore the rest?
Latest Reply
Auto Loader has an option, "cloudFiles.allowOverwrites", which determines whether input directory file changes are allowed to overwrite existing data. This option is available in Databricks Runtime 7.6 and above.
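A sketch of wiring that option into an Auto Loader stream (the format and input path are hypothetical; the dict-building part runs anywhere, the readStream line needs a cluster):

```python
def autoloader_options(fmt, allow_overwrites=True):
    # cloudFiles.allowOverwrites (DBR 7.6+) makes Auto Loader reprocess a
    # file when it reappears or is overwritten with the same name in the
    # input directory; by default the first version wins.
    return {
        "cloudFiles.format": fmt,
        "cloudFiles.allowOverwrites": str(allow_overwrites).lower(),
    }

opts = autoloader_options("json")
# On a cluster:
# df = spark.readStream.format("cloudFiles").options(**opts).load("/mnt/input")
```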
- 1605 Views
- 1 replies
- 0 kudos
I cannot find how to truncate a table using a PySpark or Python command. I need to truncate a Delta table using Python.
Latest Reply
Not everything is exposed as a function for Python or Java/Scala. Some operations are SQL-only, like spark.sql("TRUNCATE TABLE delta.`<path>`")
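The SQL-only route can be wrapped in a small Python helper; as an alternative, the Delta Lake Python API's DeltaTable.delete() with no condition also removes all rows (while keeping table history). A sketch (the staging path is hypothetical):

```python
def truncate_delta(path, spark=None):
    # SQL-only route: TRUNCATE TABLE on a path-based Delta table.
    stmt = f"TRUNCATE TABLE delta.`{path}`"
    if spark is not None:
        spark.sql(stmt)
    return stmt

# Alternative on a cluster, via the Delta Lake Python API:
# from delta.tables import DeltaTable
# DeltaTable.forPath(spark, path).delete()  # no condition deletes all rows

stmt = truncate_delta("/mnt/tables/staging")
```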
- 502 Views
- 0 replies
- 0 kudos
If we want to read from a KMS-encrypted S3 bucket but write out unencrypted, do we use the global init script? I am wondering how to "toggle" between reading encrypted and writing unencrypted.
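One possible approach, sketched here as an assumption rather than a verified answer: Hadoop's S3A connector supports per-bucket configuration, so encryption settings can be scoped to the encrypted source bucket only, leaving writes to other buckets unencrypted. The bucket name and key ARN below are hypothetical placeholders:

```python
def per_bucket_sse_conf(encrypted_bucket, kms_key_arn):
    # S3A per-bucket configuration: settings under fs.s3a.bucket.<name>.*
    # apply only to that bucket, so reads from the encrypted bucket use
    # SSE-KMS while writes to other buckets are left unencrypted.
    return {
        f"spark.hadoop.fs.s3a.bucket.{encrypted_bucket}."
        "server-side-encryption-algorithm": "SSE-KMS",
        f"spark.hadoop.fs.s3a.bucket.{encrypted_bucket}."
        "server-side-encryption.key": kms_key_arn,
    }

# Placeholder values; apply these as Spark conf entries on the cluster.
conf = per_bucket_sse_conf("my-encrypted-bucket", "arn:aws:kms:region:acct:key/abc")
```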
- 523 Views
- 1 replies
- 0 kudos
I have a data frame that I write to an ADLS table. The next day I get an updated data frame that also contains some records from the past, and I want to update the Delta table without creating duplicates.
Latest Reply
This is a task for the MERGE command: you define a condition for the merge (your unique column) and then the actions.

MERGE INTO target
USING src
ON target.column = src.column
WHEN MATCHED THEN
  UPDATE SET *
WHEN NOT MATCHED THEN
  INSERT *

src could be your dataf...
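The upsert above can be driven from Python by registering the new data frame as a temporary view and building the MERGE statement. A sketch (the table, view, and key names are hypothetical):

```python
def merge_upsert_sql(target, source, key):
    # Upsert: update rows whose key already exists in the target,
    # insert the rest, so reloading yesterday's records does not
    # create duplicates.
    return (
        f"MERGE INTO {target} t USING {source} s ON t.{key} = s.{key} "
        "WHEN MATCHED THEN UPDATE SET * "
        "WHEN NOT MATCHED THEN INSERT *"
    )

stmt = merge_upsert_sql("events", "updates", "event_id")
# On a cluster: df.createOrReplaceTempView("updates"); spark.sql(stmt)
```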
- 751 Views
- 1 replies
- 0 kudos
Is it a good practice to run the FSCK REPAIR command on a regular basis? I have OPTIMIZE and VACUUM commands scheduled to run every day.
Latest Reply
Unlike OPTIMIZE and VACUUM, FSCK REPAIR is not an operational command that has to be executed on a regular basis. FSCK REPAIR is useful to repair the Delta metadata and remove the reference of the files from the metadata that are no longer accessible...
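When it is needed, the command can be previewed first with DRY RUN. A sketch building the statement (the table name is hypothetical):

```python
def fsck_repair_sql(table, dry_run=True):
    # FSCK REPAIR TABLE removes Delta log entries that reference files
    # which no longer exist; DRY RUN only lists those entries. Run it on
    # demand (e.g. after files were deleted outside Delta), not on a
    # schedule like OPTIMIZE or VACUUM.
    return f"FSCK REPAIR TABLE {table}" + (" DRY RUN" if dry_run else "")

stmt = fsck_repair_sql("sales")
# On a cluster: spark.sql(stmt).show()
```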