Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

User16826990884
by New Contributor III
  • 18621 Views
  • 1 replies
  • 1 kudos

Resolved! Views vs Materialized Delta Tables

Is there general guidance on using views vs. creating Delta tables? For example, I need to do some filtering and make small tweaks to a few columns for use in another application. Is there a downside to using a view here?

Latest Reply
User16826990884
New Contributor III
  • 1 kudos

Views won't duplicate the data, so if you are just filtering columns or rows or making small tweaks, a view might be a good option. Unless, of course, the filtering is really expensive or you are doing a lot of calculations; then materialize the vi...
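
To make the trade-off concrete, here is a minimal sketch that composes the two kinds of statement as Python strings. The helper names, table names, and predicate are all hypothetical; on a cluster you would pass the returned string to spark.sql(...), which is assumed and not run here.

```python
# Hypothetical helpers: every name and predicate here is a placeholder.

def view_statement(view: str, source: str, predicate: str) -> str:
    """SQL for a view: only the query is stored, so no data is duplicated."""
    return f"CREATE OR REPLACE VIEW {view} AS SELECT * FROM {source} WHERE {predicate}"


def materialized_statement(table: str, source: str, predicate: str) -> str:
    """SQL for a Delta table: the query runs once and its result is stored."""
    return (
        f"CREATE OR REPLACE TABLE {table} USING DELTA AS "
        f"SELECT * FROM {source} WHERE {predicate}"
    )

# Assumed usage on a cluster (not executed here):
# spark.sql(view_statement("events_recent_v", "events", "event_date >= '2021-01-01'"))
```

The view re-runs its filter on every read, while the materialized table pays the cost once at write time and has to be refreshed when the source changes.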

User16826994223
by Honored Contributor III
  • 1657 Views
  • 1 replies
  • 0 kudos

MSCK REPAIR TABLE doesn't work in Delta

I have a Delta table in ADLS, and for the same table I have defined an external table in Hive. After creating the Hive table and generating manifests, I am loading the partitions using MSCK REPAIR TABLE. All the partition columns are in same But s...

Latest Reply
User16826994223
Honored Contributor III
  • 0 kudos

Can you please check the partition column order? Is it in the same sequence as before, or has it changed?

brickster_2018
by Databricks Employee
  • 5884 Views
  • 1 replies
  • 0 kudos

Resolved! How to list all Delta tables in a Database?

I want to get a list of all the Delta tables in a database. What is the easiest way of getting it?

Latest Reply
brickster_2018
Databricks Employee
  • 0 kudos

The code snippet below can be used to list the tables in a database:

val db = "database_name"
spark.sessionState.catalog.listTables(db).map(table=>spark.sessionState.catalog.externalCatalog.getTable(table.database.get,table.table)).filter(x=>x....

User16826992666
by Valued Contributor
  • 2577 Views
  • 1 replies
  • 0 kudos

Resolved! Can you implement fine grained access controls on Delta tables?

I would like to provide row- and column-level security on the tables I have created in my workspace. Is there any way to do this?

Latest Reply
Ryan_Chynoweth
Esteemed Contributor
  • 0 kudos

Databricks includes two user functions that allow you to express column- and row-level permissions dynamically in the body of a view definition:
  • current_user(): returns the current user name.
  • is_member(): determines if the current user is a member of a s...
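
As a rough illustration of how those two functions can sit inside a view body, the sketch below composes such a view definition as a Python string. The helper, the table and column names, and the group name are hypothetical; the statement would be run with spark.sql(...) on a cluster, which is assumed here.

```python
# Hypothetical helper: table, column, and group names are placeholders.

def secured_view_statement(
    view: str, source: str, owner_col: str, sensitive_col: str, allowed_group: str
) -> str:
    """Compose a view that masks one column and filters rows per user."""
    return (
        f"CREATE OR REPLACE VIEW {view} AS\n"
        f"SELECT\n"
        f"  {owner_col},\n"
        # Column-level rule: only members of the group see the real value.
        f"  CASE WHEN is_member('{allowed_group}') THEN {sensitive_col} "
        f"ELSE 'REDACTED' END AS {sensitive_col}\n"
        f"FROM {source}\n"
        # Row-level rule: group members see all rows, others only their own.
        f"WHERE is_member('{allowed_group}') OR {owner_col} = current_user()"
    )

# Assumed usage on a cluster (not executed here):
# spark.sql(secured_view_statement("orders_secure", "orders", "owner_email", "card_number", "auditors"))
```

Users would then be granted access to the view rather than to the underlying table.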

User16826992666
by Valued Contributor
  • 2993 Views
  • 1 replies
  • 0 kudos

Resolved! How often should I run OPTIMIZE on my Delta Tables?

I know it's important to periodically run OPTIMIZE on my Delta tables, but how often should I be doing this? Am I supposed to do this every time I load data?

Latest Reply
sajith_appukutt
Honored Contributor II
  • 0 kudos

It depends on how frequently you update the table and how often you read it. If you have a daily ETL job updating a Delta table, it might make sense to run OPTIMIZE at the end of it so that subsequent reads benefit from the performance imp...
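
For the daily-ETL case, a final step of the job could issue the statement built by something like the sketch below. The helper and the table and column names are hypothetical, and the optional ZORDER BY clause is an addition that the reply above does not mention.

```python
from typing import Optional, Sequence

# Hypothetical helper: table and column names are placeholders.

def optimize_statement(table: str, zorder_by: Optional[Sequence[str]] = None) -> str:
    """Compose an OPTIMIZE statement, optionally with a ZORDER BY clause."""
    stmt = f"OPTIMIZE {table}"
    if zorder_by:
        stmt += f" ZORDER BY ({', '.join(zorder_by)})"
    return stmt

# Assumed last step of a daily ETL job on a cluster (not executed here):
# spark.sql(optimize_statement("sales_daily", ["customer_id"]))
```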

User16826992666
by Valued Contributor
  • 1625 Views
  • 1 replies
  • 0 kudos

Resolved! Are Delta tables able to support GDPR compliance?

I know that when deletes are made from a Delta table the underlying files are not actually removed. For compliance reasons I need to be able to truly delete the records. How can I know which files need to be removed, and is there a way to remove them ot...

Latest Reply
sajith_appukutt
Honored Contributor II
  • 0 kudos

Here is a document explaining best practices for GDPR and CCPA compliance using Delta Lake. Specifically on cleaning up stale data: you can use the VACUUM command to remove files that are no longer referenced by a Delta table and are older than a s...
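
A small sketch of how the VACUUM statement might be composed is shown below. The helper and table name are hypothetical, and the 168-hour retention is only an example value. The DRY RUN form lists the files that would be deleted without removing them, which speaks to the "which files" part of the question.

```python
from typing import Optional

# Hypothetical helper: the table name and retention value are placeholders.

def vacuum_statement(
    table: str, retain_hours: Optional[int] = None, dry_run: bool = False
) -> str:
    """Compose a VACUUM statement with optional RETAIN and DRY RUN clauses."""
    stmt = f"VACUUM {table}"
    if retain_hours is not None:
        stmt += f" RETAIN {retain_hours} HOURS"
    if dry_run:
        # DRY RUN only reports the files that would be removed.
        stmt += " DRY RUN"
    return stmt

# Assumed usage on a cluster (not executed here):
# spark.sql(vacuum_statement("customers", 168, dry_run=True)).show()
# spark.sql(vacuum_statement("customers", 168))
```

The deleted rows must first be removed with a DELETE (or rewritten by an update) so that the old files are no longer referenced; VACUUM then removes them physically once they pass the retention threshold.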

jose_gonzalez
by Databricks Employee
  • 2231 Views
  • 1 replies
  • 0 kudos

Resolved! Should I run ANALYZE TABLE on Delta tables?

I would like to know whether it is recommended to run ANALYZE TABLE on Delta tables. If not, why not?

Latest Reply
jose_gonzalez
Databricks Employee
  • 0 kudos

You can run ANALYZE TABLE on Delta tables only on Databricks Runtime 8.3 and above. For more details, please refer to the docs: https://docs.databricks.com/spark/latest/spark-sql/language-manual/sql-ref-syntax-aux-analyze-table.html
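
For reference, the statement has roughly the shape composed by the sketch below. The helper and the table and column names are hypothetical; the linked docs remain the authority on the full syntax.

```python
from typing import Optional, Sequence

# Hypothetical helper: table and column names are placeholders.

def analyze_statement(table: str, columns: Optional[Sequence[str]] = None) -> str:
    """Compose ANALYZE TABLE for table-level or selected column statistics."""
    stmt = f"ANALYZE TABLE {table} COMPUTE STATISTICS"
    if columns:
        stmt += f" FOR COLUMNS {', '.join(columns)}"
    return stmt

# Assumed usage on a cluster running a supported runtime (not executed here):
# spark.sql(analyze_statement("sales_daily", ["customer_id", "amount"]))
```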

User16826992666
by Valued Contributor
  • 905 Views
  • 1 replies
  • 0 kudos
Latest Reply
User16826992666
Valued Contributor
  • 0 kudos

No, you do not. Although Delta is the default file format when writing data using Databricks, any file type supported by Spark can be used when writing data.
