Hi @dev_puli, in Azure Databricks you can trace the history of table modifications, including the user responsible and the timestamp.
Here's how you can achieve this:
Delta Lake Table History:
- Each operation that modifies a Delta Lake table creates a new table version.
- You can retrieve information about these operations, including the user ID, timestamp, and operation type, by running the DESCRIBE HISTORY command on your Delta table.
- The history information is returned in reverse chronological order.
- Example SQL commands:
- To get the full history of the table: DESCRIBE HISTORY '/data/events/'
- To get only the last operation: DESCRIBE HISTORY '/data/events/' LIMIT 1
- The table history retention is determined by the setting delta.logRetentionDuration, which is 30 days by default.
- Note that Databricks does not recommend using Delta Lake table history as a long-term backup solution for data archival. It's primarily for auditing, rollback, and time travel purposes.
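If you would rather read the history from a notebook than run the SQL by hand, the commands above can be wrapped in a minimal Python sketch like the one below. It assumes an existing SparkSession is passed in as `spark` (Databricks notebooks provide one). The helper names `fetch_history` and `summarize_history` and the sample rows are made up for illustration. The column names `version`, `timestamp`, `userName`, and `operation` are ones that DESCRIBE HISTORY returns.

```python
from datetime import datetime


def fetch_history(spark, table_path, limit=None):
    """Run DESCRIBE HISTORY through an existing SparkSession and return plain dicts."""
    query = f"DESCRIBE HISTORY '{table_path}'"
    if limit is not None:
        query += f" LIMIT {int(limit)}"
    return [row.asDict() for row in spark.sql(query).collect()]


def summarize_history(rows):
    """Reduce history rows to (version, timestamp, user, operation), newest first."""
    ordered = sorted(rows, key=lambda r: r["version"], reverse=True)
    return [
        (r["version"], r["timestamp"], r.get("userName"), r["operation"])
        for r in ordered
    ]


# Hypothetical rows shaped like DESCRIBE HISTORY output, for illustration only.
sample_rows = [
    {"version": 0, "timestamp": datetime(2024, 1, 5, 9, 0),
     "userName": "alice@example.com", "operation": "CREATE TABLE"},
    {"version": 1, "timestamp": datetime(2024, 1, 6, 14, 30),
     "userName": "bob@example.com", "operation": "WRITE"},
]

for version, ts, user, op in summarize_history(sample_rows):
    print(f"v{version}  {ts:%Y-%m-%d %H:%M}  {user}  {op}")
```

In a notebook you would replace `sample_rows` with something like `fetch_history(spark, '/data/events/')`; the summary step stays the same.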
Comparison and Deployment:
- To compare performance across different environments, consider the following steps:
- Development Environment:
- Develop and test your workflows in a development environment.
- Use the Delta Lake table history to track changes and understand performance.
- Higher Environments (Staging, Production):
- Once satisfied with the development, deploy your workflows to higher environments.
- Ensure that the same Delta Lake table schema and data are used.
- Monitor performance and compare it against the development environment.
- Automated Deployment:
- Use CI/CD pipelines or automation tools to streamline deployment from development to higher environments.
- Automate the process of creating tables, applying schema changes, and loading data.
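One rough way to do the comparison step is to line up the `operationMetrics` values that DESCRIBE HISTORY reports for the same operation in each environment. The sketch below is only an illustration: `compare_metrics` is a hypothetical helper, the numbers are invented, and the metric names are assumed, since the keys actually present depend on the operation type. The metrics map stores values as strings, so the helper converts them before subtracting.

```python
def compare_metrics(dev_metrics, prod_metrics):
    """Return {metric: (dev, prod, prod - dev)} for numeric metrics present in both."""
    comparison = {}
    for name in sorted(set(dev_metrics) & set(prod_metrics)):
        try:
            dev_value = float(dev_metrics[name])
            prod_value = float(prod_metrics[name])
        except (TypeError, ValueError):
            continue  # skip values that are not numeric
        comparison[name] = (dev_value, prod_value, prod_value - dev_value)
    return comparison


# Hypothetical operationMetrics for the same write in two environments.
dev = {"numFiles": "4", "numOutputRows": "1000", "numOutputBytes": "52000"}
prod = {"numFiles": "16", "numOutputRows": "250000", "numOutputBytes": "9100000"}

for name, (d, p, diff) in compare_metrics(dev, prod).items():
    print(f"{name}: dev={d:.0f} prod={p:.0f} diff={diff:+.0f}")
```

Keep in mind that these metrics describe what an operation wrote, not how the cluster was sized, so differences are only meaningful when the environments run comparable data and compute.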
Workspace Files and Notebooks:
- You can also programmatically access workspace files (including notebooks) using Python or Scala.
- Retrieve details such as creation date, modified date, and user information.
- This can be useful for tracking changes and understanding performance across different versions of notebooks.
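As a sketch of the programmatic access mentioned above, the snippet below calls the Workspace REST API list endpoint (`/api/2.0/workspace/list`) with a personal access token and turns the timestamps into readable dates. The function names are hypothetical, and so are the host, token, and path you would pass in. It also assumes the response objects carry `created_at` and `modified_at` in epoch milliseconds; those fields may be missing for some object types, so they are read defensively. The sketch covers timestamps only and does not try to retrieve user information.

```python
import json
import urllib.parse
import urllib.request
from datetime import datetime, timezone


def list_workspace(host, token, path):
    """Call the Workspace API list endpoint and return its 'objects' array."""
    query = urllib.parse.urlencode({"path": path})
    url = f"{host.rstrip('/')}/api/2.0/workspace/list?{query}"
    request = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(request) as response:
        return json.load(response).get("objects", [])


def to_iso(epoch_ms):
    """Convert epoch milliseconds to an ISO 8601 UTC string; pass None through."""
    if epoch_ms is None:
        return None
    return datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc).isoformat()


def describe_objects(objects):
    """Keep path, type, and readable timestamps; absent fields become None."""
    return [
        {
            "path": o.get("path"),
            "type": o.get("object_type"),
            "created": to_iso(o.get("created_at")),    # assumed field, may be absent
            "modified": to_iso(o.get("modified_at")),  # assumed field, may be absent
        }
        for o in objects
    ]
```

A call would look like `describe_objects(list_workspace(host, token, "/Users/someone@example.com"))`, with your own workspace URL, token, and folder substituted in.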
Remember to adapt these steps based on your specific use case and requirements.
By leveraging Delta Lake history and workspace file details, you can gain insights into modifications and compare performance effectively.