Based on the information available in the provided context, there are several considerations for ensuring data is populated in the `system.storage.predictive_optimization_operations_history` table after enabling Predictive Optimization. Here are the key points and troubleshooting suggestions:
- **Enabling Predictive Optimization System Table:**
  - To access the `system.storage.predictive_optimization_operations_history` table, you must ensure that the `storage` system schema is enabled on your metastore. If logs are not populating in the table, this can indicate that the schema has not been correctly enabled. Ensure you have followed these steps:
    - Use the API or CLI to enable the schema (e.g., with a `PUT` request to the relevant Unity Catalog system-schemas API endpoint); see the sketch after this list.
    - Validate this step was successful by attempting to query the table or by listing the accessible system schemas in your metastore.
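As a minimal sketch, assuming the standard Unity Catalog system-schemas REST endpoint, where `<workspace-url>` and `<metastore-id>` are placeholders you would replace with your own values:

```bash
# Enable the "storage" system schema on the metastore (placeholders assumed).
curl -X PUT \
  -H "Authorization: Bearer $DATABRICKS_TOKEN" \
  "https://<workspace-url>/api/2.0/unity-catalog/metastores/<metastore-id>/systemschemas/storage"

# List the system schemas to confirm "storage" now reports as enabled.
curl -H "Authorization: Bearer $DATABRICKS_TOKEN" \
  "https://<workspace-url>/api/2.0/unity-catalog/metastores/<metastore-id>/systemschemas"
```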
- **Prerequisites and Regional Support:**
  - Predictive Optimization depends on regional availability and minimum requirements. Check whether your workspace and metastore are in a region that supports this feature. According to the provided documents, only Unity Catalog managed tables are currently supported, and the Databricks Runtime (DBR) should be version 12.2 or above. Ensure these conditions are met; the query below shows one way to check the runtime version.
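For instance, assuming the `current_version()` SQL function is available in your environment, you can verify the runtime version directly:

```sql
-- Inspect the runtime/SQL version (current_version() availability assumed).
SELECT current_version();
```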
- **Context Around Data Population Delays:**
  - The documentation notes that data in the `system.storage.predictive_optimization_operations_history` table can take up to 24 hours to populate after operations start. Since you mentioned it's been over 24 hours without logs showing up, it's worth verifying:
    - Whether any operations (e.g., OPTIMIZE or VACUUM) were actually triggered by Predictive Optimization on eligible tables; the Delta history check sketched below can help here.
    - Whether the tables under the configured catalog or schema meet the eligibility criteria (for example, they must not be external tables and must belong to a catalog or schema for which Predictive Optimization is enabled).
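Two illustrative checks, using a hypothetical managed table `main.sales.orders` as a stand-in for one of your own:

```sql
-- Look for OPTIMIZE/VACUUM entries in the table's Delta history;
-- operations triggered by Predictive Optimization show up here.
DESCRIBE HISTORY main.sales.orders;

-- Confirm which tables in the catalog are managed (external tables are not eligible).
SELECT table_schema, table_name, table_type
FROM main.information_schema.tables
WHERE table_type = 'MANAGED';
```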
- **Observability into Operations:**
  - Predictive Optimization prioritizes tables with a high return on investment, which means that if a catalog or schema had no eligible or high-priority tables during the time window, operations may not yet have been run.
  - To check the configuration and validate Predictive Optimization's status, you can run:
    ```sql
    DESCRIBE (CATALOG | SCHEMA | TABLE) EXTENDED name;
    ```
    This provides details, including whether Predictive Optimization is enabled for the catalog, schema, or table.
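If the feature is not enabled where you expected, enablement is a one-line DDL statement; a sketch using a hypothetical catalog `main` and schema `main.sales`:

```sql
-- Enable Predictive Optimization at the catalog level (hypothetical catalog name).
ALTER CATALOG main ENABLE PREDICTIVE OPTIMIZATION;

-- Or at the schema level; child objects inherit the setting unless overridden.
ALTER SCHEMA main.sales ENABLE PREDICTIVE OPTIMIZATION;

-- Re-check the effective setting afterwards.
DESCRIBE CATALOG EXTENDED main;
```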
- **Logs or System Issues:**
  - Rule out internal system issues by confirming that the Predictive Optimization service is active and properly configured. If no operations are logged after verifying all prerequisites and configuration, this could indicate an issue that requires escalation to Databricks support.
- **Additional Debugging with Example Queries:**
  - Once the `system.storage.predictive_optimization_operations_history` table starts populating, you can run queries to analyze the operations. For example:
    - Total number of operations performed:
      ```sql
      SELECT COUNT(DISTINCT operation_id)
      FROM system.storage.predictive_optimization_operations_history;
      ```
    - Operations performed, broken down by table and type:
      ```sql
      SELECT
        metastore_name,
        catalog_name,
        schema_name,
        table_name,
        operation_type,
        operation_status
      FROM system.storage.predictive_optimization_operations_history;
      ```
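To focus on recent activity specifically, you can also group by type and status over a time window; this sketch assumes the documented `start_time` column:

```sql
-- Summarize recent Predictive Optimization activity (start_time column assumed).
SELECT
  operation_type,
  operation_status,
  COUNT(DISTINCT operation_id) AS operations
FROM system.storage.predictive_optimization_operations_history
WHERE start_time >= current_timestamp() - INTERVAL 7 DAYS
GROUP BY operation_type, operation_status
ORDER BY operations DESC;
```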
To summarize: validate the prerequisites above (the `storage` system schema is enabled, the feature is properly configured on eligible tables, and operations have had time to execute). If all of these checks pass and the table remains empty beyond the 24-hour window, consider reaching out to Databricks technical support for further assistance.
Hope this helps.
Louis