We are running AutoML Forecast on Databricks Runtime 15.4 LTS ML and 16.4 LTS ML, using a time series dataset with temporal covariates from the Feature Store (e.g. a corona_dummy feature). We pass feature_store_lookups with both lookup_key and timestamp_lookup_key.
Our feature table is defined like this:
    fs.create_table(
        name="...features.example_corona_features",
        primary_keys=["Monat", "Produkt", "Vertriebstyp_Art"],
        df=...,
        timestamp_keys="Monat",
        ...
    )
And we call AutoML with:
    feature_store_lookups=[{
        "table_name": "...features.example_corona_features",
        "lookup_key": ["Produkt", "Vertriebstyp_Art"],
        "timestamp_lookup_key": "Monat"
    }]
✅ Expected:
AutoML performs a temporal join between the dataset and the feature table (via timestamp and keys) and proceeds with training including the covariate corona_dummy.
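To make the expectation concrete, here is a minimal pandas sketch of the point-in-time join we have in mind. It is only an illustration and not AutoML's or the Feature Store's actual implementation: the data values and the target column y are made up, and pandas.merge_asof stands in for whatever join Databricks performs internally.

```python
import pandas as pd

# Hypothetical training data: one row per month and product / sales-type combination.
train = pd.DataFrame({
    "Monat": pd.to_datetime(["2020-02-01", "2020-04-01", "2021-08-01"]),
    "Produkt": ["A", "A", "A"],
    "Vertriebstyp_Art": ["X", "X", "X"],
    "y": [10.0, 4.0, 9.0],  # made-up target values
})

# Hypothetical feature table: corona_dummy changes over time.
features = pd.DataFrame({
    "Monat": pd.to_datetime(["2020-01-01", "2020-03-01", "2021-07-01"]),
    "Produkt": ["A", "A", "A"],
    "Vertriebstyp_Art": ["X", "X", "X"],
    "corona_dummy": [0, 1, 0],
})

# For each training row, take the latest feature row with the same keys
# whose timestamp is not after the training row's timestamp.
joined = pd.merge_asof(
    train.sort_values("Monat"),
    features.sort_values("Monat"),
    on="Monat",
    by=["Produkt", "Vertriebstyp_Art"],
    direction="backward",
)

print(joined)
```

In this sketch the joined frame keeps the training columns and gains exactly one extra column, corona_dummy, which is the shape we would expect the training set to have.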
❌ Actual:
AutoML starts the run but fails during an internal applyInPandas() or .toPandas() conversion, raising:
ValueError: Length mismatch: Expected axis has 8 elements, new values have 11 elements
This crash occurs after joining features and loading the training set — i.e., during execution of AutoML’s internal training loop.
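For reference, pandas raises this exact message when a list of column names is assigned to a DataFrame with a different number of columns. The snippet below reproduces the message in isolation with placeholder column names. It is an assumption on our side that something similar happens inside AutoML (for example, a declared output schema that no longer matches the joined frame); we have not confirmed this from the AutoML source.

```python
import pandas as pd

# A frame with 8 columns (placeholder names).
df = pd.DataFrame([[0] * 8], columns=[f"c{i}" for i in range(8)])

message = None
try:
    # Assigning 11 column names to an 8-column frame triggers the error.
    df.columns = [f"c{i}" for i in range(11)]
except ValueError as exc:
    message = str(exc)

print(message)
```

If that is what happens, the difference of three columns between 8 and 11 might be a hint about which columns are being added or dropped around the join, but that is speculation.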
🔍 Observations:
When we remove the feature_store_lookups, AutoML completes without errors.
The issue appears only when the timestamp column (Monat) is both:
Can you confirm if this is a known issue, and what the correct contract is for using feature_store_lookups with timestamp_lookup_key in AutoML?