Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Offline Feature Store in Databricks Serving

AlexH
New Contributor II

Hi, 

I am planning to deploy a model (pyfunc) with Databricks Serving. During inference, my model needs to retrieve some data from delta tables. I could also turn these tables into an offline feature store.

Latency is not critical: retrieval in seconds rather than milliseconds is fine, so I don't really want to go with an online feature store, as it would not be worth the extra cost.

What would be my best options to get data from the delta tables?

Thanks!
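Conceptually, what I want inside the model is something like this (a toy pandas sketch, not Databricks APIs; the class, table contents, and the `id`/`score` column names are placeholders):

```python
import pandas as pd

class LookupModel:
    """Toy stand-in for a pyfunc model that looks up offline feature
    rows by key before predicting."""

    def __init__(self, feature_df: pd.DataFrame, lookup_key: str = "id"):
        # Index by the lookup key once so each request is a simple join
        self.features = feature_df.set_index(lookup_key)
        self.lookup_key = lookup_key

    def predict(self, request_df: pd.DataFrame) -> pd.DataFrame:
        # Left-join request keys against the offline features;
        # ids missing from the table come back as NaN rows
        return request_df.join(self.features, on=self.lookup_key)

# Example with toy data in place of a delta table
features = pd.DataFrame({"id": [1, 7], "score": [0.2, 0.9]})
model = LookupModel(features)
out = model.predict(pd.DataFrame({"id": [7, 1]}))
# out has the score column aligned to the requested ids
```

On Databricks the feature DataFrame would come from the delta table instead, e.g. a `spark.read.table(...)` at model load time.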

2 REPLIES

Hubert-Dudek
Esteemed Contributor III

There is a ready-made feature engineering function for that:

# On a non-ML runtime, install databricks-feature-engineering>=0.13.0a3 first

from databricks.feature_engineering import FeatureEngineeringClient
fe = FeatureEngineeringClient()

from databricks.feature_engineering import FeatureLookup

# The `FeatureSpec` can be accessed in Unity Catalog as a function.
# `FeatureSpec`s can be used to create training sets or feature serving endpoints.
fe.create_feature_spec(
  name = f"{CATALOG}.{SCHEMA}.feature_spec",
  features=[
    FeatureLookup(
      table_name=f"{CATALOG}.{SCHEMA}.offline_feature_table",
      lookup_key="id",
    ),
  ],
)

## add serving endpoint (can be done through UI too)
from databricks.feature_engineering.entities.feature_serving_endpoint import (
  ServedEntity,
  EndpointCoreConfig,
)

fe.create_feature_serving_endpoint(
  name="my-feature-serving-endpoint",
  config=EndpointCoreConfig(
    served_entities=ServedEntity(
      feature_spec_name=f"{CATALOG}.{SCHEMA}.feature_spec",
      workload_size="Small",
      scale_to_zero_enabled=True,
      instance_profile_arn=None,
    )
  )
)

## inference

import mlflow.deployments

client = mlflow.deployments.get_deploy_client("databricks")

response = client.predict(
    endpoint="my-feature-serving-endpoint",
    inputs={
        "dataframe_records": [
            {"id": 1},
            {"id": 7},
            {"id": 12345},
        ]
    },
)
print(response)
AlexH
New Contributor II

Thanks, that is helpful already.
Does this work without an online feature store?

In the docs it sounds like this is based on online feature stores: https://learn.microsoft.com/en-us/azure/databricks/machine-learning/feature-store/feature-function-s...