Online tables to synced tables: why is a different service principal created every time?
11-04-2025 07:04 AM
Hello!
We started moving our online tables to synced tables. We just couldn't figure out why a new service principal is created every time we run the same code we used for online tables.
try:
    fe.create_feature_spec(
        name=feature_spec_name,
        features=features,
        exclude_columns=exclude_columns,
    )
except Exception as e:
    if "already exists" in str(e).lower():
        # The spec is already there: drop it and create it again.
        print(f"Feature spec {feature_spec_name} already exists. Deleting it...")
        fe.delete_feature_spec(name=feature_spec_name)
        fe.create_feature_spec(
            name=feature_spec_name,
            features=features,
            exclude_columns=exclude_columns,
        )
        print(f"Created feature spec {feature_spec_name}")
    else:
        # Any other failure is unexpected; re-raise with the original traceback.
        raise

Could we have configured something wrong? #lakebase #synced_tables
11-04-2025 10:14 AM
Greetings @hgm251, here are some things to consider.
What’s happening and why
- A service principal is auto-created per endpoint at provision time. Feature spec creation does not create a service principal; endpoint creation does. If your automation deletes and re-creates the endpoint on each run, you’ll see a new service principal every time.
- The FeatureSpec should reference the source Delta table in UC, not the online/synced table. At inference, the endpoint routes lookups through the associated online/synced table for low-latency access automatically.
- You’ll see audit entries showing the platform granting the minimal permissions that the endpoint’s service principal needs to query the online/synced table; that’s expected.
Common pitfall in the snippet
Checks for Lakebase synced tables migration
- Keep the FeatureLookup table_name pointing to the source UC Delta table. Do not point it to the synced table; the endpoint automatically uses the synced/online table for low-latency lookups.
- Ensure the endpoint creator has the needed privileges and ownership: USE CATALOG, USE SCHEMA, and SELECT on the source table, plus the ownership requirements that apply when creating an endpoint that serves features from the online/synced table.
- Expect endpoint-scoped service principal creation and lifecycle to persist with synced tables; the online tables docs describe the endpoint permission model and call out that a unique service principal is created and tied to the endpoint’s lifetime.
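The first check above can be run as a quick pre-flight step before the FeatureSpec is created. This is only a minimal sketch: it assumes each lookup object exposes a `table_name` attribute and that you keep your own list of synced table names. `lookups_on_synced_tables` and the `SimpleNamespace` stand-ins are hypothetical names used for illustration, not part of any Databricks API.

```python
from types import SimpleNamespace


def lookups_on_synced_tables(lookups, synced_table_names):
    """Return the table names of lookups that point at a synced table.

    Names are compared case-insensitively. An empty result means every
    lookup references something other than the listed synced tables.
    """
    synced = {name.lower() for name in synced_table_names}
    return [
        lookup.table_name
        for lookup in lookups
        if lookup.table_name.lower() in synced
    ]


# Stand-ins for FeatureLookup objects (assumption: real ones expose .table_name).
lookups = [
    SimpleNamespace(table_name="main.fs.customer_features"),
    SimpleNamespace(table_name="main.fs.customer_features_synced"),
]
synced = ["main.fs.customer_features_synced"]

offending = lookups_on_synced_tables(lookups, synced)
if offending:
    print(f"These lookups should point at the source Delta table instead: {offending}")
```

With real `FeatureLookup` objects you would pass the same list you hand to `create_feature_spec`, after filtering out any feature functions.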
Recommended changes
- Don’t delete/recreate the FeatureSpec on “already exists”; just pass. Example from docs:
```python
try:
    fe.create_feature_spec(name=feature_spec_name, features=features, exclude_columns=None)
except Exception as e:
    if "already exists" in str(e):
        pass
    else:
        raise e
```
- Reuse the endpoint instead of recreating it. For example, check for an existing endpoint and update its config rather than creating a new one:
```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.serving import EndpointCoreConfigInput, ServedEntityInput
```
- Confirm your FeatureSpec references the source Delta table and not the synced table. Example patterns in docs show `FeatureLookup(table_name="<catalog>.<schema>.<source_table>", ...)` and explicitly note the endpoint uses the online/synced copy automatically.
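The "reuse the endpoint" recommendation can be sketched roughly as follows. This is not the code from the original reply; it is a hedged illustration of the get-then-create-or-update pattern. `ensure_endpoint` is a hypothetical helper, and it takes the serving API object and the "not found" exception class as arguments so the logic does not depend on a particular SDK version. The commented wiring at the bottom assumes the Databricks SDK for Python (`WorkspaceClient().serving_endpoints` and `databricks.sdk.errors.NotFound`); the endpoint and spec names there are placeholders.

```python
def ensure_endpoint(serving_api, name, *, config, served_entities, not_found_exc):
    """Create the endpoint only if it is missing; otherwise update it in place.

    Keeping the same endpoint (and therefore the same endpoint-scoped
    service principal) across runs avoids a new principal per run.
    Returns "created" or "updated" so callers can log what happened.
    """
    try:
        serving_api.get(name)
    except not_found_exc:
        # First run: the endpoint does not exist yet.
        serving_api.create(name=name, config=config)
        return "created"
    # Later runs: keep the endpoint, only swap what it serves.
    serving_api.update_config(name=name, served_entities=served_entities)
    return "updated"


# Possible wiring with the Databricks SDK (assumed, not run here):
#
# from databricks.sdk import WorkspaceClient
# from databricks.sdk.errors import NotFound
# from databricks.sdk.service.serving import EndpointCoreConfigInput, ServedEntityInput
#
# w = WorkspaceClient()
# entities = [ServedEntityInput(entity_name="<catalog>.<schema>.<feature_spec>",
#                               workload_size="Small", scale_to_zero_enabled=True)]
# ensure_endpoint(w.serving_endpoints, "my-feature-serving-endpoint",
#                 config=EndpointCoreConfigInput(served_entities=entities),
#                 served_entities=entities, not_found_exc=NotFound)
```

Note that the SDK's create and update calls start long-running operations, so the endpoint may not be ready to serve immediately after the helper returns.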
Bottom line
You probably haven’t misconfigured synced tables.
11-04-2025 11:31 AM - edited 11-04-2025 11:32 AM
Hi Louis,
I really appreciate the prompt response on this. I think I get it now, but to give you the full picture: we encountered an error after our create-delete-create endpoint approach. Our query is similar to this sample code from the documentation:
import mlflow.deployments

# Query the feature serving endpoint with a few primary-key values.
client = mlflow.deployments.get_deploy_client("databricks")
response = client.predict(
    endpoint="my-feature-serving-endpoint",
    inputs={
        "dataframe_records": [
            {"id": 1},
            {"id": 7},
            {"id": 12345},
        ]
    },
)
print(response)

The error states it cannot connect to the server instance-ro-<db instance id>.database.azuredatabricks.net, port <PORT>, failed: FATAL: failed to get identity details for username: <created service principal id>
So, from what you stated above, if we don't recreate the endpoint we will have a single service principal until we decide to recreate it. But how do we ensure that this service principal can connect to the database server? What is the least-privilege role we need to give it?
11-05-2025 06:55 AM
@Louis_Frolio it looks like the error happens when there are 7 or more synced lookup tables in a Feature Spec that is being served by the endpoint.
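To narrow this down, it may help to count how many distinct lookup tables a spec references before serving it. The sketch below assumes lookup objects expose a `table_name` attribute and that feature functions do not. `distinct_lookup_tables` is a hypothetical helper, and the threshold of 7 comes only from the observation above, not from any documented limit.

```python
from types import SimpleNamespace

# Observed in this thread, not a documented limit.
OBSERVED_FAILURE_THRESHOLD = 7


def distinct_lookup_tables(features):
    """Return the sorted, de-duplicated table names referenced by the features.

    Items without a table_name (for example feature functions) are skipped.
    """
    return sorted({
        f.table_name for f in features if getattr(f, "table_name", None)
    })


# Stand-ins for the `features` list passed to create_feature_spec.
features = [SimpleNamespace(table_name=f"main.fs.table_{i}") for i in range(7)]
features.append(SimpleNamespace(udf_name="main.fs.some_function"))

tables = distinct_lookup_tables(features)
if len(tables) >= OBSERVED_FAILURE_THRESHOLD:
    print(f"{len(tables)} lookup tables in this spec; try serving a subset to confirm")
```

Serving the same spec with six tables and then seven would show whether the table count, rather than a specific table, triggers the identity error.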