Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Issues Creating Genie Space via API: Join Specs Are Not Persisted

dikla
Visitor

Hi,
I’m experimenting with the new API to create a Genie Space.
I’m able to successfully create the space, but the join definitions are not created, even though I’m passing a join_specs object in the same format returned by GET /spaces/{id} for an existing space.

Here is the full payload I’m sending (simplified for clarity):

import json
import uuid

# Tables and Join Keys
CUSTOMER_TABLE = "mycatalog.customer.data"
ORDERS_TABLE = "mycatalog.shipbob.orders"
JOIN_KEY = "CUSTOMER_ID"
CUSTOMER_ALIAS = "data"
ORDERS_ALIAS = "orders"

MINIMAL_GENIE_CONFIG = {
    "version": 1,
    "data_sources": {
        "tables": [
            {
                "identifier": CUSTOMER_TABLE,
                "column_configs": [
                    {"column_name": "CUSTOMER_ID", "get_example_values": True},
                    {"column_name": "FIRST_NAME", "get_example_values": True, "build_value_dictionary": True},
                    {"column_name": "LOYALTY_STATUS", "get_example_values": True, "build_value_dictionary": True},
                ]
            },
            {
                "identifier": ORDERS_TABLE,
                "column_configs": [
                    {"column_name": "CUSTOMER_ID", "get_example_values": True},
                    {"column_name": "ORDER_DATE", "get_example_values": True},
                    {"column_name": "ORDER_ID", "get_example_values": True},
                ]
            }
        ],
        "join_specs": [
            {
                "id": uuid.uuid4().hex[:32],
                "left": {"identifier": ORDERS_TABLE, "alias": ORDERS_ALIAS},
                "right": {"identifier": CUSTOMER_TABLE, "alias": CUSTOMER_ALIAS},
                "sql": [
                    f"`{ORDERS_ALIAS}`.`{JOIN_KEY}` = `{CUSTOMER_ALIAS}`.`{JOIN_KEY}`",
                    "--rt=FROM_RELATIONSHIP_TYPE_MANY_TO_ONE--"
                ]
            }
        ],
        "sql_snippets": {
            "measures": [
                {
                    "id": uuid.uuid4().hex[:32],
                    "sql": [f"COUNT(DISTINCT {ORDERS_ALIAS}.ORDER_ID)\n"],
                    "display_name": "ORDERS.TOTAL_ORDER_COUNT",
                    "instruction": ["Calculates the total count of unique orders.\n"]
                }
            ]
        }
    },
    "instructions": {
        "prompt": [
            f"You are a test assistant. You can join {ORDERS_TABLE} and {CUSTOMER_TABLE} on {JOIN_KEY}.\n"
        ],
        "example_question_sqls": []
    },
    "title": NEW_SPACE_TITLE,
    "description": "Minimal configuration using confirmed working tables and join structure."
}

serialized_space_str = json.dumps(MINIMAL_GENIE_CONFIG)

payload = {
    "warehouse_id": warehouse_id,
    "serialized_space": serialized_space_str,
    "title": MINIMAL_GENIE_CONFIG["title"],
    "description": MINIMAL_GENIE_CONFIG["description"],
}

The space is created, but the join defined in join_specs doesn’t appear in the resulting space.
Only the tables are created.
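One way to confirm exactly which parts were dropped is to compare the serialized_space that was sent with the one read back from the created space. The following is a minimal sketch, assuming the read-back string uses the same JSON layout as the payload above. `dropped_sections` is a hypothetical helper written for illustration, not part of any Databricks SDK, and it only checks the top-level keys under data_sources.

```python
import json


def dropped_sections(sent: str, returned: str) -> list[str]:
    """List data_sources sections that were non-empty when sent
    but are missing or empty in the space that was read back."""
    sent_ds = json.loads(sent).get("data_sources", {})
    returned_ds = json.loads(returned).get("data_sources", {})
    dropped = []
    for name, value in sent_ds.items():
        # A section counts as dropped if we sent content and got nothing back.
        if value and not returned_ds.get(name):
            dropped.append(name)
    return dropped


# Illustrative data only: a space that kept its tables but lost its joins.
sent = json.dumps({
    "data_sources": {
        "tables": [{"identifier": "a.b.c"}],
        "join_specs": [{"id": "1"}],
    }
})
returned = json.dumps({"data_sources": {"tables": [{"identifier": "a.b.c"}]}})
print(dropped_sections(sent, returned))
```

Because the check is top-level only, a section that comes back as a non-empty object with empty lists inside it (for example sql_snippets with an empty measures list) would not be flagged.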

Question:
Is there any documentation describing the expected structure or constraints for creating the following directly through the API?

  • join_specs

  • common SQL expressions

  • measures

I’ve only found the high-level schema in GET /spaces/{id}, but not how to correctly POST join definitions so they persist.

Any guidance or examples would be greatly appreciated.

1 ACCEPTED SOLUTION

Accepted Solutions

Raman_Unifeye
Contributor III

@dikla - The Databricks Genie API currently only documents the high‑level schema for spaces (via GET /spaces/{id}), but the join_specs, sql_snippets, and measures objects are not fully supported for persistence when passed directly in POST /spaces. That’s why your join_specs disappear — the API ignores them unless they conform to the internal schema used by Genie’s space builder.

Current Limitations:

  • Joins: Must be inferred by Genie or added manually in the UI; API does not persist them.

  • Measures / SQL snippets: Same limitation — they can be read via GET /spaces/{id} but not reliably written via POST.
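If joins have to be added manually in the UI, it can help to pull them out of the config up front and turn them into a checklist. The sketch below assumes the join_specs layout from the question (left, right, and a sql list whose "--rt=...--" entry is an annotation rather than a condition). `split_for_manual_setup` is a hypothetical helper, not a Databricks API.

```python
import copy


def split_for_manual_setup(config: dict) -> tuple[dict, list[str]]:
    """Return a copy of config without join_specs, plus a readable
    checklist of the joins that were removed."""
    trimmed = copy.deepcopy(config)  # leave the caller's config untouched
    joins = trimmed.get("data_sources", {}).pop("join_specs", [])
    checklist = []
    for join in joins:
        # Skip annotation entries such as "--rt=...--"; keep join conditions.
        conditions = [s for s in join.get("sql", []) if not s.startswith("--")]
        checklist.append(
            f'{join["left"]["identifier"]} JOIN {join["right"]["identifier"]} '
            f'ON {" AND ".join(conditions)}'
        )
    return trimmed, checklist


# Illustrative config with placeholder table names.
config = {
    "data_sources": {
        "tables": [],
        "join_specs": [
            {
                "left": {"identifier": "cat.s.orders"},
                "right": {"identifier": "cat.s.customers"},
                "sql": [
                    "`orders`.`ID` = `customers`.`ID`",
                    "--rt=FROM_RELATIONSHIP_TYPE_MANY_TO_ONE--",
                ],
            }
        ],
    }
}
trimmed, todo = split_for_manual_setup(config)
for item in todo:
    print(item)
```

The trimmed config can then be serialized and sent as before, while the printed list records what still needs to be defined by hand.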

 


RG #Driving Business Outcomes with Data Intelligence


3 REPLIES

dikla
Visitor

@Raman_Unifeye Thanks for the detailed explanation. That really helps clarify why my join specs weren’t being persisted.

Do you know if support for persisting join_specs, sql_snippets, and measures via the API is planned for an upcoming release?

No committed timeline has been published, or at least none that I have come across.


RG #Driving Business Outcomes with Data Intelligence