Data Engineering
Schema issues with External Tables

AnaLippross
New Contributor

Hi everyone!

We have started using Unity Catalog in our project, and I am seeing strange behavior with the schemas of external tables imported into Databricks. In Data Explorer, when I expand some tables, the schema shown is wrong compared to the original on SQL Server (wrong column names and types). When I run a complex query against these tables, it fails with the message "The user-specified schema doesn't match the actual schema", even though we did not define the schema manually; we let Databricks import it from the source tables.

If I first run a simple SELECT * FROM <table> or SHOW CREATE TABLE for every table involved in the query and then run the query again (all within notebooks), it works fine: the schemas get updated and I have no problems. The issue is clearly intermittent, because if I detach the cluster I used for the query, attach it again, and run the query directly, it fails once more; I have to run an individual SELECT * for each table to make it work again.
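For context, the workaround described above can be scripted rather than typed by hand. This is a minimal, hypothetical sketch: the helper name and table names are illustrative, and running the statements via spark.sql() on the cluster is an assumption, not a confirmed fix.

```python
# Hypothetical sketch of the workaround: run one cheap query against every
# table involved in the complex query, so the attached cluster picks up the
# current schema before the real query runs.

def warmup_statements(tables):
    """Build one lightweight SELECT per table name."""
    return [f"SELECT * FROM {t} LIMIT 1" for t in tables]

# In a Databricks notebook you would then run something like:
# for stmt in warmup_statements(["my_catalog.my_schema.orders",
#                                "my_catalog.my_schema.items"]):
#     spark.sql(stmt).collect()
```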

Has anyone encountered this issue before? Or is it a known issue that is still being fixed? I appreciate that Unity Catalog has only just become generally available, so I'm assuming there may still be some bugs out there.

Thanks a lot!

 

1 REPLY

youssefmrini
Honored Contributor III

 

It seems like you are encountering an issue with the schema mapping when importing external tables to Unity Catalog in Databricks.

Based on the information you've provided, it sounds like the issue could be related to the way Databricks interprets the schema of external tables imported from SQL Server.

When importing an external table into Databricks through the Unity Catalog, Databricks will attempt to automatically infer the schema of the table based on the underlying metadata. However, there may be cases where the metadata is incomplete, outdated, or otherwise inconsistent with the actual table schema, leading to issues with queries.
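If you suspect such a mismatch, it can help to compare the column list Databricks reports (e.g. from SHOW CREATE TABLE or DESCRIBE TABLE) against the source definition in SQL Server. The helper below is a hypothetical sketch, assuming you have already extracted both schemas as lists of (column_name, data_type) pairs; the function name and input format are illustrative.

```python
def schema_diff(expected, actual):
    """Compare two schemas, each given as a list of (column_name, data_type)
    pairs, and return a list of human-readable mismatches."""
    exp, act = dict(expected), dict(actual)
    diffs = []
    for name, dtype in exp.items():
        if name not in act:
            diffs.append(f"missing column: {name} {dtype}")
        elif act[name] != dtype:
            diffs.append(f"type mismatch on {name}: expected {dtype}, got {act[name]}")
    for name, dtype in act.items():
        if name not in exp:
            diffs.append(f"unexpected column: {name} {dtype}")
    return diffs

# Example:
# schema_diff([("id", "INT"), ("name", "STRING")],
#             [("id", "BIGINT"), ("name", "STRING")])
# -> ["type mismatch on id: expected INT, got BIGINT"]
```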

To resolve this issue, you may need to manually specify the schema of the external tables in Databricks when creating them, rather than relying on Databricks to infer it from the metadata. You can do this by listing the columns and their data types explicitly in the CREATE TABLE statement (or, in PySpark, as a list of StructFields).

Here's an example of how you can define an external table with a specific schema in Databricks:

 

CREATE TABLE my_external_table (
  col1 STRING,
  col2 INT,
  col3 DOUBLE
)
USING <external_table_type>
OPTIONS (<external_table_options>)
LOCATION '<external_table_location>'
COMMENT '<table_description>'
TBLPROPERTIES ('<table_properties>');

In this example, you define the external table by declaring its columns and their data types explicitly, together with the external table type, options, location, and table properties. Because the column names and types come from the CREATE TABLE statement rather than from inferred metadata, the schema of the external table in Databricks stays consistent with the actual table schema in SQL Server.

If the issue persists, it may be helpful to review the Databricks logs to see if there are any error messages or inconsistencies related to the schema inference process. You can also reach out to the Databricks support team for further assistance with troubleshooting this issue.
