03-21-2023 04:35 PM
If I create a table using the code below:
CREATE TABLE IF NOT EXISTS jdbcTable
using org.apache.spark.sql.jdbc
options(
url "sql_server_url",
dbtable "sqlserverTable",
user "username",
password "password"
)
Will jdbcTable always be automatically synchronized with sqlserverTable? Thanks!
03-22-2023 01:30 AM
Hi @andrew li There is a feature introduced in DBR 11 that lets you ingest data directly into a table from a selected list of sources. Since you are creating a table, I believe this command will create a managed table by loading the data from the SQL Server table into your default warehouse location. Please run DESCRIBE EXTENDED and check the path to see whether there is data in there. If there is, it is not going to sync automatically.
Can you try creating a view the same way and see what happens there?
Please refer to the link below:
https://docs.databricks.com/external-data/jdbc.html
AFAIK, DBSQL and Delta Lake support external tables on the S3 layer, like Hive external tables. The table automatically picks up data when it is loaded into the S3 layer.
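For reference, the two checks suggested above could look roughly like the sketch below. It reuses the placeholder connection values from the question; the view name jdbcView is made up for illustration, and the exact rows that DESCRIBE EXTENDED prints (for example Provider or Location) may differ by runtime version, so treat this as a starting point, not a confirmed diagnosis.

```sql
-- Inspect the table metadata. Rows such as Provider and Location (if present)
-- hint at whether data was copied into the warehouse or is read over JDBC.
DESCRIBE EXTENDED jdbcTable;

-- Define the same source as a temporary view.
-- jdbcView is a hypothetical name; the option values are the placeholders
-- from the original question, not real connection settings.
CREATE TEMPORARY VIEW jdbcView
USING org.apache.spark.sql.jdbc
OPTIONS (
  url "sql_server_url",
  dbtable "sqlserverTable",
  user "username",
  password "password"
);

-- Run this before and after changing a row in the source table and compare.
SELECT COUNT(*) FROM jdbcView;
```

If the count from the view and from jdbcTable both change right after an update in SQL Server, that would suggest both are reading from the source at query time.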
03-22-2023 07:58 AM
Yes, I thought the internal table stored in the Hive warehouse would not get updated automatically. But to my surprise, the table was synchronized immediately after I manually updated the source table in the Azure SQL Server database.
03-22-2023 08:11 AM
@andrew li That's interesting. I'm curious to try this out and find out how the Databricks layer knows that the source has been updated. Since this is a pull-based ingestion pattern, the trigger should come from DBx.