I am reading Delta tables from my Databricks workspace as DataFrames and then trying to write one of those DataFrames into an Azure Synapse Dedicated SQL Pool, but I am getting the following error:
Py4JJavaError: An error occurred while calling o1509.save.
: com.databricks.spark.sqldw.SqlDWSideException: Azure Synapse Analytics failed to execute the JDBC query produced by the connector.
Underlying SQLException(s):
- com.microsoft.sqlserver.jdbc.SQLServerException: The input is not a valid Base-64 string as it contains a non-base 64 character, more than two padding characters, or an illegal character among the padding characters. [ErrorCode = 110813] [SQLState = S0001]
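For context, each value in StarSchemaDict is a DataFrame read from a Delta table, roughly like this (the table path and dictionary construction here are illustrative, not my exact code):

df = spark.read.format("delta").load("/mnt/delta/product_dim")  # illustrative Delta table path
StarSchemaDict = {"Product_Dim": df}  # one DataFrame per dimension table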
Below is my code:
from pyspark.sql import *
from pyspark.sql.types import *
from pyspark.sql.functions import *
dwDatabase = "sqldbpool"
dwServer = "synapseworkspace.database.windows.net"
dwUser = "user"
dwPass = "pass"
dwJdbcPort = "1433"
dwJdbcExtraOptions = "encrypt=true;trustServerCertificate=true;hostNameInCertificate=*.database.windows.net;loginTimeout=30;"
sqlDwUrl = "jdbc:sqlserver://" + dwServer + ":" + dwJdbcPort + ";database=" + dwDatabase + ";user=" + dwUser + ";password=" + dwPass + ";" + dwJdbcExtraOptions
sqlDwUrlSmall = "jdbc:sqlserver://" + dwServer + ":" + dwJdbcPort + ";database=" + dwDatabase + ";user=" + dwUser + ";password=" + dwPass
spark.conf.set("spark.sql.parquet.writeLegacyFormat", "true")
df = StarSchemaDict['Product_Dim'] # The value of the dictionary is a DF.
(df
.write
.format("com.databricks.spark.sqldw")
.option("url", sqlDwUrlSmall)
.option("dbtable", "Product_Dim")
.option( "forward_spark_azure_storage_credentials","True")
.option("tempdir", adls)
.mode("overwrite")
.save())
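For reference, adls and the storage account key are configured earlier in the notebook, along these lines (the storage account, container, and secret scope names below are placeholders):

spark.conf.set(
    "fs.azure.account.key.mystorageaccount.dfs.core.windows.net",
    dbutils.secrets.get(scope="my-scope", key="storage-account-key"))  # placeholder scope/key names
adls = "abfss://staging@mystorageaccount.dfs.core.windows.net/tempdir"  # placeholder staging path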
The data is copied to my ADLS staging area successfully, but the subsequent load from staging into Synapse then throws the error above.