Error while writing DataFrame from Azure Databricks to Azure Synapse Dedicated Pool

rahul08yadav
New Contributor III

I am reading Delta tables from my Databricks workspace as a DataFrame (DF) and then trying to write this DF into an Azure Synapse Dedicated SQL Pool, but I am getting an error like:

Py4JJavaError: An error occurred while calling o1509.save.
: com.databricks.spark.sqldw.SqlDWSideException: Azure Synapse Analytics failed to execute the JDBC query produced by the connector.
Underlying SQLException(s):
  - com.microsoft.sqlserver.jdbc.SQLServerException: The input is not a valid Base-64 string as it contains a non-base 64 character, more than two padding characters, or an illegal character among the padding characters.  [ErrorCode = 110813] [SQLState = S0001]

Below is my code:

from pyspark.sql import *
from pyspark.sql.types import *
from pyspark.sql.functions import *

# Connection details for the Synapse Dedicated SQL Pool.
dwDatabase = "sqldbpool"
dwServer = "synapseworkspace.database.windows.net"
dwUser = "user"
dwPass = "pass"
dwJdbcPort = "1433"
dwJdbcExtraOptions = "encrypt=true;trustServerCertificate=true;hostNameInCertificate=*.database.windows.net;loginTimeout=30;"
# Plain string concatenation -- Scala-style "$dwJdbcExtraOptions" inside a Python
# string would be passed through literally into the JDBC URL.
sqlDwUrl = "jdbc:sqlserver://" + dwServer + ":" + dwJdbcPort + ";database=" + dwDatabase + ";user=" + dwUser + ";password=" + dwPass + ";" + dwJdbcExtraOptions
sqlDwUrlSmall = "jdbc:sqlserver://" + dwServer + ":" + dwJdbcPort + ";database=" + dwDatabase + ";user=" + dwUser + ";password=" + dwPass

# Write the Parquet staging files in the legacy format expected by the loader.
spark.conf.set("spark.sql.parquet.writeLegacyFormat", "true")

df = StarSchemaDict['Product_Dim']  # The value of the dictionary is a DF.

(df
 .write
 .format("com.databricks.spark.sqldw")
 .option("url", sqlDwUrlSmall)
 .option("dbtable", "Product_Dim")
 .option("forward_spark_azure_storage_credentials", "True")
 .option("tempdir", adls)  # 'adls' holds the ABFSS staging path (defined elsewhere)
 .mode("overwrite")
 .save())

The data is getting copied to my ADLS staging area, but after that the write fails with the error above.
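(For reference, `adls` in the snippet above is not shown being defined; it is a variable holding the ABFSS staging path. A placeholder definition, with a hypothetical container and path, would look like this:)

# Hypothetical staging location for the connector's tempdir.
# The container and path here are placeholders, not from the actual setup.
adls = "abfss://staging@datalakestorage.dfs.core.windows.net/synapse/tempdir"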

2 REPLIES

Debayan
Esteemed Contributor III

Hi, if you have encoded any section in the code, could you please decode and then re-encode it and try again?
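For example, a minimal sketch of such a decode/re-encode round trip in Python (the value here is a placeholder, not a real credential):

import base64

def base64_round_trip(value: str) -> str:
    """Decode and re-encode a Base-64 string; raises ValueError on illegal characters."""
    decoded = base64.b64decode(value, validate=True)
    return base64.b64encode(decoded).decode("ascii")

print(base64_round_trip("bm90LWEtcmVhbC1rZXk="))  # round-trips cleanly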

rahul08yadav
New Contributor III

The issue was occurring because I was passing a SAS token when setting the configuration for the Storage Account instead of the Storage Access Key. It's working now without any issue.

spark.conf.set("fs.azure.account.key.datalakestorage.dfs.core.windows.net",dbutils.secrets.get(scope = "databricks-scope",key = "StorageAccessKey"))

I don't know why the error says "The input is not a valid Base-64 string as it contains a non-base 64 character", but the issue was related to the Storage Access Key.
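That does match the error text, though: a storage account access key is a Base-64 string, while a SAS token contains characters such as '?' and '&' that are not valid Base-64, so the decode on the Synapse side fails. A quick sketch to check what kind of secret the scope is returning (scope and key names are the ones from the snippet above):

import base64

secret = dbutils.secrets.get(scope="databricks-scope", key="StorageAccessKey")

try:
    base64.b64decode(secret, validate=True)  # raises ValueError on non-Base-64 input
    print("Valid Base-64 -- consistent with a storage account access key.")
except ValueError:
    print("Not valid Base-64 -- likely a SAS token or other credential.")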
