08-29-2019 11:47 AM
I am using Databricks to transform the data and then pushing the data into the data lake.
The data gets pushed in if the string field is 255 characters or shorter, but anything longer throws the following error:
"SQL DW failed to execute the JDBC query produced by the connector.
Underlying SQLException(s): - com.microsoft.sqlserver.jdbc.SQLServerException: HdfsBridge::recordReaderFillBuffer - Unexpected error encountered filling record reader buffer: HadoopSqlException: String or binary data would be truncated. [ErrorCode = 107090] [SQLState = S0001"
I am using the following code to push the data into the data warehouse:
testdf.write
  .format("com.databricks.spark.sqldw")
  .option("url", sqlDwUrlSmall)
  .option("dbtable", "dbo.testAddress")
  .option("forward_spark_azure_storage_credentials", "True")
  .option("tempdir", tempDir)
  .mode("overwrite")
  .save()
This table has just one column, and the length of that field is 4000.
09-03-2019 08:42 PM
@bhaumikg I think this link explains what's going on: https://kb.informatica.com/solution/23/pages/68/563577.aspx The Hive metastore column PARAM_VALUE in table TABLE_PARAMS probably has a datatype capped at 4000 characters, so when you attempt to load a record longer than that you get "String or binary data would be truncated".
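If you want to verify that theory and your metastore happens to run on a SQL Server instance you can reach over JDBC, here is a hedged sketch of the check (every connection detail below is a placeholder, not a real endpoint):
# Hypothetical check of the metastore schema; the JDBC URL, user and
# password are placeholders for your own metastore database.
width_df = (spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://<metastore-host>;database=<metastore-db>")
    .option("user", "<user>")
    .option("password", "<password>")
    .option("query",
            "SELECT CHARACTER_MAXIMUM_LENGTH "
            "FROM INFORMATION_SCHEMA.COLUMNS "
            "WHERE TABLE_NAME = 'TABLE_PARAMS' "
            "AND COLUMN_NAME = 'PARAM_VALUE'")
    .load())
width_df.show()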
09-04-2019 11:31 AM
I'm having a similar issue in Azure with MS SQL Server Data Warehouse (DWH), even though I have set the target column to nvarchar(4000) on the DWH side.
This is the code:
spark.conf.set("spark.sql.parquet.writeLegacyFormat", "true")

dwDF.write.format("com.databricks.spark.sqldw")
  .option("url", sqlDwUrlSmall)
  .option("dbtable", TableName)
  .option("forward_spark_azure_storage_credentials", "True")
  .option("tempdir", tempDir)
  .mode("overwrite")
  .save()
I get the following error:
com.microsoft.sqlserver.jdbc.SQLServerException: HdfsBridge::recordReaderFillBuffer - Unexpected error encountered filling record reader buffer: HadoopSqlException: String or binary data would be truncated. [ErrorCode = 107090] [SQLState = S0001]
09-27-2019 03:52 AM
I have the same issue with a table defined like this:
CREATE TABLE [dbo].[SampleTable14](
[firstname] [varchar](8000) NULL,
[lastname] [nvarchar](300) NULL,
[gender] [nvarchar](300) NULL,
[location] [nvarchar](300) NULL,
[subscription_type] [nvarchar](300) NULL
)
WITH
(
DISTRIBUTION = ROUND_ROBIN,
CLUSTERED COLUMNSTORE INDEX
)
GO
I also tried increasing the table length to 1000 and the data length to 8000, but it made no difference.
04-02-2020 06:42 AM
You have to set the maxStrLength option to a value bigger than the longest string in your source data:
synapseDF.write \
  .format("com.databricks.spark.sqldw") \
  .option("url", connStr) \
  .mode("append") \
  .option("tempDir", synapse_tempDir) \
  .option("forwardSparkAzureStorageCredentials", "true") \
  .option("maxStrLength", "1024") \
  .option("dbTable", synapse_targetschema + "." + synapse_targettable) \
  .save()
Synapse creates a temporary external table while loading data with PolyBase, so even if you create your target table with columns of the appropriate width, you can still get truncation errors from this temp table if you don't set maxStrLength.
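If the right number isn't obvious up front, one option is to derive it from the data itself. A minimal sketch, reusing the names from the snippet above and assuming synapseDF has at least one string column that isn't entirely null:
from pyspark.sql import functions as F
from pyspark.sql.types import StringType

# Longest value across all string columns, so the staging table's
# NVARCHAR(maxStrLength) columns are wide enough for every row.
string_cols = [f.name for f in synapseDF.schema.fields
               if isinstance(f.dataType, StringType)]
row = synapseDF.agg(*[F.max(F.length(c)).alias(c) for c in string_cols]).first()
longest = max(v for v in row if v is not None)

synapseDF.write \
  .format("com.databricks.spark.sqldw") \
  .option("url", connStr) \
  .mode("append") \
  .option("tempDir", synapse_tempDir) \
  .option("forwardSparkAzureStorageCredentials", "true") \
  .option("maxStrLength", str(longest)) \
  .option("dbTable", synapse_targetschema + "." + synapse_targettable) \
  .save()
Bear in mind that NVARCHAR(n) tops out at n = 4000 in SQL DW, so a computed value beyond that won't work with this approach.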
06-26-2020 05:16 PM
I tried this and I still have the same error. My string length is 1900 and I gave maxStrLength a value of 3000, but it is still not working.
04-22-2020 10:12 AM
Hello guys, I'm facing an error. It is not exactly the same as the ones referred to above, but since it comes from the same code, I thought you might be able to help. Thanks in advance 🙂
I'm following the tutorial at this link: https://docs.microsoft.com/en-us/azure/azure-databricks/databricks-extract-load-sql-data-warehouse
com.databricks.spark.sqldw.SqlDWSideException: SQL DW failed to execute the JDBC query produced by the connector.
Underlying SQLException(s):
- com.microsoft.sqlserver.jdbc.SQLServerException: External file access failed due to internal error: 'Error occurred while accessing HDFS: Java exception raised on call to HdfsBridge_IsDirExist. Java exception message: HdfsBridge::isDirExist - Unexpected error encountered checking whether directory exists or not: StorageException: This request is not authorized to perform this operation.' [ErrorCode = 105019] [SQLState = S0001]
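For anyone hitting this: the error above is an authorization failure between SQL DW and the storage account, not a truncation problem. The linked tutorial configures the Blob storage access key in the Spark session before the write; a minimal sketch with placeholder names:
# Placeholders: substitute your own storage account name and access key.
blob_storage_account = "<your-storage-account-name>"
blob_access_key = "<your-storage-account-access-key>"

# forward_spark_azure_storage_credentials passes this key on to SQL DW,
# which needs it to read the staging files under tempDir.
spark.conf.set(
    "fs.azure.account.key." + blob_storage_account + ".blob.core.windows.net",
    blob_access_key)
If the key is set correctly and the error persists, the storage account firewall blocking SQL DW is another common culprit.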
04-24-2020 09:23 AM
As suggested by ZAIvR, please use append mode and provide maxStrLength while pushing the data. Overwrite may not work with this unless the Databricks team has fixed the issue.
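Applied to the code from the first post (sketched in Python here; testdf, sqlDwUrlSmall and tempDir are the original poster's names), that advice would look roughly like this:
# The original write with the two suggested changes: append instead of
# overwrite, and maxStrLength covering the 4000-character column.
testdf.write \
  .format("com.databricks.spark.sqldw") \
  .option("url", sqlDwUrlSmall) \
  .option("dbtable", "dbo.testAddress") \
  .option("forward_spark_azure_storage_credentials", "True") \
  .option("tempdir", tempDir) \
  .option("maxStrLength", "4000") \
  .mode("append") \
  .save()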