cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

How to select from a very large column ( string ) of delta table ?

prasadvaze
Valued Contributor II

In one of my delta table , the string column "abc" has 1753484 characters long value (string) . I get an error while selecting or transforming this column value ( in the downstream application). How do I solve this?

SELECT ID, abc, length(abc) as len

FROM <delta_table_name>

where ID= 28

java.lang.Exception: Results too large at com.databricks.backend.daemon.driver.OutputAggregator$.maybeApplyOutputAggregation(OutputAggregator.scala:303) at com.databricks.backend.daemon.driver.OutputAggregator$.withOutputAggregation0(OutputAggregator.scala:206) at com.databricks.backend.daemon.driver.OutputAggregator$.withOutputAggregation(OutputAggregator.scala:57) at com.databricks.backend.daemon.driver.SQLDriverLocal.executeSql(SQLDriverLocal.scala:114) at com.databricks.backend.daemon.driver.SQLDriverLocal.repl(SQLDriverLocal.scala:143) at com.databricks.backend.daemon.driver.DriverLocal.$anonfun$execute$10(DriverLocal.scala:431) at com.databricks.logging.UsageLogging.$anonfun$withAttributionContext$1(UsageLogging.scala:239) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62) at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:234) at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:231) at com.databricks.backend.daemon.driver.DriverLocal.withAttributionContext(DriverLocal.scala:48) at com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:276) at com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:269) at com.databricks.backend.daemon.driver.DriverLocal.withAttributionTags(DriverLocal.scala:48) at com.databricks.backend.daemon.driver.DriverLocal.execute(DriverLocal.scala:408) at com.databricks.backend.daemon.driver.DriverWrapper.$anonfun$tryExecutingCommand$1(DriverWrapper.scala:653) at scala.util.Try$.apply(Try.scala:213) at com.databricks.backend.daemon.driver.DriverWrapper.tryExecutingCommand(DriverWrapper.scala:645) at com.databricks.backend.daemon.driver.DriverWrapper.getCommandOutputAndError(DriverWrapper.scala:486) at com.databricks.backend.daemon.driver.DriverWrapper.executeCommand(DriverWrapper.scala:598) at com.databricks.backend.daemon.driver.DriverWrapper.runInnerLoop(DriverWrapper.scala:391) at com.databricks.backend.daemon.driver.DriverWrapper.runInner(DriverWrapper.scala:337) at com.databricks.backend.daemon.driver.DriverWrapper.run(DriverWrapper.scala:219) at java.lang.Thread.run(Thread.java:748)

I can do below but that does not solve original issue –

df = spark.sql('''SELECT ID, abc, length(abc) FROM <delta table> where ID= 28 ''')

df.show()

ID | abc | length

------------------------------------------

28 | {"PolicyMessage":...| 1753484

The abc column value shows truncated which I think is the notebook /browser limitation but the actual value is not truncated

0 REPLIES 0

Join Us as a Local Community Builder!

Passionate about hosting events and connecting people? Help us grow a vibrant local community—sign up today to get started!

Sign Up Now