Hi everyone.
I am trying to learn the OPTIMIZE keyword from this Databricks doc, using Scala: https://docs.databricks.com/delta/optimizations/optimization-examples.html#delta-lake-on-databricks-....
But my local Spark does not seem able to parse OPTIMIZE and gives me the following:
scala> spark.sql("OPTIMIZE flights ZORDER BY (DayofWeek)")
org.apache.spark.sql.catalyst.parser.ParseException:
mismatched input 'OPTIMIZE' expecting {'(', 'SELECT', 'FROM', 'ADD', 'DESC', 'WITH', 'VALUES', 'CREATE', 'TABLE', 'INSERT', 'DELETE', 'DESCRIBE', 'EXPLAIN', 'SHOW', 'USE', 'DROP', 'ALTER', 'MAP', 'SET', 'RESET', 'START', 'COMMIT', 'ROLLBACK', 'REDUCE', 'REFRESH', 'CLEAR', 'CACHE', 'UNCACHE', 'DFS', 'TRUNCATE', 'ANALYZE', 'LIST', 'REVOKE', 'GRANT', 'LOCK', 'UNLOCK', 'MSCK', 'EXPORT', 'IMPORT', 'LOAD'}(line 1, pos 0)
== SQL ==
OPTIMIZE flights ZORDER BY (DayofWeek)
^^^
at org.apache.spark.sql.catalyst.parser.ParseException.withCommand(ParseDriver.scala:241)
at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:117)
at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:48)
at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(ParseDriver.scala:69)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
... 59 elided
My configuration is as follows:
$ spark-shell --version
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/_,_/_/ /_/_\ version 2.4.4
/_/
Using Scala version 2.11.12, OpenJDK 64-Bit Server VM, 1.8.0_232
Branch
Compiled by user on 2019-08-27T21:21:38Z
Revision
Url
Type --help for more information.
I launch my spark-shell as follows to use Delta:
spark-shell --packages io.delta:delta-core_2.11:0.5.0
And I run this Scala script (differing only in the schema and path): https://gist.github.com/cakebytheoceanLuo/7056eb64907ac1263e46ce3e1afab852
The error happens at this line:
spark.sql("OPTIMIZE flights ZORDER BY (DayofWeek)")
Everything before this error is fine.
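For reference, here is a minimal sketch of the kind of steps that succeed before the failing statement. This is not the exact gist; the path `/tmp/flights` and the sample data are placeholders, and only the table/column names follow the doc example:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Local session; in spark-shell a `spark` session already exists.
val spark = SparkSession.builder()
  .appName("delta-optimize-repro")
  .master("local[*]")
  .getOrCreate()

// Write a small placeholder Delta table and expose it as a view.
val df = spark.range(0, 100).withColumn("DayofWeek", col("id") % 7)
df.write.format("delta").mode("overwrite").save("/tmp/flights")
spark.read.format("delta").load("/tmp/flights").createOrReplaceTempView("flights")

// This succeeds:
spark.sql("SELECT COUNT(*) FROM flights").show()

// This throws the ParseException above:
spark.sql("OPTIMIZE flights ZORDER BY (DayofWeek)")
```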
Thanks!
Merry Xmas!