OPTIMIZE error: org.apache.spark.sql.catalyst.parser.ParseException: mismatched input 'OPTIMIZE'
12-25-2019 04:01 AM
Hi everyone.
I am trying to learn the OPTIMIZE keyword from this blog, using Scala: https://docs.databricks.com/delta/optimizations/optimization-examples.html#delta-lake-on-databricks-....
But my local Spark does not seem able to parse OPTIMIZE and gives me the following:
scala> spark.sql("OPTIMIZE flights ZORDER BY (DayofWeek)")
org.apache.spark.sql.catalyst.parser.ParseException:
mismatched input 'OPTIMIZE' expecting {'(', 'SELECT', 'FROM', 'ADD', 'DESC', 'WITH', 'VALUES', 'CREATE', 'TABLE', 'INSERT', 'DELETE', 'DESCRIBE', 'EXPLAIN', 'SHOW', 'USE', 'DROP', 'ALTER', 'MAP', 'SET', 'RESET', 'START', 'COMMIT', 'ROLLBACK', 'REDUCE', 'REFRESH', 'CLEAR', 'CACHE', 'UNCACHE', 'DFS', 'TRUNCATE', 'ANALYZE', 'LIST', 'REVOKE', 'GRANT', 'LOCK', 'UNLOCK', 'MSCK', 'EXPORT', 'IMPORT', 'LOAD'}(line 1, pos 0)
== SQL ==
OPTIMIZE flights ZORDER BY (DayofWeek)
^^^
at org.apache.spark.sql.catalyst.parser.ParseException.withCommand(ParseDriver.scala:241)
at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:117)
at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:48)
at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(ParseDriver.scala:69)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
... 59 elided
My configuration is as follows:
$ spark-shell --version
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.4.4
      /_/
Using Scala version 2.11.12, OpenJDK 64-Bit Server VM, 1.8.0_232
Branch
Compiled by user on 2019-08-27T21:21:38Z
Revision
Url
Type --help for more information.
I launch my spark-shell as follows to use Delta:
spark-shell --packages io.delta:delta-core_2.11:0.5.0
And I run this Scala script (it differs only in the schema and path): https://gist.github.com/cakebytheoceanLuo/7056eb64907ac1263e46ce3e1afab852
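In outline it does roughly the following (the paths and schema here are placeholders of mine, not the exact gist contents):

// Write the source CSV out as a partitioned Delta table.
spark.read.format("csv")
  .option("header", "true")
  .option("inferSchema", "true")
  .load("/tmp/flights/2008.csv")
  .write.format("delta")
  .mode("overwrite")
  .partitionBy("Origin")
  .save("/tmp/flights_delta")

// Register a table name over the Delta path so it can be queried in SQL.
spark.sql("DROP TABLE IF EXISTS flights")
spark.sql("CREATE TABLE flights USING DELTA LOCATION '/tmp/flights_delta'")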
The error happens at this line:
spark.sql("OPTIMIZE flights ZORDER BY (DayofWeek)")
Everything before this error is fine.
Thanks!
Merry Xmas!
05-13-2020 10:05 AM
Hi @Jigao Luo
Did you ever get an answer to this?

05-13-2020 02:30 PM
Hi Jigao,
OPTIMIZE isn't in the open-source Delta Lake API, so it won't run on your local Spark instance - https://docs.delta.io/latest/api/scala/io/delta/tables/index.html?search=optimize
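If what you're after locally is just file compaction, the open-source Delta docs describe a manual compaction pattern that works without OPTIMIZE. A rough sketch (the path is a placeholder, the repartition count needs tuning to your data, and note there is no open-source equivalent of ZORDER here):

// Compact a Delta table by rewriting its data into fewer, larger files.
val path = "/tmp/flights_delta"
spark.read.format("delta")
  .load(path)
  .repartition(16)                 // target number of output files
  .write
  .format("delta")
  .option("dataChange", "false")   // rewrite only; the table's data is unchanged
  .mode("overwrite")
  .save(path)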

