OPTIMIZE error: org.apache.spark.sql.catalyst.parser.ParseException: mismatched input 'OPTIMIZE'

JigaoLuo
New Contributor

Hi everyone.

I am trying to learn the OPTIMIZE keyword from this blog post, using Scala: https://docs.databricks.com/delta/optimizations/optimization-examples.html#delta-lake-on-databricks-....

But my local Spark does not seem able to parse OPTIMIZE and gives me the following:

scala> spark.sql("OPTIMIZE flights ZORDER BY (DayofWeek)")
org.apache.spark.sql.catalyst.parser.ParseException:
mismatched input 'OPTIMIZE' expecting {'(', 'SELECT', 'FROM', 'ADD', 'DESC', 'WITH', 'VALUES', 'CREATE', 'TABLE', 'INSERT', 'DELETE', 'DESCRIBE', 'EXPLAIN', 'SHOW', 'USE', 'DROP', 'ALTER', 'MAP', 'SET', 'RESET', 'START', 'COMMIT', 'ROLLBACK', 'REDUCE', 'REFRESH', 'CLEAR', 'CACHE', 'UNCACHE', 'DFS', 'TRUNCATE', 'ANALYZE', 'LIST', 'REVOKE', 'GRANT', 'LOCK', 'UNLOCK', 'MSCK', 'EXPORT', 'IMPORT', 'LOAD'}(line 1, pos 0)
== SQL ==
OPTIMIZE flights ZORDER BY (DayofWeek)
^^^
  at org.apache.spark.sql.catalyst.parser.ParseException.withCommand(ParseDriver.scala:241)
  at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:117)
  at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:48)
  at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(ParseDriver.scala:69)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
  ... 59 elided

My configuration is as follows:

$ spark-shell --version                                                                                    
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.4.4
      /_/
Using Scala version 2.11.12, OpenJDK 64-Bit Server VM, 1.8.0_232
Branch 
Compiled by user  on 2019-08-27T21:21:38Z
Revision 
Url 
Type --help for more information.

I trigger my spark-shell with the following to use Delta:

spark-shell --packages io.delta:delta-core_2.11:0.5.0
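
As a sanity check, something like the following confirms the Delta data source is on the classpath (the /tmp path here is only an illustrative placeholder):

// Write and read back a tiny Delta table locally.
spark.range(5).write.format("delta").mode("overwrite").save("/tmp/delta-sanity-check")
spark.read.format("delta").load("/tmp/delta-sanity-check").show()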

And I run this Scala script (differing only in the schema and path): https://gist.github.com/cakebytheoceanLuo/7056eb64907ac1263e46ce3e1afab852

The error happens at this line:

spark.sql("OPTIMIZE flights ZORDER BY (DayofWeek)")

Everything before this error is fine.

Thanks!

Merry Xmas!

3 REPLIES

Forum_Admin
Contributor

Thanks for the help, @Zonsan

BJGr
New Contributor II

Hi @Jigao Luo

Did you ever get an answer to this?

Anonymous
Not applicable

Hi Jigao,

OPTIMIZE isn't part of the open-source Delta Lake API, so it won't run on your local Spark instance - https://docs.delta.io/latest/api/scala/io/delta/tables/index.html?search=optimize
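
If the goal is just to reduce the number of small files, open-source Delta does support manual compaction by rewriting the table with the dataChange option set to false (available since Delta 0.5.0). A minimal sketch, assuming your flights table lives at some path (path and numFiles below are placeholders to adjust for your table):

// Compact the Delta table at `path` into fewer, larger files.
// dataChange=false tells Delta the rewrite does not change the data,
// so concurrent readers/streams are not disturbed.
val path = "/tmp/flights"
val numFiles = 16

spark.read
  .format("delta")
  .load(path)
  .repartition(numFiles)
  .write
  .option("dataChange", "false")
  .format("delta")
  .mode("overwrite")
  .save(path)

Z-ordering itself was Databricks-only at the time, so there is no open-source equivalent of the ZORDER BY clause in Delta 0.5.0.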
