
OPTIMIZE error: org.apache.spark.sql.catalyst.parser.ParseException: mismatched input 'OPTIMIZE'

JigaoLuo
New Contributor

Hi everyone.

I am trying to learn the OPTIMIZE keyword from this blog, using Scala: https://docs.databricks.com/delta/optimizations/optimization-examples.html#delta-lake-on-databricks-....

But my local Spark does not seem to be able to parse OPTIMIZE and gives me the following:

scala> spark.sql("OPTIMIZE flights ZORDER BY (DayofWeek)")
org.apache.spark.sql.catalyst.parser.ParseException:
mismatched input 'OPTIMIZE' expecting {'(', 'SELECT', 'FROM', 'ADD', 'DESC', 'WITH', 'VALUES', 'CREATE', 'TABLE', 'INSERT', 'DELETE', 'DESCRIBE', 'EXPLAIN', 'SHOW', 'USE', 'DROP', 'ALTER', 'MAP', 'SET', 'RESET', 'START', 'COMMIT', 'ROLLBACK', 'REDUCE', 'REFRESH', 'CLEAR', 'CACHE', 'UNCACHE', 'DFS', 'TRUNCATE', 'ANALYZE', 'LIST', 'REVOKE', 'GRANT', 'LOCK', 'UNLOCK', 'MSCK', 'EXPORT', 'IMPORT', 'LOAD'}(line 1, pos 0)
== SQL ==
OPTIMIZE flights ZORDER BY (DayofWeek)
^^^
  at org.apache.spark.sql.catalyst.parser.ParseException.withCommand(ParseDriver.scala:241)
  at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:117)
  at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:48)
  at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parsePlan(ParseDriver.scala:69)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
  ... 59 elided

My configuration is as follows:

$ spark-shell --version                                                                                    
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/_,_/_/ /_/_\   version 2.4.4
      /_/
Using Scala version 2.11.12, OpenJDK 64-Bit Server VM, 1.8.0_232
Branch 
Compiled by user  on 2019-08-27T21:21:38Z
Revision 
Url 
Type --help for more information.

I start my spark-shell as follows to use Delta:

spark-shell --packages io.delta:delta-core_2.11:0.5.0

And I run this Scala script (differing only in the schema and path): https://gist.github.com/cakebytheoceanLuo/7056eb64907ac1263e46ce3e1afab852

The error happens at this line:

spark.sql("OPTIMIZE flights ZORDER BY (DayofWeek)")

Everything before this error is fine.

Thanks!

Merry Xmas!

3 REPLIES

Forum_Admin
Contributor

Thanks for the help, @Zonsan

BJGr
New Contributor II

Hi @Jigao Luo

Did you ever get an answer to this?

Anonymous
Not applicable

Hi Jigao,

OPTIMIZE isn't in the open source Delta API (you are running delta-core 0.5.0), so the SQL parser in your local Spark instance doesn't recognize it and the command won't run - https://docs.delta.io/latest/api/scala/io/delta/tables/index.html?search=optimize
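
If the goal is mainly to reduce the number of small files, one possible workaround on open source Delta is to rewrite the table with fewer partitions and mark the commit as not changing data. The snippet below is only a sketch under assumptions: `tablePath` and `numFiles` are hypothetical values that are not taken from the original script, and it is meant to be pasted into the same spark-shell session (where `spark` is already defined). It only compacts files; it does not reproduce the ZORDER BY (DayofWeek) clustering from the failing statement.

```scala
// Assumption: the Delta table was written to this path (hypothetical; use your own).
val tablePath = "/tmp/delta/flights"
// Assumption: target number of files after compaction (hypothetical; tune for your data).
val numFiles = 16

spark.read
  .format("delta")
  .load(tablePath)
  .repartition(numFiles)          // rewrite the same rows into fewer, larger files
  .write
  .format("delta")
  .mode("overwrite")
  .option("dataChange", "false")  // mark the commit as a layout-only rewrite
  .save(tablePath)
```

The old small files stay in the table directory until they are vacuumed, so disk usage will not shrink immediately.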
