databricks-connect fails with java.lang.IllegalStateException: No api token found in local properties

duliu
New Contributor II

I configured databricks-connect locally and want to run Spark code against a remote cluster.

I verified that `databricks-connect test` passes and that it can connect to the remote Databricks cluster.

However, when I query a table or read Parquet from S3, it fails with `py4j.protocol.Py4JJavaError: An error occurred while calling o22.sql. : java.lang.IllegalStateException: No api token found in local properties`.
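
For reference, this is roughly what I'm running (the table name and S3 path below are placeholders):

from pyspark.sql import SparkSession

# databricks-connect routes this session to the remote cluster
spark = SparkSession.builder.getOrCreate()

# Both of these calls fail with the same "No api token found" error
spark.sql("SELECT * FROM some_database.some_table LIMIT 10").show()
spark.read.parquet("s3://some-bucket/some/path/").show()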

The full stack trace is pasted below:

Traceback (most recent call last):
  File "/Users/<redacted>/Library/Application Support/JetBrains/Toolbox/apps/DataSpell/ch-0/223.8836.46/DataSpell.app/Contents/plugins/python-ce/helpers/pydev/pydevconsole.py", line 364, in runcode
    coro = func()
  File "<input>", line 1, in <module>
  File "/<redacted>/venv/lib/python3.8/site-packages/pyspark/sql/session.py", line 777, in sql
    return DataFrame(self._jsparkSession.sql(sqlQuery), self._wrapped)
  File "/<redacted>/venv/lib/python3.8/site-packages/py4j/java_gateway.py", line 1304, in __call__
    return_value = get_return_value(
  File "/<redacted>/venv/lib/python3.8/site-packages/pyspark/sql/utils.py", line 117, in deco
    return f(*a, **kw)
  File "/<redacted>/venv/lib/python3.8/site-packages/py4j/protocol.py", line 326, in get_return_value
    raise Py4JJavaError(
py4j.protocol.Py4JJavaError: An error occurred while calling o22.sql.
: java.lang.IllegalStateException: No api token found in local properties
	at com.databricks.managedcatalog.ManagedCatalogClientImpl.$anonfun$getToken$1(ManagedCatalogClientImpl.scala:91)
	at scala.Option.getOrElse(Option.scala:189)
	at com.databricks.managedcatalog.ManagedCatalogClientImpl.getToken(ManagedCatalogClientImpl.scala:91)
	at com.databricks.managedcatalog.ManagedCatalogClientImpl.$anonfun$getCatalog$1(ManagedCatalogClientImpl.scala:160)
	at com.databricks.managedcatalog.ManagedCatalogClientImpl.recordMetastoreUsage(ManagedCatalogClientImpl.scala:1777)
	at com.databricks.managedcatalog.ManagedCatalogClientImpl.getCatalog(ManagedCatalogClientImpl.scala:153)
	at com.databricks.sql.managedcatalog.ManagedCatalogCommon.catalogExists(ManagedCatalogCommon.scala:86)
	at com.databricks.sql.managedcatalog.PermissionEnforcingManagedCatalog.catalogExists(PermissionEnforcingManagedCatalog.scala:151)
	at com.databricks.sql.managedcatalog.ManagedCatalogSessionCatalog.catalogExists(ManagedCatalogSessionCatalog.scala:354)
	at com.databricks.sql.DatabricksCatalogManager.isCatalogRegistered(DatabricksCatalogManager.scala:104)
	at org.apache.spark.sql.SparkServiceCatalogV2Handler$.catalogOperationV2(SparkServiceCatalogV2Handler.scala:58)
	at com.databricks.service.SparkServiceImpl$.$anonfun$catalogOperationV2$1(SparkServiceImpl.scala:165)
	at com.databricks.logging.UsageLogging.$anonfun$recordOperation$1(UsageLogging.scala:330)
	at com.databricks.logging.UsageLogging.executeThunkAndCaptureResultTags$1(UsageLogging.scala:424)
	at com.databricks.logging.UsageLogging.$anonfun$recordOperationWithResultTags$4(UsageLogging.scala:444)
	at com.databricks.logging.Log4jUsageLoggingShim$.$anonfun$withAttributionContext$1(Log4jUsageLoggingShim.scala:33)
	at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
	at com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:94)
	at com.databricks.logging.Log4jUsageLoggingShim$.withAttributionContext(Log4jUsageLoggingShim.scala:31)
	at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:205)
	at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:204)
	at com.databricks.spark.util.PublicDBLogging.withAttributionContext(DatabricksSparkUsageLogger.scala:20)
	at com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:240)
	at com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:225)
	at com.databricks.spark.util.PublicDBLogging.withAttributionTags(DatabricksSparkUsageLogger.scala:20)
	at com.databricks.logging.UsageLogging.recordOperationWithResultTags(UsageLogging.scala:419)
	at com.databricks.logging.UsageLogging.recordOperationWithResultTags$(UsageLogging.scala:339)
	at com.databricks.spark.util.PublicDBLogging.recordOperationWithResultTags(DatabricksSparkUsageLogger.scala:20)
	at com.databricks.logging.UsageLogging.recordOperation(UsageLogging.scala:330)
	at com.databricks.logging.UsageLogging.recordOperation$(UsageLogging.scala:302)
	at com.databricks.spark.util.PublicDBLogging.recordOperation(DatabricksSparkUsageLogger.scala:20)
	at com.databricks.spark.util.PublicDBLogging.recordOperation0(DatabricksSparkUsageLogger.scala:57)
	at com.databricks.spark.util.DatabricksSparkUsageLogger.recordOperation(DatabricksSparkUsageLogger.scala:139)
	at com.databricks.spark.util.UsageLogger.recordOperation(UsageLogger.scala:73)
	at com.databricks.spark.util.UsageLogger.recordOperation$(UsageLogger.scala:60)
	at com.databricks.spark.util.DatabricksSparkUsageLogger.recordOperation(DatabricksSparkUsageLogger.scala:98)
	at com.databricks.spark.util.UsageLogging.recordOperation(UsageLogger.scala:431)
	at com.databricks.spark.util.UsageLogging.recordOperation$(UsageLogger.scala:410)
	at com.databricks.service.SparkServiceImpl$.recordOperation(SparkServiceImpl.scala:92)
	at com.databricks.service.SparkServiceImpl$.catalogOperationV2(SparkServiceImpl.scala:165)
	at com.databricks.service.SparkServiceRPCHandler.execute0(SparkServiceRPCHandler.scala:682)
	at com.databricks.service.SparkServiceRPCHandler.$anonfun$executeRPC0$1(SparkServiceRPCHandler.scala:477)
	at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
	at com.databricks.service.SparkServiceRPCHandler.executeRPC0(SparkServiceRPCHandler.scala:372)
	at com.databricks.service.SparkServiceRPCHandler$$anon$2.call(SparkServiceRPCHandler.scala:323)
	at com.databricks.service.SparkServiceRPCHandler$$anon$2.call(SparkServiceRPCHandler.scala:309)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at com.databricks.service.SparkServiceRPCHandler.$anonfun$executeRPC$1(SparkServiceRPCHandler.scala:359)
	at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
	at com.databricks.service.SparkServiceRPCHandler.executeRPC(SparkServiceRPCHandler.scala:336)
	at com.databricks.service.SparkServiceRPCServlet.doPost(SparkServiceRPCServer.scala:167)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:523)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:590)
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:799)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:550)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:190)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
	at org.eclipse.jetty.server.Server.handle(Server.java:516)
	at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388)
	at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633)
	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380)
	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
	at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:338)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:315)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:173)
	at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
	at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:386)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:883)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1034)
	at java.lang.Thread.run(Thread.java:750)

Can anyone help?

6 REPLIES

pvignesh92
Honored Contributor

Hi @Du Liu, have you added the personal access token to the config file? You can refer to the link below for the steps:

https://docs.databricks.com/dev-tools/cli/index.html
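
As a quick sanity check (this assumes the CLI wrote to the default `~/.databrickscfg` location under the `[DEFAULT]` profile), you can confirm that a token is actually present with a few lines of Python:

import configparser
import os

# Read the Databricks CLI config file (default location and profile assumed)
config = configparser.ConfigParser()
config.read(os.path.expanduser("~/.databrickscfg"))

print("host configured:", bool(config.get("DEFAULT", "host", fallback=None)))
print("token configured:", bool(config.get("DEFAULT", "token", fallback=None)))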

Debayan
Esteemed Contributor III

Hi, how have you configured authentication?

You can check: https://docs.databricks.com/dev-tools/api/latest/authentication.html

Please let us know if this helps. 

Also, please tag @Debayan in your next response, which will notify me. Thank you!

Anonymous
Not applicable

Hi @Du Liu,

Hope all is well! Just wanted to check in to see if you were able to resolve your issue. If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help.

We'd love to hear from you.

Thanks!

duliu
New Contributor II

Hi,

I changed cluster access mode from "Assigned" to "No isolation shared" and that error message went away.

I think the problem is that Databricks Connect doesn't support Unity Catalog, and the "No isolation shared" access mode doesn't enable Unity Catalog. But the error message is confusing...

Anonymous
Not applicable

@Du Liu:

The error message suggests that there is no API token found in the local properties. This could be the cause of the failure when trying to access the tables or read parquet files from S3.

To fix this issue, you need to ensure that the API token is set in your local properties. Here are the steps to set the API token:

  1. Open a terminal or command prompt.
  2. Navigate to the directory where you installed the Databricks CLI.
  3. Run the following command to authenticate your CLI:
databricks configure --token

This will prompt you to enter your Databricks API token.

  4. Once you enter the token, it will be saved in your local configuration file, which is located at ~/.databrickscfg.
  5. After setting the API token, try running your Spark code again.

If the above steps do not work, try setting the DATABRICKS_TOKEN environment variable to the API token value. To set the environment variable, run the following command in your terminal or command prompt:

export DATABRICKS_TOKEN=<your-api-token>

Replace <your-api-token> with the actual API token value. Once you have set the environment variable, try running your Spark code again.
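
If neither the config file nor the environment variable is picked up, you can also try passing the connection details to the session explicitly. This is only a sketch based on the `spark.databricks.service.*` properties from the legacy Databricks Connect docs; the host, token and cluster ID values below are placeholders:

from pyspark import SparkConf
from pyspark.sql import SparkSession

# Placeholder values -- replace with your workspace URL, API token and cluster ID
conf = SparkConf()
conf.set("spark.databricks.service.address", "https://<your-workspace>.cloud.databricks.com")
conf.set("spark.databricks.service.token", "<your-api-token>")
conf.set("spark.databricks.service.clusterId", "<your-cluster-id>")

spark = SparkSession.builder.config(conf=conf).getOrCreate()
spark.sql("SELECT 1").show()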

I hope this helps! Let me know if you have any other questions.

Hello! I have the same problem and your solution doesn't work. I've tried both .databrickscfg and DATABRICKS_TOKEN, but I still get this "No api token found" error when trying to access an external table in Unity Catalog, or when doing anything with Unity Catalog at all (for example `USE CATALOG`).

However, if I read the Delta table directly (the table is mounted on DBFS), there is no error.

The problem seems to be with Unity Catalog itself; the connection to Spark is fine, because everything works as long as I avoid Unity Catalog tables.
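
To illustrate (the paths and catalog/table names below are placeholders):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Reading the mounted Delta table directly works fine
spark.read.format("delta").load("/mnt/some-mount/some_table").show()

# Anything that goes through Unity Catalog fails with "No api token found"
spark.sql("USE CATALOG some_catalog")
spark.sql("SELECT * FROM some_catalog.some_schema.some_table").show()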

It's still a problem for me, because we're trying to remove the direct mounts of the tables and force the use of Unity Catalog to manage access rights, and with this bug we can't.
