04-14-2022 02:34 AM
I'm using DBR 10 or later and I'm getting an error when running the following query:
SELECT * FROM delta.`s3://some_path`
It fails with org.apache.spark.SparkException: Unable to fetch tables of db delta.
For Spark 3.2.0+, the documentation recommends reading like this:
CREATE TEMPORARY VIEW parquetTable
USING org.apache.spark.sql.parquet
OPTIONS (
path "examples/src/main/resources/people.parquet"
)
SELECT * FROM parquetTable
Can you confirm this is the only way?
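For reference, the Delta equivalent of that workaround would presumably be a temporary view defined with USING delta (a sketch, untested; the S3 path is the placeholder from the question, not a real location):

```sql
-- Define a temporary view over the Delta files at a path,
-- then query it by name instead of using the delta.`path` syntax.
CREATE TEMPORARY VIEW deltaTable
USING delta
OPTIONS (
  path "s3://some_path"
);

SELECT * FROM deltaTable;
```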
04-14-2022 07:37 AM
@Cristobal Berger, Databricks uses dbfs, so if you want to use a path to read the data, you should use the dbfs path.
Using a view works too, btw (or define it as a table).
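Defining it as a table, as suggested above, might look like this (a sketch; the table name is hypothetical and the path is the placeholder from the question):

```sql
-- Register an external table over the Delta files at the path,
-- then query it by name.
CREATE TABLE IF NOT EXISTS my_delta_table
USING DELTA
LOCATION 's3://some_path';

SELECT * FROM my_delta_table;
```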
04-18-2022 02:02 AM
Hi @Werner Stinckens, thanks for replying.
Actually, you can read directly from S3 in PySpark and Spark SQL; the Amazon S3 documentation shows how to do it. Now it looks like, from Spark 3.2 (DBR 10 or later), it's not possible to use the syntactic sugar in the FROM clause. That's what I need to confirm.
Thanks
05-11-2022 04:30 AM
Hi @Cristobal Berger, just a friendly follow-up. Do you still need help? Please let us know.
05-11-2022 05:46 AM
Got support from Databricks.
Unfortunately, someone had created a DB called delta, so the query was resolved against that DB instead of the S3 path.
Issue was solved
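Given that resolution, one way to check for such a shadowing database is a sketch like this (assuming standard Spark SQL commands):

```sql
-- If a database named `delta` exists, a query like delta.`s3://...`
-- is resolved as a table lookup in that database rather than a
-- path-based Delta read.
SHOW DATABASES LIKE 'delta';

-- Dropping or renaming the offending database restores the path syntax
-- (only do this if the database is genuinely unwanted):
-- DROP DATABASE delta;
```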
05-11-2022 08:13 AM
Thank you for the update @Cristobal Berger!