04-14-2022 02:34 AM
I'm using DBR 10 or later and I'm getting an error when running the following query:
SELECT * FROM delta.`s3://some_path`
It fails with: org.apache.spark.SparkException: Unable to fetch tables of db delta
For Spark 3.2.0+ the docs recommend reading like this:
CREATE TEMPORARY VIEW parquetTable
USING org.apache.spark.sql.parquet
OPTIONS (
  path "examples/src/main/resources/people.parquet"
);

SELECT * FROM parquetTable;
Can you confirm this is the only way?
05-11-2022 05:46 AM
Got support from Databricks.
Unfortunately, someone had created a database called delta, so the query was resolved against that database instead of reading the Delta table at the path.
Issue was solved.
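For anyone hitting the same error, a minimal sketch of how to diagnose the conflict. This is an assumption based on the resolution above, not an official fix; only drop the database if you are sure nothing depends on it:

```sql
-- If a database named "delta" exists, SELECT * FROM delta.`<path>`
-- resolves against that database instead of the Delta path syntax.
SHOW DATABASES LIKE 'delta';

-- Optional: remove the conflicting database (ONLY if it is safe to do so).
-- DROP DATABASE delta;

-- With no "delta" database shadowing the syntax, the path-based read works:
SELECT * FROM delta.`s3://some_path`;
```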
04-14-2022 07:37 AM
@Cristobal Berger, Databricks uses dbfs, so if you want to use a path to read the data, you should use the dbfs path.
Using a view works too, btw (or define it as a table).
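The table option mentioned above can be sketched like this. The table name is hypothetical, and the path is the one from the original question; this assumes the data at that location is in Delta format:

```sql
-- Register an unmanaged table over the existing Delta files
-- (my_delta_table is a placeholder name).
CREATE TABLE my_delta_table
USING DELTA
LOCATION 's3://some_path';

-- Query it by name, avoiding the delta.`<path>` syntax entirely.
SELECT * FROM my_delta_table;
```

This sidesteps any ambiguity with a database named delta, since the query no longer uses the path-based shorthand.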
04-18-2022 02:02 AM
Hi @Werner Stinckens, thanks for replying.
Actually, you can read directly from S3 in PySpark and Spark SQL; the Amazon S3 documentation shows how to set it up. However, it looks like from Spark 3.2 (DBR 10 or later) it's no longer possible to use the syntactic sugar in the FROM clause. That's what I need to confirm.
Thanks
05-04-2022 10:32 AM
Hello @Cristobal Berger, I could not reproduce this using DBR 10; I think you may be doing something wrong.