How to access tables created in 2017

Bill
New Contributor III

In 2017, while working on my master's degree, I created some tables that I would like to access again. Back then I could just write SQL and find them, but today that doesn't work. I suspect it has something to do with Delta Lake.

What do I have to do to gain access to these tables?

Hubert-Dudek
Databricks MVP

Hi, those tables are stored as Parquet files. Just load the data as a DataFrame:

df = spark.read.parquet("dbfs:/hive/warehouse/congresskmean/")

Alternatively, you can register those files as a table, but you will need to specify the schema:

CREATE TABLE table_name
(schema)
USING PARQUET
LOCATION '/hive/warehouse/congresskmean/';
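If you are not sure what to put in place of the schema placeholder, one option is to read the files once and reuse the column names and types that Spark reports. The sketch below is only an illustration: build_create_table is a hypothetical helper, not part of any Databricks or Spark API, and it assumes the columns have simple types whose names are valid in a table definition. The example columns are made up for the demonstration and are not taken from the real table.

```python
def build_create_table(table_name, location, dtypes):
    """Build a CREATE TABLE ... USING PARQUET statement as a string.

    dtypes is a list of (column_name, type_string) pairs, which is the
    shape returned by DataFrame.dtypes in PySpark.
    """
    # Quote column names with backticks so unusual names stay valid.
    columns = ",\n  ".join(f"`{name}` {dtype}" for name, dtype in dtypes)
    return (
        f"CREATE TABLE {table_name} (\n  {columns}\n)\n"
        f"USING PARQUET\n"
        f"LOCATION '{location}'"
    )


# In a Databricks notebook, where `spark` is already defined, usage could be:
#   df = spark.read.parquet("dbfs:/hive/warehouse/congresskmean/")
#   spark.sql(build_create_table("congresskmean",
#                                "/hive/warehouse/congresskmean/", df.dtypes))

# Stand-alone demonstration with hypothetical columns:
example = build_create_table(
    "congresskmean",
    "/hive/warehouse/congresskmean/",
    [("member", "string"), ("cluster", "int")],
)
print(example)
```

The helper only builds text, so you can print and review the statement before running it.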


My blog: https://databrickster.medium.com/

Bill
New Contributor III

Sorry, I forgot to mention that I am doing this in R, and this Python function returns a pyspark.sql.dataframe.DataFrame, which I can't access in R. With the information you provided, I found some code that was supposed to work in R, but install.packages fails.

install.packages("arrow")
library(arrow)
read_parquet("myfile.parquet")

install.packages("arrow", repos = "https://arrow-r-nightly.s3.amazonaws.com")

The development version also fails to install.

Hubert-Dudek
Databricks MVP

You don't need to install anything. For example, you can register the table from R using SQL, as in the example above. You can also read the files as a DataFrame in R out of the box. There are many examples here:

https://docs.databricks.com/spark/latest/sparkr/overview.html




Bill
New Contributor III

That did it. Thanks

Hubert-Dudek
Databricks MVP

Great that it helped. If you can, please select my answer as the best one 🙂

