07-22-2024 07:44 AM
If we want to read all the data from the Databricks tables at a single time, how can we do it?
07-22-2024 07:59 AM - edited 07-22-2024 08:02 AM
Hi @Krishna2110,
It's a bit unclear to me what your problem is. If you don't use any filter, then all of the data will be read into the data frame, as below:
df = spark.read.table('my_table')  # reads the full table into a DataFrame
There is, however, a limit on the number of rows displayed in the UI, so maybe that's why you think not all of the data was read?
Or maybe you are asking about a situation where you have a set of different tables with the same schema and you would like to query all of them? In that case you can iterate over the tables, read each one, and union the results.
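For example, here's a minimal sketch of that approach (the table names are placeholders, and all tables are assumed to share the same schema):

from functools import reduce

# Placeholder names - replace with your own catalog.schema.table identifiers.
table_names = [
    "my_catalog.my_schema.table_a",
    "my_catalog.my_schema.table_b",
    "my_catalog.my_schema.table_c",
]

# Read each table into a DataFrame, then union them by column name.
dfs = [spark.table(name) for name in table_names]
combined_df = reduce(lambda left, right: left.unionByName(right), dfs)

print(combined_df.count())  # total number of rows across all tables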
07-22-2024 08:10 AM
Thank you for your input.
If there are 40 tables in the same catalog, I want to read the data, or the schema, of all of those tables in the same command cell with the help of PySpark.
I have written code for this, but it was throwing an error.
07-22-2024 08:30 AM
Yeah, sure. I'll send you the code once I'm home.
07-22-2024 09:00 AM
Hi @Krishna2110,
Here it is; it should work now:
# List all tables in the ewt_edp_prod.crm_raw schema.
tables = spark.sql("SHOW TABLES IN ewt_edp_prod.crm_raw").collect()

for row in tables:
    # SHOW TABLES returns (database, tableName, isTemporary) per row,
    # so row[0] is the schema name and row[1] is the table name.
    table_name = f"ewt_edp_prod.{row[0]}.{row[1]}"
    try:
        df = spark.table(table_name)
        count = df.count()
        print(f"Table {table_name} is accessible and has {count} rows.")
    except Exception as e:
        print(f"Error accessing table {table_name}: {e}")