05-16-2023 11:50 AM
Hello,
I have been working on this issue as a proof of concept. It would be extremely helpful to iterate through tables via loops in a few scenarios. I have a simple three-column dimension that I added to a cached table.
cache lazy table hedis_cache select * from hofhc.hedis_dim
I then tried the following two methods. The first comes up empty, whereas the second returns the data as a DataFrame, not a Python dictionary.
Any advice? Thanks in advance for all the help!
Accepted Solutions
05-16-2023 10:35 PM
@Andrew Begg
First of all, CACHE TABLE creates a view, not a table, so you won't be able to use pd.read_table, which reads delimited files rather than Spark views (https://docs.databricks.com/sql/language-manual/sql-ref-syntax-aux-cache-cache-table.html).
As for the second method, .select() still returns a DataFrame. You need to convert the DataFrame to a dict.
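One possible way to do that conversion without pandas is sketched below. It relies on `collect()` and `Row.asDict()` from PySpark; the helper names `table_to_records` and `index_by` are made up for illustration, and `spark` is assumed to be the session object that Databricks notebooks provide. Because `collect()` pulls every row to the driver, this is only reasonable for small tables such as a three-column dimension.

```python
def table_to_records(spark, table_name):
    """Return every row of a Spark table or view as a plain Python dict."""
    # collect() brings all rows to the driver, so keep this to small tables.
    rows = spark.table(table_name).collect()
    # Row.asDict() turns each PySpark Row into an ordinary dict.
    return [row.asDict() for row in rows]


def index_by(records, key_column):
    """Build a lookup dict keyed on one column (later duplicates overwrite earlier ones)."""
    return {record[key_column]: record for record in records}


# Hypothetical notebook usage, assuming the cached view from the question:
#   records = table_to_records(spark, "hedis_cache")
#   for record in records:
#       print(record)
```

The list of dicts can be looped over directly, and `index_by` gives a keyed lookup if one of the columns is an identifier.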
05-17-2023 07:34 AM
Got it to work, thank you for the tip! I needed to convert the DataFrame to a pandas DataFrame first:
https://www.geeksforgeeks.org/convert-pyspark-dataframe-to-dictionary-in-python/
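For readers who want the gist without following the link, the pandas route might look roughly like this. It is only a sketch: the function name `spark_df_to_dicts` and the `key_column` argument are illustrative, not taken from the thread, and it assumes the DataFrame is small enough to fit on the driver.

```python
def spark_df_to_dicts(sdf, key_column):
    """Convert a small Spark DataFrame to Python dicts by way of pandas."""
    # toPandas() collects the whole DataFrame onto the driver.
    pdf = sdf.toPandas()
    # One dict per row: [{"col": value, ...}, ...]
    records = pdf.to_dict(orient="records")
    # One entry per key value: {key: {"other_col": value, ...}, ...}
    # pandas raises ValueError here if key_column holds duplicate values.
    by_key = pdf.set_index(key_column).to_dict(orient="index")
    return records, by_key


# Hypothetical notebook usage, assuming the cached view from the question
# and a made-up key column name:
#   records, by_key = spark_df_to_dicts(spark.table("hedis_cache"), "measure_id")
```

`orient="records"` suits row-by-row loops, while `orient="index"` is handier when rows are looked up by a key.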