Hi Guys,
I am working on streaming data movement from bronze to silver. My bronze table has an entity_name column, and I need to create multiple silver tables based on its values.
I tried the approach below, but it fails with the error "'GroupedData' object has no attribute 'get_group'".
Sample code snippet:
grouped_df = bronze_df.groupBy("entity_name")
entity_names = [row.PrimaryEntityName for row in grouped_df.agg({"entity_name": "first"}).collect()]
for entity_name in entity_names:
    # This line raises the AttributeError quoted above
    entity_df = grouped_df.get_group(entity_name)
I think a where/filter clause would do the job, but in my view it would not be an efficient solution. Is there any other approach for doing this?
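For reference, here is a minimal sketch of the filter-based fallback I have in mind, applied per micro-batch through foreachBatch. Everything beyond bronze_df and entity_name is an assumption made for illustration: the helper names silver_table_name, route_batch and start_routing, the silver_<entity_name> table naming, the Delta format, and the checkpoint path are all hypothetical and not taken from my actual job.

```python
# Sketch only: route each micro-batch to one silver table per entity_name.
# Assumed (hypothetical): Delta target tables named silver_<entity_name>,
# and a checkpoint path supplied by the caller.

def silver_table_name(entity_name):
    # Hypothetical naming convention for the target tables.
    return "silver_" + str(entity_name).strip().lower().replace(" ", "_")


def route_batch(batch_df, batch_id):
    # Cache the micro-batch so the per-entity filters below do not
    # recompute it from the source for every entity.
    batch_df.persist()
    try:
        # Only the distinct, non-null entity names are collected to the driver.
        names = [
            row["entity_name"]
            for row in batch_df.select("entity_name").distinct().collect()
            if row["entity_name"] is not None
        ]
        for name in names:
            # One filtered append per entity found in this micro-batch.
            (batch_df
                .filter(batch_df["entity_name"] == name)
                .write
                .format("delta")
                .mode("append")
                .saveAsTable(silver_table_name(name)))
    finally:
        batch_df.unpersist()


def start_routing(bronze_df, checkpoint_path):
    # foreachBatch passes every micro-batch to route_batch as a regular
    # (non-streaming) DataFrame together with its batch id.
    return (bronze_df.writeStream
            .foreachBatch(route_batch)
            .option("checkpointLocation", checkpoint_path)
            .start())
```

Usage would be something like start_routing(bronze_df, "/some/checkpoint/path"), where the path is a placeholder. This still runs one filter per entity in each micro-batch, which is the part I am worried about.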
TIA.