<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Python: Generate new dfs from a list of dataframes using for loop in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/python-generate-new-dfs-from-a-list-of-dataframes-using-for-loop/m-p/21651#M14784</link>
    <description>&lt;P&gt;thanks&lt;/P&gt;</description>
    <pubDate>Fri, 02 Dec 2022 09:44:45 GMT</pubDate>
    <dc:creator>Aviral-Bhardwaj</dc:creator>
    <dc:date>2022-12-02T09:44:45Z</dc:date>
    <item>
      <title>Python: Generate new dfs from a list of dataframes using for loop</title>
      <link>https://community.databricks.com/t5/data-engineering/python-generate-new-dfs-from-a-list-of-dataframes-using-for-loop/m-p/21650#M14783</link>
      <description>&lt;P&gt;I have a list of dataframes (two in this example) and want to loop over it to generate two new dataframes.&lt;/P&gt;&lt;P&gt;Here is my starting dataframe, df_final:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="df_long"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/1890i377ADE15D0D6A2CF/image-size/large?v=v2&amp;amp;px=999" role="button" title="df_long" alt="df_long" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;First, I create two dataframes, df2_&lt;B&gt;b2c&lt;/B&gt;_fast and df2_&lt;B&gt;b2b&lt;/B&gt;_fast:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;for x in df_final['b2b_b2c_prod'].unique():
    locals()['df2_' + x ] = df_final[(df_final['b2b_b2c_prod'] == x ) ]&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Second, I calculate the Spearman correlation between adv and pkg_yld:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;df_corrs_fast_b2c=df2_b2c_fast.groupby(['bus_nm','id']).corr(method='spearman').unstack().iloc[:,1]
df_corrs_fast_b2b=df2_b2b_fast.groupby(['bus_nm','id']).corr(method='spearman').unstack().iloc[:,1]&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Third, I convert each result to a dataframe:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;corrs_b2b_fast = (df2_b2b_fast[['adv', 'id']]
        .groupby('id')
        .corrwith(df1['pkg_yld'])
        .rename(columns={'adv' : 'correl'})
        .reset_index())     
 
corrs_b2c_fast = (df2_b2c_fast[['adv', 'id']]
        .groupby('id')
        .corrwith(df1['pkg_yld'])
        .rename(columns={'adv' : 'correl'})
        .reset_index())&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Here is one of the two dataframes, corrs_b2c_fast:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="view"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/1905i88CF1921E3EB0D76/image-size/large?v=v2&amp;amp;px=999" role="button" title="view" alt="view" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;B&gt;&lt;U&gt;Question:&lt;/U&gt;&lt;/B&gt; I want to consolidate these three steps into a single for loop: (1) extract the two dataframes, (2) estimate the correlations, and (3) convert the results to two dataframes (corrs_b2b_fast, corrs_b2c_fast). I started with the code below but got stuck:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;df_list=[df2_b2b_fast, df2_b2c_fast]      # Subset of dfs
for x in df_list['b2b_b2c_prod']:  
locals()['corrs_' + x ] = df_list[(df_list['b2b_b2c_prod'] == x ) ]     # Create new 2 dfs from main df 'b2b_b2c_prod'
x= x.groupby(['bus_nm','id']).corr(method='spearman').unstack().iloc[:,1]  # calculate corr between pkg_yld and ADV 
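# One possible way to finish this loop, offered as a sketch and not as a definitive answer.
# Assumptions (none of this is from the original post): the frame has the columns
# 'b2b_b2c_prod', 'id', 'adv' and 'pkg_yld'; toy_final stands in for df_final;
# the names toy_final and corrs are hypothetical.
import pandas as pd

toy_final = pd.DataFrame({
    'b2b_b2c_prod': ['b2b_fast'] * 6 + ['b2c_fast'] * 6,
    'id': [1, 1, 1, 2, 2, 2] * 2,
    'adv': [1.0, 2.0, 3.0, 1.0, 2.0, 3.0] * 2,
    'pkg_yld': [2.0, 4.0, 6.0, 3.0, 2.0, 1.0] * 2,
})

corrs = {}  # a dict keyed by segment name stands in for the locals() assignments
for name, frame in toy_final.groupby('b2b_b2c_prod'):
    # Steps (1) to (3) in one pass: subset by segment, compute the Spearman
    # correlation of adv and pkg_yld per id, and return a tidy dataframe.
    corrs[name] = (
        frame.groupby('id')[['adv', 'pkg_yld']]
        .apply(lambda g: g.corr(method='spearman').loc['adv', 'pkg_yld'])
        .rename('correl')
        .reset_index()
    )

# Individual names can still be bound from the dict if they are needed.
corrs_b2b_fast = corrs['b2b_fast']
corrs_b2c_fast = corrs['b2c_fast']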
  # stuck here...lines needed to create dataframes corrs_b2b_fast, corrs_b2c_fast??&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;What's wrong?&lt;/P&gt;</description>
      <pubDate>Mon, 02 May 2022 13:43:59 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/python-generate-new-dfs-from-a-list-of-dataframes-using-for-loop/m-p/21650#M14783</guid>
      <dc:creator>Jack</dc:creator>
      <dc:date>2022-05-02T13:43:59Z</dc:date>
    </item>
    <item>
      <title>Re: Python: Generate new dfs from a list of dataframes using for loop</title>
      <link>https://community.databricks.com/t5/data-engineering/python-generate-new-dfs-from-a-list-of-dataframes-using-for-loop/m-p/21651#M14784</link>
      <description>&lt;P&gt;thanks&lt;/P&gt;</description>
      <pubDate>Fri, 02 Dec 2022 09:44:45 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/python-generate-new-dfs-from-a-list-of-dataframes-using-for-loop/m-p/21651#M14784</guid>
      <dc:creator>Aviral-Bhardwaj</dc:creator>
      <dc:date>2022-12-02T09:44:45Z</dc:date>
    </item>
  </channel>
</rss>

