Cumulative list of a column using groupby









Hi, I have the following DataFrame:



   Fruit  metric
0  Apple     NaN
1  Apple   100.0
2  Apple     NaN
3  Peach    70.0
4   Pear   120.0
5   Pear   100.0
6   Pear     NaN
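
For anyone who wants to reproduce this, the sample frame can be rebuilt roughly as follows. This is a sketch based only on the values shown above; the use of `np.nan` for the missing entries is an assumption about how the original data was created.

```python
import numpy as np
import pandas as pd

# Sample data copied from the table in the question.
# Missing metric values are represented as NaN (assumed).
df = pd.DataFrame({
    "Fruit": ["Apple", "Apple", "Apple", "Peach", "Pear", "Pear", "Pear"],
    "metric": [np.nan, 100.0, np.nan, 70.0, 120.0, 100.0, np.nan],
})
print(df)
```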


My objective is to group by Fruit and, in row order, append each non-null value of metric to a cumulative list stored in its own separate column, like so:



   Fruit  metric  metric_cum
0  Apple     NaN          []
1  Apple   100.0       [100]
2  Apple     NaN       [100]
3  Peach    70.0        [70]
4   Pear   120.0       [120]
5   Pear   100.0  [120, 100]
6   Pear     NaN  [120, 100]


I have tried doing this:



df['metric1'] = df['metric'].astype(str)
df.groupby('Fruit')['metric1'].cumsum()


But this results in a DataError: No numeric types to aggregate.



I have also tried doing this:



df.groupby('Fruit')['metric'].apply(list)


Resulting in:



Fruit
Apple [nan, 100.0, nan]
Peach [70.0]
Pear [120.0, 100.0, nan]
Name: metric, dtype: object


But this is not cumulative and can't be made into a column.
Thanks for your help










Tags: python, list, pandas, dataframe






asked Jun 23 '17 at 11:03 by user3374113






















          2 Answers





















































                  accepted






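Use:

import pandas as pd

# Wrap each non-null value in a one-element list; nulls become empty lists.
df['metric'] = df['metric'].apply(lambda x: [] if pd.isnull(x) else [int(x)])
# Adding lists concatenates them, so a per-group cumsum builds the cumulative list.
# group_keys=False keeps the original row index, so the result aligns with df.
df['metric_cum'] = df.groupby('Fruit', group_keys=False)['metric'].apply(lambda x: x.cumsum())
print(df)

   Fruit metric  metric_cum
0  Apple     []          []
1  Apple  [100]       [100]
2  Apple     []       [100]
3  Peach   [70]        [70]
4   Pear  [120]       [120]
5   Pear  [100]  [120, 100]
6   Pear     []  [120, 100]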


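Or, to leave the original metric column untouched:

# Same list-wrapping as above, but kept in a separate Series.
a = df['metric'].apply(lambda x: [] if pd.isnull(x) else [int(x)])
# group_keys=False keeps the original row index, so the result aligns with df.
df['metric_cum'] = a.groupby(df['Fruit'], group_keys=False).apply(lambda x: x.cumsum())
print(df)

   Fruit  metric  metric_cum
0  Apple     NaN          []
1  Apple   100.0       [100]
2  Apple     NaN       [100]
3  Peach    70.0        [70]
4   Pear   120.0       [120]
5   Pear   100.0  [120, 100]
6   Pear     NaN  [120, 100]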
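
The idea behind both variants is that `+` on lists concatenates them, so a running sum becomes a running list. As an illustration only (not part of the answer above), the same running concatenation can be sketched with `itertools.accumulate`, which does not depend on how `cumsum` treats object columns. The frame is rebuilt from the question's sample data, and the names `wrapped`, `pieces` and `running` are made up for this example.

```python
from itertools import accumulate

import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Fruit": ["Apple", "Apple", "Apple", "Peach", "Pear", "Pear", "Pear"],
    "metric": [np.nan, 100.0, np.nan, 70.0, 120.0, 100.0, np.nan],
})

# One-element list for each non-null value, empty list for nulls.
wrapped = df["metric"].apply(lambda x: [] if pd.isnull(x) else [int(x)])

# Build the running concatenation per group in plain Python, then
# re-attach the original row labels so the column aligns with df.
pieces = []
for _, group in wrapped.groupby(df["Fruit"], sort=False):
    running = list(accumulate(group.tolist(), lambda acc, cur: acc + cur))
    pieces.append(pd.Series(running, index=group.index))

df["metric_cum"] = pd.concat(pieces).reindex(df.index)
print(df["metric_cum"].tolist())
# [[], [100], [100], [70], [120], [120, 100], [120, 100]]
```

Because `acc + cur` creates a new list at every step, the rows do not share one mutable list.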





answered Jun 23 '17 at 11:17 by jezrael































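import pandas as pd

# Turn each value into a list: [] for NaN, [int(value)] otherwise.
f = lambda x: pd.Series(x).dropna().astype(int).tolist()
# A cumulative "sum" of lists concatenates them in row order.
c = pd.Series.cumsum
# group_keys=False keeps the original row index, so assign() can align the result.
df.assign(metric_cum=df.metric.apply(f).groupby(df.Fruit, group_keys=False).apply(c))

   Fruit  metric  metric_cum
0  Apple     NaN          []
1  Apple   100.0       [100]
2  Apple     NaN       [100]
3  Peach    70.0        [70]
4   Pear   120.0       [120]
5   Pear   100.0  [120, 100]
6   Pear     NaN  [120, 100]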
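
For comparison, a single pass over the rows with a dictionary of running lists should give the same column without `groupby` at all, and it also copes with groups whose rows are interleaved. This is a minimal sketch, not something from the answers; `cumulative_lists` is a hypothetical helper name, not a pandas function, and the frame is rebuilt from the question's sample data.

```python
import numpy as np
import pandas as pd


def cumulative_lists(keys, values):
    """For each row, return the non-null values seen so far for that row's key."""
    seen = {}  # key -> list of non-null values seen so far
    out = []
    for key, value in zip(keys, values):
        history = seen.setdefault(key, [])
        if not pd.isnull(value):
            history.append(int(value))
        out.append(list(history))  # copy, so rows do not share one list
    return out


# Sample data copied from the question.
df = pd.DataFrame({
    "Fruit": ["Apple", "Apple", "Apple", "Peach", "Pear", "Pear", "Pear"],
    "metric": [np.nan, 100.0, np.nan, 70.0, 120.0, 100.0, np.nan],
})

df["metric_cum"] = pd.Series(
    cumulative_lists(df["Fruit"], df["metric"]), index=df.index
)
print(df["metric_cum"].tolist())
```

The explicit loop trades the compactness of the one-liners for not relying on list addition inside pandas, which may matter if the column is large or the pandas version is unknown.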





answered Jun 23 '17 at 11:19 by piRSquared



























                               
