Pandas: how to merge two dataframes on multiple columns?



I have two DataFrames, df1 and df2.

df1 contains information about interactions between people:

df1
   Name1 Name2
0   Jack  John
1  Sarah  Jack
2  Sarah   Eva
3    Eva   Tom
4    Eva  John

df2 contains the status of people in general, including some of the people in df1:

df2
    Name  Y
0   Jack  0
1   John  1
2  Sarah  0
3    Tom  1
4  Laura  0

I would like to keep only the rows of df2 for people who appear in df1 (so Laura disappears), and to add a row with NaN for people in df1 who are missing from df2 (i.e. Eva), such as:

df2
    Name    Y
0   Jack    0
1   John    1
2  Sarah    0
3    Tom    1
4    Eva  NaN






python pandas dataframe
















asked Nov 15 '18 at 11:18









emax












  • Please share your dfs as df.to_dict()

    – user32185
    Nov 15 '18 at 11:51
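
For anyone who wants to reproduce the problem, here is a minimal sketch that rebuilds the two frames. The values are copied from the tables in the question; the integer dtype for Y is an assumption, since the question only shows the printed output.

```python
import pandas as pd

# Values copied from the tables in the question.
df1 = pd.DataFrame({
    "Name1": ["Jack", "Sarah", "Sarah", "Eva", "Eva"],
    "Name2": ["John", "Jack", "Eva", "Tom", "John"],
})
df2 = pd.DataFrame({
    "Name": ["Jack", "John", "Sarah", "Tom", "Laura"],
    "Y": [0, 1, 0, 1, 0],  # assumed to be integers
})

# The form requested in the comment: a plain dict that can be pasted
# back into pd.DataFrame(...) to recreate the frame.
print(df1.to_dict(orient="list"))
print(df2.to_dict(orient="list"))
```

Passing either printed dict to pd.DataFrame recreates the corresponding frame.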





























2 Answers


















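Build a DataFrame from the unique values in df1, then map each name to its Y value in df2:

    import numpy as np
    import pandas as pd

    # np.unique flattens df1 and returns the sorted, de-duplicated names
    df = pd.DataFrame(np.unique(df1.values), columns=['Name'])
    # look up Y by name; names missing from df2 map to NaN
    df['Y'] = df['Name'].map(df2.set_index('Name')['Y'])

    print(df)
        Name    Y
    0    Eva  NaN
    1   Jack  0.0
    2   John  1.0
    3  Sarah  0.0
    4    Tom  1.0

Note: np.unique sorts the names, so the original order is not preserved.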
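
If the order matters, one possible variant keeps each name in the order it first appears in df1. This is a sketch rather than part of the original answer: it swaps np.unique for pd.unique, which does not sort, and the names lookup and result are illustrative. The frames are rebuilt from the question's tables so the snippet runs on its own.

```python
import pandas as pd

# Frames rebuilt from the tables in the question.
df1 = pd.DataFrame({
    "Name1": ["Jack", "Sarah", "Sarah", "Eva", "Eva"],
    "Name2": ["John", "Jack", "Eva", "Tom", "John"],
})
df2 = pd.DataFrame({
    "Name": ["Jack", "John", "Sarah", "Tom", "Laura"],
    "Y": [0, 1, 0, 1, 0],
})

# Flatten df1 row by row and keep each name the first time it appears.
# Unlike np.unique, pd.unique does not sort its result.
names = pd.unique(df1[["Name1", "Name2"]].to_numpy().ravel())

# Look up Y for each name; names absent from df2 (Eva) become NaN.
lookup = df2.set_index("Name")["Y"]
result = pd.DataFrame({"Name": names, "Y": lookup.reindex(names).to_numpy()})
print(result)
```

Here the names come out as Jack, John, Sarah, Eva, Tom, which is the order of first appearance in df1 and not the exact order shown in the question.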

















          answered Nov 15 '18 at 11:25









Sandeep Kadapa























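You can create an array of the unique names in df1 and use isin:

    import numpy as np

    names = np.unique(df1[['Name1', 'Name2']].values.ravel())
    # cast to float first so the column can hold NaN
    df2['Y'] = df2['Y'].astype(float)
    # blank out Y for anyone in df2 who does not appear in df1
    df2.loc[~df2['Name'].isin(names), 'Y'] = np.nan

        Name    Y
    0   Jack  0.0
    1   John  1.0
    2  Sarah  0.0
    3    Tom  1.0
    4  Laura  NaN

This keeps every row of df2 and only sets Y to NaN for people who are not in df1 (Laura); it does not add rows for people missing from df2 (Eva).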
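
Since the question asks for a merge, a left merge on the stacked names may be closer to the requested output. The following is a minimal sketch and is not taken from either answer; the names people and result are illustrative, and the frames are rebuilt from the question's tables.

```python
import pandas as pd

# Frames rebuilt from the tables in the question.
df1 = pd.DataFrame({
    "Name1": ["Jack", "Sarah", "Sarah", "Eva", "Eva"],
    "Name2": ["John", "Jack", "Eva", "Tom", "John"],
})
df2 = pd.DataFrame({
    "Name": ["Jack", "John", "Sarah", "Tom", "Laura"],
    "Y": [0, 1, 0, 1, 0],
})

# Stack both name columns into one Series, then drop repeated names.
people = (
    pd.concat([df1["Name1"], df1["Name2"]], ignore_index=True)
    .drop_duplicates()
    .to_frame(name="Name")
)

# how="left" keeps every person from df1: Laura (only in df2) is dropped,
# and Eva (missing from df2) gets NaN in Y.
result = people.merge(df2, on="Name", how="left")
print(result)
```

A left merge keeps the row order of the left frame, so the names appear in the order produced by stacking Name1 on top of Name2.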
















                  answered Nov 15 '18 at 17:12









Vaishali


























