Using tidyverse to loop over all rows and keep only the highest value in each row











I work with people in psychology, where factor analysis is a routine procedure. I have a dataset like the following one:

[image: original dataset]

I want to keep only the highest value in each row and turn all other values into missing values (NA):

[image: new dataset]

I am aware that dplyr can probably solve this easily, but I could not find simple code to do it.

The code below reproduces the dataset:

    library(tidyverse)
    set.seed(123)
    # three columns of uniform random values
    ds <- data.frame(x1 = runif(10, min = 0.1, max = 0.29),
                     x2 = runif(10, min = 0.1, max = 0.35),
                     x3 = runif(10, min = 0.1, max = 0.38))
    # round every column to 3 decimals; across() (dplyr >= 1.0.0) replaces the deprecated mutate_all(funs(...))
    ds <- ds %>% mutate(across(everything(), ~ round(.x, 3)))

    ds

I hope this question can also help other people with the same (or a similar) problem. I searched before asking and found only one closely related topic here.

Thanks.







Tags: loops, dplyr, tidyverse






asked Nov 12 '18 at 19:13 by Luis

2 Answers

A very quick answer would be:

Use the base R function pmax() to get the row-wise maximum, then if_else() inside mutate() to keep each value or set it to missing.

    ds %>%
      # find the row-wise maximum and store it temporarily as a column
      mutate(row_max = pmax(x1, x2, x3)) %>%
      # go through the data columns and check whether each value equals the row maximum;
      # if yes, leave it as is, otherwise set it to NA
      # (across() needs dplyr >= 1.0.0 and replaces the deprecated mutate_all(funs(...)))
      mutate(across(x1:x3, ~ if_else(.x == row_max, .x, NA_real_))) %>%
      # remove the temporary row_max column
      select(-row_max)

          x1    x2    x3
    1     NA    NA 0.349
    2     NA    NA 0.294
    3     NA    NA 0.279
    4     NA    NA 0.378
    5     NA    NA 0.284
    6     NA 0.325    NA
    7     NA    NA 0.252
    8  0.270    NA    NA
    9  0.205    NA    NA
    10    NA 0.339    NA
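
The pipeline above names x1, x2 and x3 explicitly. As a minimal sketch of the same idea for any number of columns, the row-wise maximum could be computed with do.call(pmax, df). The helper name keep_row_max and the example data frame scores are hypothetical, and the sketch assumes that all columns are numeric (double), that there are no missing values, and that dplyr >= 1.0.0 is installed.

```r
library(dplyr)

# Hypothetical helper (not part of the answer above): keep the largest
# value in each row and replace every other value with NA.
keep_row_max <- function(df) {
  # do.call() passes every column to pmax(), giving one maximum per row
  row_max <- do.call(pmax, df)
  df %>%
    mutate(across(everything(), ~ if_else(.x == row_max, .x, NA_real_)))
}

# Made-up example data
scores <- data.frame(a = c(1, 5), b = c(3, 2))
keep_row_max(scores)
```

If a row contains its maximum more than once, every tied value is kept, which matches the behaviour of the pipeline above.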






answered Nov 12 '18 at 20:06 by Lefkios Paikousis

• Thanks a lot! It was exactly what I was looking for!
  – Luis, Nov 13 '18 at 1:22


















Since this place is so supportive, I decided to answer my own question after reading @Lefkios-Paikousis's answer. In practice, a factor analysis yields both positive and negative values, and we need to keep the value with the largest absolute value while preserving its sign. For example, -0.4 is larger in absolute value than 0.2, so -0.4 should be kept.

I wrote the following code to do what I want. I hope it helps other people with similar questions.

    library(tidyverse)
    set.seed(123)
    ds <- data.frame(x1 = runif(10, min = 0.1, max = 0.29),
                     x2 = runif(10, min = 0.1, max = 0.35),
                     x3 = runif(10, min = 0.1, max = 0.38))
    # round to 3 decimals; across() (dplyr >= 1.0.0) replaces the deprecated mutate_all(funs(...))
    ds <- ds %>% mutate(across(everything(), ~ round(.x, 3)))
    ds <- ds %>% mutate(x1 = x1 * -1) # make x1 negative

    # pmax() and pmin() already work row by row, so rowwise() is not needed
    ds <- ds %>%
      mutate(row_max = pmax(x1, x2, x3)) %>% # highest value in each row
      mutate(row_min = pmin(x1, x2, x3)) %>% # lowest value in each row
      # value with the largest absolute value, sign preserved
      mutate(keep = if_else(abs(row_max) > abs(row_min), row_max, row_min)) %>%
      # keep only that value and set everything else to NA
      mutate(across(x1:x3, ~ if_else(.x == keep, .x, NA_real_))) %>%
      select(-c(row_max, row_min, keep)) # drop the helper columns

[image: raw dataset]

[image: transformed dataset]

Thanks
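
As a possible variation on the code above, the comparison could be done on absolute values directly, which avoids the helper columns and does not require naming each column. This is only a sketch: the helper name keep_largest_abs and the data frame loadings are hypothetical, and it assumes numeric (double) columns without missing values and dplyr >= 1.0.0. It also differs from the code above in one case: when a row holds two values with the same absolute value (say -0.3 and 0.3), this version keeps both, while the code above keeps only the negative one.

```r
library(dplyr)

# Hypothetical helper (not from the answers above): keep, in each row, the
# value(s) with the largest absolute value and set the rest to NA.
keep_largest_abs <- function(df) {
  # row-wise maximum of the absolute values
  abs_max <- do.call(pmax, abs(df))
  df %>%
    mutate(across(everything(), ~ if_else(abs(.x) == abs_max, .x, NA_real_)))
}

# Made-up example data
loadings <- data.frame(f1 = c(-0.4, 0.1), f2 = c(0.2, 0.3))
keep_largest_abs(loadings)
```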







answered Nov 14 '18 at 15:33 by Luis
