Conditional filtering using grepl and relative row position in group










1














I have a dataset similar to the following:



Journal_ref <- c("1111","2222","2222","2222","3333","3333","4444","4444")
Journal_type <- c("Adj","Rev","Adj","Rev","Rev","Rev","Adj","Adj")
Journal_value <- c(90,10000,12000,80,9000,500,65,2500)
Dataset <- data.frame(Journal_ref,Journal_type,Journal_value)


For each Journal_ref group I am seeking to filter/select rows based on the following conditions:



  • Where "Adj" is included within Journal_type, filter/select to return the last "Adj" row in the Journal_ref group, and

  • Where "Adj" is not included within Journal_type, filter/select to return the last "Rev" in the Journal_ref group

Based on the example above, the final output required would be:



Journal_ref Journal_type Journal_value
1111 Adj 90
2222 Adj 12000
3333 Rev 500
4444 Adj 2500


I have attempted using various combinations of group_by, filter, if, ifelse, grepl, select and slice with no success.



Any help would be appreciated, particularly using dplyr.










share|improve this question




























    1














    I have a dataset similar to the following:



    Journal_ref <- c("1111","2222","2222","2222","3333","3333","4444","4444")
    Journal_type <- c("Adj","Rev","Adj","Rev","Rev","Rev","Adj","Adj")
    Journal_value <- c(90,10000,12000,80,9000,500,65,2500)
    Dataset <- data.frame(Journal_ref,Journal_type,Journal_value)


    For each Journal_ref group I am seeking to filter/select rows based on the following conditions:



    • Where "Adj" is included within Journal_type, filter/select to return the last "Adj" row in the Journal_ref group, and

    • Where "Adj" is not included within Journal_type, filter/select to return the last "Rev" in the Journal_ref group

    Based on the example above, the final output required would be:



    Journal_ref Journal_type Journal_value
    1111 Adj 90
    2222 Adj 12000
    3333 Rev 500
    4444 Adj 2500


    I have attempted using various combinations of group_by, filter, if, ifelse, grepl, select and slice with no success.



    Any help would be appreciated, particularly using dplyr.










    share|improve this question


























      1












      1








      1







      I have a dataset similar to the following:



      Journal_ref <- c("1111","2222","2222","2222","3333","3333","4444","4444")
      Journal_type <- c("Adj","Rev","Adj","Rev","Rev","Rev","Adj","Adj")
      Journal_value <- c(90,10000,12000,80,9000,500,65,2500)
      Dataset <- data.frame(Journal_ref,Journal_type,Journal_value)


      For each Journal_ref group I am seeking to filter/select rows based on the following conditions:



      • Where "Adj" is included within Journal_type, filter/select to return the last "Adj" row in the Journal_ref group, and

      • Where "Adj" is not included within Journal_type, filter/select to return the last "Rev" in the Journal_ref group

      Based on the example above, the final output required would be:



      Journal_ref Journal_type Journal_value
      1111 Adj 90
      2222 Adj 12000
      3333 Rev 500
      4444 Adj 2500


      I have attempted using various combinations of group_by, filter, if, ifelse, grepl, select and slice with no success.



      Any help would be appreciated, particularly using dplyr.










      share|improve this question















      I have a dataset similar to the following:



      Journal_ref <- c("1111","2222","2222","2222","3333","3333","4444","4444")
      Journal_type <- c("Adj","Rev","Adj","Rev","Rev","Rev","Adj","Adj")
      Journal_value <- c(90,10000,12000,80,9000,500,65,2500)
      Dataset <- data.frame(Journal_ref,Journal_type,Journal_value)


      For each Journal_ref group I am seeking to filter/select rows based on the following conditions:



      • Where "Adj" is included within Journal_type, filter/select to return the last "Adj" row in the Journal_ref group, and

      • Where "Adj" is not included within Journal_type, filter/select to return the last "Rev" in the Journal_ref group

      Based on the example above, the final output required would be:



      Journal_ref Journal_type Journal_value
      1111 Adj 90
      2222 Adj 12000
      3333 Rev 500
      4444 Adj 2500


      I have attempted using various combinations of group_by, filter, if, ifelse, grepl, select and slice with no success.



      Any help would be appreciated, particularly using dplyr.







      r if-statement dplyr slice grepl






      share|improve this question















      share|improve this question













      share|improve this question




      share|improve this question








      edited Nov 11 at 11:45









      Jaap

      54.9k20117130




      54.9k20117130










      asked Nov 11 at 11:13









      ScottCR1

      83




      83






















          3 Answers
          3






          active

          oldest

          votes


















          1














          Try this:



          library(dplyr)

          Dataset %>%
          group_by(Journal_ref, Journal_type) %>%
          summarise(Journal_value = last(Journal_value)) %>%
          ungroup() %>% group_by(Journal_ref) %>%
          filter(!(n() > 1 & Journal_type == "Rev"))


          Output:



           Journal_ref Journal_type Journal_value
          <fct> <fct> <dbl>
          1 1111 Adj 90
          2 2222 Adj 12000
          3 3333 Rev 500
          4 4444 Adj 2500





          share|improve this answer
















          • 1




            Many thanks for this
            – ScottCR1
            Nov 11 at 11:54


















          3














          Another possible solution:



          Dataset %>% 
          group_by(Journal_ref) %>%
          filter(Journal_type == c("Rev","Adj")[1 + any(Journal_type == "Adj")]) %>%
          slice(n())


          which gives:




          # A tibble: 4 x 3
          # Groups: Journal_ref [4]
          Journal_ref Journal_type Journal_value
          <fct> <fct> <dbl>
          1 1111 Adj 90
          2 2222 Adj 12000
          3 3333 Rev 500
          4 4444 Adj 2500



          What this does:



          • You group by Journal_ref

          • Then you filter Journal_type for only Adj when present and Rev when there is no Adj in a group. Using c("Rev","Adj")[1 + any(Journal_type == "Adj")] gives you Adj when there is at least one present in a group and it gives Rev when there is no Adj present in a group.

          • Finally use use slice(n()) tot select the last row of each group.


          You could also do this with if_else:



          Dataset %>% 
          group_by(Journal_ref) %>%
          filter(Journal_type == if_else(any(Journal_type == "Adj"), "Adj", "Rev")) %>%
          slice(n())





          share|improve this answer


















          • 2




            This is brilliant. Many thanks for the explanation also
            – ScottCR1
            Nov 11 at 11:55


















          0














          A dplyr way of doing it is as follows.



          library(dplyr)

          Dataset %>%
          group_by(Journal_ref) %>%
          mutate(Adj = any(Journal_type == "Adj"),
          i = ifelse(Adj, last(which(Journal_type == "Adj")), last(which(Journal_type == "Rev")))) %>%
          filter(row_number() == i) %>%
          select(-Adj, -i)
          ## A tibble: 4 x 3
          ## Groups: Journal_ref [4]
          # Journal_ref Journal_type Journal_value
          # <fct> <fct> <dbl>
          #1 1111 Adj 90
          #2 2222 Adj 12000
          #3 3333 Rev 500
          #4 4444 Adj 2500





          share|improve this answer




















          • Many thanks for this
            – ScottCR1
            Nov 11 at 11:55










          Your Answer






          StackExchange.ifUsing("editor", function ()
          StackExchange.using("externalEditor", function ()
          StackExchange.using("snippets", function ()
          StackExchange.snippets.init();
          );
          );
          , "code-snippets");

          StackExchange.ready(function()
          var channelOptions =
          tags: "".split(" "),
          id: "1"
          ;
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function()
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled)
          StackExchange.using("snippets", function()
          createEditor();
          );

          else
          createEditor();

          );

          function createEditor()
          StackExchange.prepareEditor(
          heartbeatType: 'answer',
          autoActivateHeartbeat: false,
          convertImagesToLinks: true,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: 10,
          bindNavPrevention: true,
          postfix: "",
          imageUploader:
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          ,
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          );



          );













          draft saved

          draft discarded


















          StackExchange.ready(
          function ()
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53248147%2fconditional-filtering-using-grepl-and-relative-row-position-in-group%23new-answer', 'question_page');

          );

          Post as a guest















          Required, but never shown

























          3 Answers
          3






          active

          oldest

          votes








          3 Answers
          3






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          1














          Try this:



          library(dplyr)

          Dataset %>%
          group_by(Journal_ref, Journal_type) %>%
          summarise(Journal_value = last(Journal_value)) %>%
          ungroup() %>% group_by(Journal_ref) %>%
          filter(!(n() > 1 & Journal_type == "Rev"))


          Output:



           Journal_ref Journal_type Journal_value
          <fct> <fct> <dbl>
          1 1111 Adj 90
          2 2222 Adj 12000
          3 3333 Rev 500
          4 4444 Adj 2500





          share|improve this answer
















          • 1




            Many thanks for this
            – ScottCR1
            Nov 11 at 11:54















          1














          Try this:



          library(dplyr)

          Dataset %>%
          group_by(Journal_ref, Journal_type) %>%
          summarise(Journal_value = last(Journal_value)) %>%
          ungroup() %>% group_by(Journal_ref) %>%
          filter(!(n() > 1 & Journal_type == "Rev"))


          Output:



           Journal_ref Journal_type Journal_value
          <fct> <fct> <dbl>
          1 1111 Adj 90
          2 2222 Adj 12000
          3 3333 Rev 500
          4 4444 Adj 2500





          share|improve this answer
















          • 1




            Many thanks for this
            – ScottCR1
            Nov 11 at 11:54













          1












          1








          1






          Try this:



          library(dplyr)

          Dataset %>%
          group_by(Journal_ref, Journal_type) %>%
          summarise(Journal_value = last(Journal_value)) %>%
          ungroup() %>% group_by(Journal_ref) %>%
          filter(!(n() > 1 & Journal_type == "Rev"))


          Output:



           Journal_ref Journal_type Journal_value
          <fct> <fct> <dbl>
          1 1111 Adj 90
          2 2222 Adj 12000
          3 3333 Rev 500
          4 4444 Adj 2500





          share|improve this answer












          Try this:



          library(dplyr)

          Dataset %>%
          group_by(Journal_ref, Journal_type) %>%
          summarise(Journal_value = last(Journal_value)) %>%
          ungroup() %>% group_by(Journal_ref) %>%
          filter(!(n() > 1 & Journal_type == "Rev"))


          Output:



           Journal_ref Journal_type Journal_value
          <fct> <fct> <dbl>
          1 1111 Adj 90
          2 2222 Adj 12000
          3 3333 Rev 500
          4 4444 Adj 2500






          share|improve this answer












          share|improve this answer



          share|improve this answer










          answered Nov 11 at 11:29









          arg0naut

          1,975313




          1,975313







          • 1




            Many thanks for this
            – ScottCR1
            Nov 11 at 11:54












          • 1




            Many thanks for this
            – ScottCR1
            Nov 11 at 11:54







          1




          1




          Many thanks for this
          – ScottCR1
          Nov 11 at 11:54




          Many thanks for this
          – ScottCR1
          Nov 11 at 11:54













          3














          Another possible solution:



          Dataset %>% 
          group_by(Journal_ref) %>%
          filter(Journal_type == c("Rev","Adj")[1 + any(Journal_type == "Adj")]) %>%
          slice(n())


          which gives:




          # A tibble: 4 x 3
          # Groups: Journal_ref [4]
          Journal_ref Journal_type Journal_value
          <fct> <fct> <dbl>
          1 1111 Adj 90
          2 2222 Adj 12000
          3 3333 Rev 500
          4 4444 Adj 2500



          What this does:



          • You group by Journal_ref

          • Then you filter Journal_type for only Adj when present and Rev when there is no Adj in a group. Using c("Rev","Adj")[1 + any(Journal_type == "Adj")] gives you Adj when there is at least one present in a group and it gives Rev when there is no Adj present in a group.

          • Finally use use slice(n()) tot select the last row of each group.


          You could also do this with if_else:



          Dataset %>% 
          group_by(Journal_ref) %>%
          filter(Journal_type == if_else(any(Journal_type == "Adj"), "Adj", "Rev")) %>%
          slice(n())





          share|improve this answer


















          • 2




            This is brilliant. Many thanks for the explanation also
            – ScottCR1
            Nov 11 at 11:55















          3














          Another possible solution:



          Dataset %>% 
          group_by(Journal_ref) %>%
          filter(Journal_type == c("Rev","Adj")[1 + any(Journal_type == "Adj")]) %>%
          slice(n())


          which gives:




          # A tibble: 4 x 3
          # Groups: Journal_ref [4]
          Journal_ref Journal_type Journal_value
          <fct> <fct> <dbl>
          1 1111 Adj 90
          2 2222 Adj 12000
          3 3333 Rev 500
          4 4444 Adj 2500



          What this does:



          • You group by Journal_ref

          • Then you filter Journal_type for only Adj when present and Rev when there is no Adj in a group. Using c("Rev","Adj")[1 + any(Journal_type == "Adj")] gives you Adj when there is at least one present in a group and it gives Rev when there is no Adj present in a group.

          • Finally use use slice(n()) tot select the last row of each group.


          You could also do this with if_else:



          Dataset %>% 
          group_by(Journal_ref) %>%
          filter(Journal_type == if_else(any(Journal_type == "Adj"), "Adj", "Rev")) %>%
          slice(n())





          share|improve this answer


















          • 2




            This is brilliant. Many thanks for the explanation also
            – ScottCR1
            Nov 11 at 11:55













          3












          3








          3






          Another possible solution:



          Dataset %>% 
          group_by(Journal_ref) %>%
          filter(Journal_type == c("Rev","Adj")[1 + any(Journal_type == "Adj")]) %>%
          slice(n())


          which gives:




          # A tibble: 4 x 3
          # Groups: Journal_ref [4]
          Journal_ref Journal_type Journal_value
          <fct> <fct> <dbl>
          1 1111 Adj 90
          2 2222 Adj 12000
          3 3333 Rev 500
          4 4444 Adj 2500



          What this does:



          • You group by Journal_ref

          • Then you filter Journal_type for only Adj when present and Rev when there is no Adj in a group. Using c("Rev","Adj")[1 + any(Journal_type == "Adj")] gives you Adj when there is at least one present in a group and it gives Rev when there is no Adj present in a group.

          • Finally use use slice(n()) tot select the last row of each group.


          You could also do this with if_else:



          Dataset %>% 
          group_by(Journal_ref) %>%
          filter(Journal_type == if_else(any(Journal_type == "Adj"), "Adj", "Rev")) %>%
          slice(n())





          share|improve this answer














          Another possible solution:



          Dataset %>% 
          group_by(Journal_ref) %>%
          filter(Journal_type == c("Rev","Adj")[1 + any(Journal_type == "Adj")]) %>%
          slice(n())


          which gives:




          # A tibble: 4 x 3
          # Groups: Journal_ref [4]
          Journal_ref Journal_type Journal_value
          <fct> <fct> <dbl>
          1 1111 Adj 90
          2 2222 Adj 12000
          3 3333 Rev 500
          4 4444 Adj 2500



          What this does:



          • You group by Journal_ref

          • Then you filter Journal_type for only Adj when present and Rev when there is no Adj in a group. Using c("Rev","Adj")[1 + any(Journal_type == "Adj")] gives you Adj when there is at least one present in a group and it gives Rev when there is no Adj present in a group.

          • Finally use use slice(n()) tot select the last row of each group.


          You could also do this with if_else:



          Dataset %>% 
          group_by(Journal_ref) %>%
          filter(Journal_type == if_else(any(Journal_type == "Adj"), "Adj", "Rev")) %>%
          slice(n())






          share|improve this answer














          share|improve this answer



          share|improve this answer








          edited Nov 11 at 11:43

























          answered Nov 11 at 11:32









          Jaap

          54.9k20117130




          54.9k20117130







          • 2




            This is brilliant. Many thanks for the explanation also
            – ScottCR1
            Nov 11 at 11:55












          • 2




            This is brilliant. Many thanks for the explanation also
            – ScottCR1
            Nov 11 at 11:55







          2




          2




          This is brilliant. Many thanks for the explanation also
          – ScottCR1
          Nov 11 at 11:55




          This is brilliant. Many thanks for the explanation also
          – ScottCR1
          Nov 11 at 11:55











          0














          A dplyr way of doing it is as follows.



          library(dplyr)

          Dataset %>%
          group_by(Journal_ref) %>%
          mutate(Adj = any(Journal_type == "Adj"),
          i = ifelse(Adj, last(which(Journal_type == "Adj")), last(which(Journal_type == "Rev")))) %>%
          filter(row_number() == i) %>%
          select(-Adj, -i)
          ## A tibble: 4 x 3
          ## Groups: Journal_ref [4]
          # Journal_ref Journal_type Journal_value
          # <fct> <fct> <dbl>
          #1 1111 Adj 90
          #2 2222 Adj 12000
          #3 3333 Rev 500
          #4 4444 Adj 2500





          share|improve this answer




















          • Many thanks for this
            – ScottCR1
            Nov 11 at 11:55















          0














          A dplyr way of doing it is as follows.



          library(dplyr)

          Dataset %>%
          group_by(Journal_ref) %>%
          mutate(Adj = any(Journal_type == "Adj"),
          i = ifelse(Adj, last(which(Journal_type == "Adj")), last(which(Journal_type == "Rev")))) %>%
          filter(row_number() == i) %>%
          select(-Adj, -i)
          ## A tibble: 4 x 3
          ## Groups: Journal_ref [4]
          # Journal_ref Journal_type Journal_value
          # <fct> <fct> <dbl>
          #1 1111 Adj 90
          #2 2222 Adj 12000
          #3 3333 Rev 500
          #4 4444 Adj 2500





          share|improve this answer




















          • Many thanks for this
            – ScottCR1
            Nov 11 at 11:55













          0












          0








          0






          A dplyr way of doing it is as follows.



          library(dplyr)

          Dataset %>%
          group_by(Journal_ref) %>%
          mutate(Adj = any(Journal_type == "Adj"),
          i = ifelse(Adj, last(which(Journal_type == "Adj")), last(which(Journal_type == "Rev")))) %>%
          filter(row_number() == i) %>%
          select(-Adj, -i)
          ## A tibble: 4 x 3
          ## Groups: Journal_ref [4]
          # Journal_ref Journal_type Journal_value
          # <fct> <fct> <dbl>
          #1 1111 Adj 90
          #2 2222 Adj 12000
          #3 3333 Rev 500
          #4 4444 Adj 2500





          share|improve this answer












          A dplyr way of doing it is as follows.



          library(dplyr)

          Dataset %>%
          group_by(Journal_ref) %>%
          mutate(Adj = any(Journal_type == "Adj"),
          i = ifelse(Adj, last(which(Journal_type == "Adj")), last(which(Journal_type == "Rev")))) %>%
          filter(row_number() == i) %>%
          select(-Adj, -i)
          ## A tibble: 4 x 3
          ## Groups: Journal_ref [4]
          # Journal_ref Journal_type Journal_value
          # <fct> <fct> <dbl>
          #1 1111 Adj 90
          #2 2222 Adj 12000
          #3 3333 Rev 500
          #4 4444 Adj 2500






          share|improve this answer












          share|improve this answer



          share|improve this answer










          answered Nov 11 at 11:31









          Rui Barradas

          15.9k41730




          15.9k41730











          • Many thanks for this
            – ScottCR1
            Nov 11 at 11:55
















          • Many thanks for this
            – ScottCR1
            Nov 11 at 11:55















          Many thanks for this
          – ScottCR1
          Nov 11 at 11:55




          Many thanks for this
          – ScottCR1
          Nov 11 at 11:55

















          draft saved

          draft discarded
















































          Thanks for contributing an answer to Stack Overflow!


          • Please be sure to answer the question. Provide details and share your research!

          But avoid


          • Asking for help, clarification, or responding to other answers.

          • Making statements based on opinion; back them up with references or personal experience.

          To learn more, see our tips on writing great answers.





          Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


          Please pay close attention to the following guidance:


          • Please be sure to answer the question. Provide details and share your research!

          But avoid


          • Asking for help, clarification, or responding to other answers.

          • Making statements based on opinion; back them up with references or personal experience.

          To learn more, see our tips on writing great answers.




          draft saved


          draft discarded














          StackExchange.ready(
          function ()
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53248147%2fconditional-filtering-using-grepl-and-relative-row-position-in-group%23new-answer', 'question_page');

          );

          Post as a guest















          Required, but never shown





















































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown

































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown







          Popular posts from this blog

          How to how show current date and time by default on contact form 7 in WordPress without taking input from user in datetimepicker

          Syphilis

          Darth Vader #20