Conditional filtering using grepl and relative row position in group
I have a dataset similar to the following:
Journal_ref <- c("1111","2222","2222","2222","3333","3333","4444","4444")
Journal_type <- c("Adj","Rev","Adj","Rev","Rev","Rev","Adj","Adj")
Journal_value <- c(90,10000,12000,80,9000,500,65,2500)
Dataset <- data.frame(Journal_ref,Journal_type,Journal_value)
For each Journal_ref
group I am seeking to filter/select rows based on the following conditions:
- Where "Adj" is included within
Journal_type
, filter/select to return the last "Adj" row in theJournal_ref
group, and - Where "Adj" is not included within
Journal_type
, filter/select to return the last "Rev" in theJournal_ref
group
Based on the example above, the final output required would be:
Journal_ref Journal_type Journal_value
1111 Adj 90
2222 Adj 12000
3333 Rev 500
4444 Adj 2500
I have attempted using various combinations of group_by
, filter
, if
, ifelse
, grepl
, select
and slice
with no success.
Any help would be appreciated, particularly using dplyr
.
r if-statement dplyr slice grepl
add a comment |
I have a dataset similar to the following:
Journal_ref <- c("1111","2222","2222","2222","3333","3333","4444","4444")
Journal_type <- c("Adj","Rev","Adj","Rev","Rev","Rev","Adj","Adj")
Journal_value <- c(90,10000,12000,80,9000,500,65,2500)
Dataset <- data.frame(Journal_ref,Journal_type,Journal_value)
For each Journal_ref
group I am seeking to filter/select rows based on the following conditions:
- Where "Adj" is included within
Journal_type
, filter/select to return the last "Adj" row in theJournal_ref
group, and - Where "Adj" is not included within
Journal_type
, filter/select to return the last "Rev" in theJournal_ref
group
Based on the example above, the final output required would be:
Journal_ref Journal_type Journal_value
1111 Adj 90
2222 Adj 12000
3333 Rev 500
4444 Adj 2500
I have attempted using various combinations of group_by
, filter
, if
, ifelse
, grepl
, select
and slice
with no success.
Any help would be appreciated, particularly using dplyr
.
r if-statement dplyr slice grepl
add a comment |
I have a dataset similar to the following:
Journal_ref <- c("1111","2222","2222","2222","3333","3333","4444","4444")
Journal_type <- c("Adj","Rev","Adj","Rev","Rev","Rev","Adj","Adj")
Journal_value <- c(90,10000,12000,80,9000,500,65,2500)
Dataset <- data.frame(Journal_ref,Journal_type,Journal_value)
For each Journal_ref
group I am seeking to filter/select rows based on the following conditions:
- Where "Adj" is included within
Journal_type
, filter/select to return the last "Adj" row in theJournal_ref
group, and - Where "Adj" is not included within
Journal_type
, filter/select to return the last "Rev" in theJournal_ref
group
Based on the example above, the final output required would be:
Journal_ref Journal_type Journal_value
1111 Adj 90
2222 Adj 12000
3333 Rev 500
4444 Adj 2500
I have attempted using various combinations of group_by
, filter
, if
, ifelse
, grepl
, select
and slice
with no success.
Any help would be appreciated, particularly using dplyr
.
r if-statement dplyr slice grepl
I have a dataset similar to the following:
Journal_ref <- c("1111","2222","2222","2222","3333","3333","4444","4444")
Journal_type <- c("Adj","Rev","Adj","Rev","Rev","Rev","Adj","Adj")
Journal_value <- c(90,10000,12000,80,9000,500,65,2500)
Dataset <- data.frame(Journal_ref,Journal_type,Journal_value)
For each Journal_ref
group I am seeking to filter/select rows based on the following conditions:
- Where "Adj" is included within
Journal_type
, filter/select to return the last "Adj" row in theJournal_ref
group, and - Where "Adj" is not included within
Journal_type
, filter/select to return the last "Rev" in theJournal_ref
group
Based on the example above, the final output required would be:
Journal_ref Journal_type Journal_value
1111 Adj 90
2222 Adj 12000
3333 Rev 500
4444 Adj 2500
I have attempted using various combinations of group_by
, filter
, if
, ifelse
, grepl
, select
and slice
with no success.
Any help would be appreciated, particularly using dplyr
.
r if-statement dplyr slice grepl
r if-statement dplyr slice grepl
edited Nov 11 at 11:45
Jaap
54.9k20117130
54.9k20117130
asked Nov 11 at 11:13
ScottCR1
83
83
add a comment |
add a comment |
3 Answers
3
active
oldest
votes
Try this:
library(dplyr)
Dataset %>%
group_by(Journal_ref, Journal_type) %>%
summarise(Journal_value = last(Journal_value)) %>%
ungroup() %>% group_by(Journal_ref) %>%
filter(!(n() > 1 & Journal_type == "Rev"))
Output:
Journal_ref Journal_type Journal_value
<fct> <fct> <dbl>
1 1111 Adj 90
2 2222 Adj 12000
3 3333 Rev 500
4 4444 Adj 2500
1
Many thanks for this
– ScottCR1
Nov 11 at 11:54
add a comment |
Another possible solution:
Dataset %>%
group_by(Journal_ref) %>%
filter(Journal_type == c("Rev","Adj")[1 + any(Journal_type == "Adj")]) %>%
slice(n())
which gives:
# A tibble: 4 x 3
# Groups: Journal_ref [4]
Journal_ref Journal_type Journal_value
<fct> <fct> <dbl>
1 1111 Adj 90
2 2222 Adj 12000
3 3333 Rev 500
4 4444 Adj 2500
What this does:
- You group by
Journal_ref
- Then you filter
Journal_type
for onlyAdj
when present andRev
when there is noAdj
in a group. Usingc("Rev","Adj")[1 + any(Journal_type == "Adj")]
gives youAdj
when there is at least one present in a group and it givesRev
when there is noAdj
present in a group. - Finally use use
slice(n())
tot select the last row of each group.
You could also do this with if_else
:
Dataset %>%
group_by(Journal_ref) %>%
filter(Journal_type == if_else(any(Journal_type == "Adj"), "Adj", "Rev")) %>%
slice(n())
2
This is brilliant. Many thanks for the explanation also
– ScottCR1
Nov 11 at 11:55
add a comment |
A dplyr
way of doing it is as follows.
library(dplyr)
Dataset %>%
group_by(Journal_ref) %>%
mutate(Adj = any(Journal_type == "Adj"),
i = ifelse(Adj, last(which(Journal_type == "Adj")), last(which(Journal_type == "Rev")))) %>%
filter(row_number() == i) %>%
select(-Adj, -i)
## A tibble: 4 x 3
## Groups: Journal_ref [4]
# Journal_ref Journal_type Journal_value
# <fct> <fct> <dbl>
#1 1111 Adj 90
#2 2222 Adj 12000
#3 3333 Rev 500
#4 4444 Adj 2500
Many thanks for this
– ScottCR1
Nov 11 at 11:55
add a comment |
Your Answer
StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53248147%2fconditional-filtering-using-grepl-and-relative-row-position-in-group%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
Try this:
library(dplyr)
Dataset %>%
group_by(Journal_ref, Journal_type) %>%
summarise(Journal_value = last(Journal_value)) %>%
ungroup() %>% group_by(Journal_ref) %>%
filter(!(n() > 1 & Journal_type == "Rev"))
Output:
Journal_ref Journal_type Journal_value
<fct> <fct> <dbl>
1 1111 Adj 90
2 2222 Adj 12000
3 3333 Rev 500
4 4444 Adj 2500
1
Many thanks for this
– ScottCR1
Nov 11 at 11:54
add a comment |
Try this:
library(dplyr)
Dataset %>%
group_by(Journal_ref, Journal_type) %>%
summarise(Journal_value = last(Journal_value)) %>%
ungroup() %>% group_by(Journal_ref) %>%
filter(!(n() > 1 & Journal_type == "Rev"))
Output:
Journal_ref Journal_type Journal_value
<fct> <fct> <dbl>
1 1111 Adj 90
2 2222 Adj 12000
3 3333 Rev 500
4 4444 Adj 2500
1
Many thanks for this
– ScottCR1
Nov 11 at 11:54
add a comment |
Try this:
library(dplyr)
Dataset %>%
group_by(Journal_ref, Journal_type) %>%
summarise(Journal_value = last(Journal_value)) %>%
ungroup() %>% group_by(Journal_ref) %>%
filter(!(n() > 1 & Journal_type == "Rev"))
Output:
Journal_ref Journal_type Journal_value
<fct> <fct> <dbl>
1 1111 Adj 90
2 2222 Adj 12000
3 3333 Rev 500
4 4444 Adj 2500
Try this:
library(dplyr)
Dataset %>%
group_by(Journal_ref, Journal_type) %>%
summarise(Journal_value = last(Journal_value)) %>%
ungroup() %>% group_by(Journal_ref) %>%
filter(!(n() > 1 & Journal_type == "Rev"))
Output:
Journal_ref Journal_type Journal_value
<fct> <fct> <dbl>
1 1111 Adj 90
2 2222 Adj 12000
3 3333 Rev 500
4 4444 Adj 2500
answered Nov 11 at 11:29
arg0naut
1,975313
1,975313
1
Many thanks for this
– ScottCR1
Nov 11 at 11:54
add a comment |
1
Many thanks for this
– ScottCR1
Nov 11 at 11:54
1
1
Many thanks for this
– ScottCR1
Nov 11 at 11:54
Many thanks for this
– ScottCR1
Nov 11 at 11:54
add a comment |
Another possible solution:
Dataset %>%
group_by(Journal_ref) %>%
filter(Journal_type == c("Rev","Adj")[1 + any(Journal_type == "Adj")]) %>%
slice(n())
which gives:
# A tibble: 4 x 3
# Groups: Journal_ref [4]
Journal_ref Journal_type Journal_value
<fct> <fct> <dbl>
1 1111 Adj 90
2 2222 Adj 12000
3 3333 Rev 500
4 4444 Adj 2500
What this does:
- You group by
Journal_ref
- Then you filter
Journal_type
for onlyAdj
when present andRev
when there is noAdj
in a group. Usingc("Rev","Adj")[1 + any(Journal_type == "Adj")]
gives youAdj
when there is at least one present in a group and it givesRev
when there is noAdj
present in a group. - Finally use use
slice(n())
tot select the last row of each group.
You could also do this with if_else
:
Dataset %>%
group_by(Journal_ref) %>%
filter(Journal_type == if_else(any(Journal_type == "Adj"), "Adj", "Rev")) %>%
slice(n())
2
This is brilliant. Many thanks for the explanation also
– ScottCR1
Nov 11 at 11:55
add a comment |
Another possible solution:
Dataset %>%
group_by(Journal_ref) %>%
filter(Journal_type == c("Rev","Adj")[1 + any(Journal_type == "Adj")]) %>%
slice(n())
which gives:
# A tibble: 4 x 3
# Groups: Journal_ref [4]
Journal_ref Journal_type Journal_value
<fct> <fct> <dbl>
1 1111 Adj 90
2 2222 Adj 12000
3 3333 Rev 500
4 4444 Adj 2500
What this does:
- You group by
Journal_ref
- Then you filter
Journal_type
for onlyAdj
when present andRev
when there is noAdj
in a group. Usingc("Rev","Adj")[1 + any(Journal_type == "Adj")]
gives youAdj
when there is at least one present in a group and it givesRev
when there is noAdj
present in a group. - Finally use use
slice(n())
tot select the last row of each group.
You could also do this with if_else
:
Dataset %>%
group_by(Journal_ref) %>%
filter(Journal_type == if_else(any(Journal_type == "Adj"), "Adj", "Rev")) %>%
slice(n())
2
This is brilliant. Many thanks for the explanation also
– ScottCR1
Nov 11 at 11:55
add a comment |
Another possible solution:
Dataset %>%
group_by(Journal_ref) %>%
filter(Journal_type == c("Rev","Adj")[1 + any(Journal_type == "Adj")]) %>%
slice(n())
which gives:
# A tibble: 4 x 3
# Groups: Journal_ref [4]
Journal_ref Journal_type Journal_value
<fct> <fct> <dbl>
1 1111 Adj 90
2 2222 Adj 12000
3 3333 Rev 500
4 4444 Adj 2500
What this does:
- You group by
Journal_ref
- Then you filter
Journal_type
for onlyAdj
when present andRev
when there is noAdj
in a group. Usingc("Rev","Adj")[1 + any(Journal_type == "Adj")]
gives youAdj
when there is at least one present in a group and it givesRev
when there is noAdj
present in a group. - Finally use use
slice(n())
tot select the last row of each group.
You could also do this with if_else
:
Dataset %>%
group_by(Journal_ref) %>%
filter(Journal_type == if_else(any(Journal_type == "Adj"), "Adj", "Rev")) %>%
slice(n())
Another possible solution:
Dataset %>%
group_by(Journal_ref) %>%
filter(Journal_type == c("Rev","Adj")[1 + any(Journal_type == "Adj")]) %>%
slice(n())
which gives:
# A tibble: 4 x 3
# Groups: Journal_ref [4]
Journal_ref Journal_type Journal_value
<fct> <fct> <dbl>
1 1111 Adj 90
2 2222 Adj 12000
3 3333 Rev 500
4 4444 Adj 2500
What this does:
- You group by
Journal_ref
- Then you filter
Journal_type
for onlyAdj
when present andRev
when there is noAdj
in a group. Usingc("Rev","Adj")[1 + any(Journal_type == "Adj")]
gives youAdj
when there is at least one present in a group and it givesRev
when there is noAdj
present in a group. - Finally use use
slice(n())
tot select the last row of each group.
You could also do this with if_else
:
Dataset %>%
group_by(Journal_ref) %>%
filter(Journal_type == if_else(any(Journal_type == "Adj"), "Adj", "Rev")) %>%
slice(n())
edited Nov 11 at 11:43
answered Nov 11 at 11:32
Jaap
54.9k20117130
54.9k20117130
2
This is brilliant. Many thanks for the explanation also
– ScottCR1
Nov 11 at 11:55
add a comment |
2
This is brilliant. Many thanks for the explanation also
– ScottCR1
Nov 11 at 11:55
2
2
This is brilliant. Many thanks for the explanation also
– ScottCR1
Nov 11 at 11:55
This is brilliant. Many thanks for the explanation also
– ScottCR1
Nov 11 at 11:55
add a comment |
A dplyr
way of doing it is as follows.
library(dplyr)
Dataset %>%
group_by(Journal_ref) %>%
mutate(Adj = any(Journal_type == "Adj"),
i = ifelse(Adj, last(which(Journal_type == "Adj")), last(which(Journal_type == "Rev")))) %>%
filter(row_number() == i) %>%
select(-Adj, -i)
## A tibble: 4 x 3
## Groups: Journal_ref [4]
# Journal_ref Journal_type Journal_value
# <fct> <fct> <dbl>
#1 1111 Adj 90
#2 2222 Adj 12000
#3 3333 Rev 500
#4 4444 Adj 2500
Many thanks for this
– ScottCR1
Nov 11 at 11:55
add a comment |
A dplyr
way of doing it is as follows.
library(dplyr)
Dataset %>%
group_by(Journal_ref) %>%
mutate(Adj = any(Journal_type == "Adj"),
i = ifelse(Adj, last(which(Journal_type == "Adj")), last(which(Journal_type == "Rev")))) %>%
filter(row_number() == i) %>%
select(-Adj, -i)
## A tibble: 4 x 3
## Groups: Journal_ref [4]
# Journal_ref Journal_type Journal_value
# <fct> <fct> <dbl>
#1 1111 Adj 90
#2 2222 Adj 12000
#3 3333 Rev 500
#4 4444 Adj 2500
Many thanks for this
– ScottCR1
Nov 11 at 11:55
add a comment |
A dplyr
way of doing it is as follows.
library(dplyr)
Dataset %>%
group_by(Journal_ref) %>%
mutate(Adj = any(Journal_type == "Adj"),
i = ifelse(Adj, last(which(Journal_type == "Adj")), last(which(Journal_type == "Rev")))) %>%
filter(row_number() == i) %>%
select(-Adj, -i)
## A tibble: 4 x 3
## Groups: Journal_ref [4]
# Journal_ref Journal_type Journal_value
# <fct> <fct> <dbl>
#1 1111 Adj 90
#2 2222 Adj 12000
#3 3333 Rev 500
#4 4444 Adj 2500
A dplyr
way of doing it is as follows.
library(dplyr)
Dataset %>%
group_by(Journal_ref) %>%
mutate(Adj = any(Journal_type == "Adj"),
i = ifelse(Adj, last(which(Journal_type == "Adj")), last(which(Journal_type == "Rev")))) %>%
filter(row_number() == i) %>%
select(-Adj, -i)
## A tibble: 4 x 3
## Groups: Journal_ref [4]
# Journal_ref Journal_type Journal_value
# <fct> <fct> <dbl>
#1 1111 Adj 90
#2 2222 Adj 12000
#3 3333 Rev 500
#4 4444 Adj 2500
answered Nov 11 at 11:31
Rui Barradas
15.9k41730
15.9k41730
Many thanks for this
– ScottCR1
Nov 11 at 11:55
add a comment |
Many thanks for this
– ScottCR1
Nov 11 at 11:55
Many thanks for this
– ScottCR1
Nov 11 at 11:55
Many thanks for this
– ScottCR1
Nov 11 at 11:55
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Some of your past answers have not been well-received, and you're in danger of being blocked from answering.
Please pay close attention to the following guidance:
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53248147%2fconditional-filtering-using-grepl-and-relative-row-position-in-group%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown