Get difference with closest previous row in a group which meets criterion
I'm trying, for each row, to calculate the difference with the closest previous row belonging to the same group which meets a certain criterion.
Suppose I have the following dataframe:
s <- read.table(text = "Visit_num Patient Day Admitted
1 1 2015/01/01 Yes
2 1 2015/01/10 No
3 1 2015/01/15 Yes
4 1 2015/02/10 No
5 1 2015/03/08 Yes
6 2 2015/01/01 Yes
7 2 2015/04/01 No
8 2 2015/04/10 No
9 3 2015/04/01 No
10 3 2015/04/10 No", header = T, sep = "")
For each Visit_num and for each Patient, I'd like to get the difference with the closest row for which the patient was admitted (i.e. Yes). Note column day is ordered by day, and time unit for this example is days.
Here is what I wanted my dataframe to look like:
Visit_num Patient Day Admitted Diff_days
1 1 2015/01/01 Yes NA
2 1 2015/01/10 No 9
3 1 2015/01/15 Yes 14
4 1 2015/02/10 No 26
5 1 2015/03/08 Yes 52
6 2 2015/01/01 Yes NA
7 2 2015/04/01 No 90
8 2 2015/04/10 No 99
9 3 2015/04/01 No NA
10 3 2015/04/10 No NA
Any help is appreciated.
r
add a comment |
I'm trying, for each row, to calculate the difference with the closest previous row belonging to the same group which meets a certain criterion.
Suppose I have the following dataframe:
s <- read.table(text = "Visit_num Patient Day Admitted
1 1 2015/01/01 Yes
2 1 2015/01/10 No
3 1 2015/01/15 Yes
4 1 2015/02/10 No
5 1 2015/03/08 Yes
6 2 2015/01/01 Yes
7 2 2015/04/01 No
8 2 2015/04/10 No
9 3 2015/04/01 No
10 3 2015/04/10 No", header = T, sep = "")
For each Visit_num and for each Patient, I'd like to get the difference with the closest row for which the patient was admitted (i.e. Yes). Note column day is ordered by day, and time unit for this example is days.
Here is what I wanted my dataframe to look like:
Visit_num Patient Day Admitted Diff_days
1 1 2015/01/01 Yes NA
2 1 2015/01/10 No 9
3 1 2015/01/15 Yes 14
4 1 2015/02/10 No 26
5 1 2015/03/08 Yes 52
6 2 2015/01/01 Yes NA
7 2 2015/04/01 No 90
8 2 2015/04/10 No 99
9 3 2015/04/01 No NA
10 3 2015/04/10 No NA
Any help is appreciated.
r
add a comment |
I'm trying, for each row, to calculate the difference with the closest previous row belonging to the same group which meets a certain criterion.
Suppose I have the following dataframe:
s <- read.table(text = "Visit_num Patient Day Admitted
1 1 2015/01/01 Yes
2 1 2015/01/10 No
3 1 2015/01/15 Yes
4 1 2015/02/10 No
5 1 2015/03/08 Yes
6 2 2015/01/01 Yes
7 2 2015/04/01 No
8 2 2015/04/10 No
9 3 2015/04/01 No
10 3 2015/04/10 No", header = T, sep = "")
For each Visit_num and for each Patient, I'd like to get the difference with the closest row for which the patient was admitted (i.e. Yes). Note column day is ordered by day, and time unit for this example is days.
Here is what I wanted my dataframe to look like:
Visit_num Patient Day Admitted Diff_days
1 1 2015/01/01 Yes NA
2 1 2015/01/10 No 9
3 1 2015/01/15 Yes 14
4 1 2015/02/10 No 26
5 1 2015/03/08 Yes 52
6 2 2015/01/01 Yes NA
7 2 2015/04/01 No 90
8 2 2015/04/10 No 99
9 3 2015/04/01 No NA
10 3 2015/04/10 No NA
Any help is appreciated.
r
I'm trying, for each row, to calculate the difference with the closest previous row belonging to the same group which meets a certain criterion.
Suppose I have the following dataframe:
s <- read.table(text = "Visit_num Patient Day Admitted
1 1 2015/01/01 Yes
2 1 2015/01/10 No
3 1 2015/01/15 Yes
4 1 2015/02/10 No
5 1 2015/03/08 Yes
6 2 2015/01/01 Yes
7 2 2015/04/01 No
8 2 2015/04/10 No
9 3 2015/04/01 No
10 3 2015/04/10 No", header = T, sep = "")
For each Visit_num and for each Patient, I'd like to get the difference with the closest row for which the patient was admitted (i.e. Yes). Note column day is ordered by day, and time unit for this example is days.
Here is what I wanted my dataframe to look like:
Visit_num Patient Day Admitted Diff_days
1 1 2015/01/01 Yes NA
2 1 2015/01/10 No 9
3 1 2015/01/15 Yes 14
4 1 2015/02/10 No 26
5 1 2015/03/08 Yes 52
6 2 2015/01/01 Yes NA
7 2 2015/04/01 No 90
8 2 2015/04/10 No 99
9 3 2015/04/01 No NA
10 3 2015/04/10 No NA
Any help is appreciated.
r
r
edited Nov 14 '18 at 20:34
DEYVISON MEDEIROS
asked Nov 14 '18 at 18:44
DEYVISON MEDEIROSDEYVISON MEDEIROS
92
92
add a comment |
add a comment |
1 Answer
1
active
oldest
votes
Here is an option with tidyverse
. Convert the 'Day' to Date
class, arrange
by 'Patient', 'Day', grouped by 'Patient' get the difference of adjacent 'Day', create a group 'grp' based on the occurrence of 'Yes' in 'Admitted' and take the cumulative sum of 'Diff_days'
library(tidyverse)
s %>%
mutate(Day = ymd(Day)) %>%
arrange(Patient, Day) %>%
group_by(Patient) %>%
mutate(Diff_days = c(NA, diff(Day))) %>%
group_by(grp = cumsum(lag(Admitted == "Yes", default = TRUE)), add = TRUE) %>%
mutate(Diff_days = cumsum(replace_na(Diff_days, 0))) %>%
ungroup %>%
select(-grp) %>%
mutate(Diff_days = na_if(Diff_days, 0))
# A tibble: 8 x 5
# Visit_num Patient Day Admitted Diff_days
# <int> <int> <date> <fct> <dbl>
#1 1 1 2015-01-01 Yes NA
#2 2 1 2015-01-10 No 9
#3 3 1 2015-01-15 Yes 14
#4 4 1 2015-02-10 No 26
#5 5 1 2015-03-08 Yes 52
#6 6 2 2015-01-01 Yes NA
#7 7 2 2015-04-01 No 90
#8 8 2 2015-04-10 No 99
Sorry, quick follow up. What if instead of Day as date (2015-01-01) I had a datetime object (2015-01-01 00:15)? Would "ymd" and "diff" functions work?
– DEYVISON MEDEIROS
Nov 14 '18 at 19:47
@DEYVISONMEDEIROS If it is already in Datetime, class then you can just domutate(Day = as.Date(Day)) %>%
– akrun
Nov 14 '18 at 19:48
1
Thank you very much for your answer @akrun. That worked!
– DEYVISON MEDEIROS
Nov 14 '18 at 19:58
I found only one problem with the solution: in case the patient was never admitted, I'm still getting a cumulative sum for diff_days. Any way to avoid that (I wanted it to be 0 or NA)? I have modified the example to make it more clear
– DEYVISON MEDEIROS
Nov 14 '18 at 20:36
@DEYVISONMEDEIROS You make add anif/else
condition to this, i.e.group_by(Patient) %>% mutate(Diff_days = if(all(Admitted == "No")) NA else c(NA, diff(Day)))
and make changes in the rest of the code
– akrun
Nov 15 '18 at 2:01
add a comment |
Your Answer
StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53306840%2fget-difference-with-closest-previous-row-in-a-group-which-meets-criterion%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
Here is an option with tidyverse
. Convert the 'Day' to Date
class, arrange
by 'Patient', 'Day', grouped by 'Patient' get the difference of adjacent 'Day', create a group 'grp' based on the occurrence of 'Yes' in 'Admitted' and take the cumulative sum of 'Diff_days'
library(tidyverse)
s %>%
mutate(Day = ymd(Day)) %>%
arrange(Patient, Day) %>%
group_by(Patient) %>%
mutate(Diff_days = c(NA, diff(Day))) %>%
group_by(grp = cumsum(lag(Admitted == "Yes", default = TRUE)), add = TRUE) %>%
mutate(Diff_days = cumsum(replace_na(Diff_days, 0))) %>%
ungroup %>%
select(-grp) %>%
mutate(Diff_days = na_if(Diff_days, 0))
# A tibble: 8 x 5
# Visit_num Patient Day Admitted Diff_days
# <int> <int> <date> <fct> <dbl>
#1 1 1 2015-01-01 Yes NA
#2 2 1 2015-01-10 No 9
#3 3 1 2015-01-15 Yes 14
#4 4 1 2015-02-10 No 26
#5 5 1 2015-03-08 Yes 52
#6 6 2 2015-01-01 Yes NA
#7 7 2 2015-04-01 No 90
#8 8 2 2015-04-10 No 99
Sorry, quick follow up. What if instead of Day as date (2015-01-01) I had a datetime object (2015-01-01 00:15)? Would "ymd" and "diff" functions work?
– DEYVISON MEDEIROS
Nov 14 '18 at 19:47
@DEYVISONMEDEIROS If it is already in Datetime, class then you can just domutate(Day = as.Date(Day)) %>%
– akrun
Nov 14 '18 at 19:48
1
Thank you very much for your answer @akrun. That worked!
– DEYVISON MEDEIROS
Nov 14 '18 at 19:58
I found only one problem with the solution: in case the patient was never admitted, I'm still getting a cumulative sum for diff_days. Any way to avoid that (I wanted it to be 0 or NA)? I have modified the example to make it more clear
– DEYVISON MEDEIROS
Nov 14 '18 at 20:36
@DEYVISONMEDEIROS You make add anif/else
condition to this, i.e.group_by(Patient) %>% mutate(Diff_days = if(all(Admitted == "No")) NA else c(NA, diff(Day)))
and make changes in the rest of the code
– akrun
Nov 15 '18 at 2:01
add a comment |
Here is an option with tidyverse
. Convert the 'Day' to Date
class, arrange
by 'Patient', 'Day', grouped by 'Patient' get the difference of adjacent 'Day', create a group 'grp' based on the occurrence of 'Yes' in 'Admitted' and take the cumulative sum of 'Diff_days'
library(tidyverse)
s %>%
mutate(Day = ymd(Day)) %>%
arrange(Patient, Day) %>%
group_by(Patient) %>%
mutate(Diff_days = c(NA, diff(Day))) %>%
group_by(grp = cumsum(lag(Admitted == "Yes", default = TRUE)), add = TRUE) %>%
mutate(Diff_days = cumsum(replace_na(Diff_days, 0))) %>%
ungroup %>%
select(-grp) %>%
mutate(Diff_days = na_if(Diff_days, 0))
# A tibble: 8 x 5
# Visit_num Patient Day Admitted Diff_days
# <int> <int> <date> <fct> <dbl>
#1 1 1 2015-01-01 Yes NA
#2 2 1 2015-01-10 No 9
#3 3 1 2015-01-15 Yes 14
#4 4 1 2015-02-10 No 26
#5 5 1 2015-03-08 Yes 52
#6 6 2 2015-01-01 Yes NA
#7 7 2 2015-04-01 No 90
#8 8 2 2015-04-10 No 99
Sorry, quick follow up. What if instead of Day as date (2015-01-01) I had a datetime object (2015-01-01 00:15)? Would "ymd" and "diff" functions work?
– DEYVISON MEDEIROS
Nov 14 '18 at 19:47
@DEYVISONMEDEIROS If it is already in Datetime, class then you can just domutate(Day = as.Date(Day)) %>%
– akrun
Nov 14 '18 at 19:48
1
Thank you very much for your answer @akrun. That worked!
– DEYVISON MEDEIROS
Nov 14 '18 at 19:58
I found only one problem with the solution: in case the patient was never admitted, I'm still getting a cumulative sum for diff_days. Any way to avoid that (I wanted it to be 0 or NA)? I have modified the example to make it more clear
– DEYVISON MEDEIROS
Nov 14 '18 at 20:36
@DEYVISONMEDEIROS You make add anif/else
condition to this, i.e.group_by(Patient) %>% mutate(Diff_days = if(all(Admitted == "No")) NA else c(NA, diff(Day)))
and make changes in the rest of the code
– akrun
Nov 15 '18 at 2:01
add a comment |
Here is an option with tidyverse
. Convert the 'Day' to Date
class, arrange
by 'Patient', 'Day', grouped by 'Patient' get the difference of adjacent 'Day', create a group 'grp' based on the occurrence of 'Yes' in 'Admitted' and take the cumulative sum of 'Diff_days'
library(tidyverse)
s %>%
mutate(Day = ymd(Day)) %>%
arrange(Patient, Day) %>%
group_by(Patient) %>%
mutate(Diff_days = c(NA, diff(Day))) %>%
group_by(grp = cumsum(lag(Admitted == "Yes", default = TRUE)), add = TRUE) %>%
mutate(Diff_days = cumsum(replace_na(Diff_days, 0))) %>%
ungroup %>%
select(-grp) %>%
mutate(Diff_days = na_if(Diff_days, 0))
# A tibble: 8 x 5
# Visit_num Patient Day Admitted Diff_days
# <int> <int> <date> <fct> <dbl>
#1 1 1 2015-01-01 Yes NA
#2 2 1 2015-01-10 No 9
#3 3 1 2015-01-15 Yes 14
#4 4 1 2015-02-10 No 26
#5 5 1 2015-03-08 Yes 52
#6 6 2 2015-01-01 Yes NA
#7 7 2 2015-04-01 No 90
#8 8 2 2015-04-10 No 99
Here is an option with tidyverse
. Convert the 'Day' to Date
class, arrange
by 'Patient', 'Day', grouped by 'Patient' get the difference of adjacent 'Day', create a group 'grp' based on the occurrence of 'Yes' in 'Admitted' and take the cumulative sum of 'Diff_days'
library(tidyverse)
s %>%
mutate(Day = ymd(Day)) %>%
arrange(Patient, Day) %>%
group_by(Patient) %>%
mutate(Diff_days = c(NA, diff(Day))) %>%
group_by(grp = cumsum(lag(Admitted == "Yes", default = TRUE)), add = TRUE) %>%
mutate(Diff_days = cumsum(replace_na(Diff_days, 0))) %>%
ungroup %>%
select(-grp) %>%
mutate(Diff_days = na_if(Diff_days, 0))
# A tibble: 8 x 5
# Visit_num Patient Day Admitted Diff_days
# <int> <int> <date> <fct> <dbl>
#1 1 1 2015-01-01 Yes NA
#2 2 1 2015-01-10 No 9
#3 3 1 2015-01-15 Yes 14
#4 4 1 2015-02-10 No 26
#5 5 1 2015-03-08 Yes 52
#6 6 2 2015-01-01 Yes NA
#7 7 2 2015-04-01 No 90
#8 8 2 2015-04-10 No 99
edited Nov 14 '18 at 19:30
answered Nov 14 '18 at 19:17
akrunakrun
415k13204278
415k13204278
Sorry, quick follow up. What if instead of Day as date (2015-01-01) I had a datetime object (2015-01-01 00:15)? Would "ymd" and "diff" functions work?
– DEYVISON MEDEIROS
Nov 14 '18 at 19:47
@DEYVISONMEDEIROS If it is already in Datetime, class then you can just domutate(Day = as.Date(Day)) %>%
– akrun
Nov 14 '18 at 19:48
1
Thank you very much for your answer @akrun. That worked!
– DEYVISON MEDEIROS
Nov 14 '18 at 19:58
I found only one problem with the solution: in case the patient was never admitted, I'm still getting a cumulative sum for diff_days. Any way to avoid that (I wanted it to be 0 or NA)? I have modified the example to make it more clear
– DEYVISON MEDEIROS
Nov 14 '18 at 20:36
@DEYVISONMEDEIROS You make add anif/else
condition to this, i.e.group_by(Patient) %>% mutate(Diff_days = if(all(Admitted == "No")) NA else c(NA, diff(Day)))
and make changes in the rest of the code
– akrun
Nov 15 '18 at 2:01
add a comment |
Sorry, quick follow up. What if instead of Day as date (2015-01-01) I had a datetime object (2015-01-01 00:15)? Would "ymd" and "diff" functions work?
– DEYVISON MEDEIROS
Nov 14 '18 at 19:47
@DEYVISONMEDEIROS If it is already in Datetime, class then you can just domutate(Day = as.Date(Day)) %>%
– akrun
Nov 14 '18 at 19:48
1
Thank you very much for your answer @akrun. That worked!
– DEYVISON MEDEIROS
Nov 14 '18 at 19:58
I found only one problem with the solution: in case the patient was never admitted, I'm still getting a cumulative sum for diff_days. Any way to avoid that (I wanted it to be 0 or NA)? I have modified the example to make it more clear
– DEYVISON MEDEIROS
Nov 14 '18 at 20:36
@DEYVISONMEDEIROS You make add anif/else
condition to this, i.e.group_by(Patient) %>% mutate(Diff_days = if(all(Admitted == "No")) NA else c(NA, diff(Day)))
and make changes in the rest of the code
– akrun
Nov 15 '18 at 2:01
Sorry, quick follow up. What if instead of Day as date (2015-01-01) I had a datetime object (2015-01-01 00:15)? Would "ymd" and "diff" functions work?
– DEYVISON MEDEIROS
Nov 14 '18 at 19:47
Sorry, quick follow up. What if instead of Day as date (2015-01-01) I had a datetime object (2015-01-01 00:15)? Would "ymd" and "diff" functions work?
– DEYVISON MEDEIROS
Nov 14 '18 at 19:47
@DEYVISONMEDEIROS If it is already in Datetime, class then you can just do
mutate(Day = as.Date(Day)) %>%
– akrun
Nov 14 '18 at 19:48
@DEYVISONMEDEIROS If it is already in Datetime, class then you can just do
mutate(Day = as.Date(Day)) %>%
– akrun
Nov 14 '18 at 19:48
1
1
Thank you very much for your answer @akrun. That worked!
– DEYVISON MEDEIROS
Nov 14 '18 at 19:58
Thank you very much for your answer @akrun. That worked!
– DEYVISON MEDEIROS
Nov 14 '18 at 19:58
I found only one problem with the solution: in case the patient was never admitted, I'm still getting a cumulative sum for diff_days. Any way to avoid that (I wanted it to be 0 or NA)? I have modified the example to make it more clear
– DEYVISON MEDEIROS
Nov 14 '18 at 20:36
I found only one problem with the solution: in case the patient was never admitted, I'm still getting a cumulative sum for diff_days. Any way to avoid that (I wanted it to be 0 or NA)? I have modified the example to make it more clear
– DEYVISON MEDEIROS
Nov 14 '18 at 20:36
@DEYVISONMEDEIROS You make add an
if/else
condition to this, i.e. group_by(Patient) %>% mutate(Diff_days = if(all(Admitted == "No")) NA else c(NA, diff(Day)))
and make changes in the rest of the code– akrun
Nov 15 '18 at 2:01
@DEYVISONMEDEIROS You make add an
if/else
condition to this, i.e. group_by(Patient) %>% mutate(Diff_days = if(all(Admitted == "No")) NA else c(NA, diff(Day)))
and make changes in the rest of the code– akrun
Nov 15 '18 at 2:01
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53306840%2fget-difference-with-closest-previous-row-in-a-group-which-meets-criterion%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown