R: calculating a rolling average with a window based on value (not number of rows or a date/time variable)
I'm quite new to the packages meant for calculating rolling averages in R, and I hope you can point me in the right direction.
I have the following data as an example:
ms <- c(300, 300, 300, 301, 303, 305, 305, 306, 308, 310, 310, 311, 312,
314, 315, 315, 316, 316, 316, 317, 318, 320, 320, 321, 322, 324,
328, 329, 330, 330, 330, 332, 332, 334, 334, 335, 335, 336, 336,
337, 338, 338, 338, 340, 340, 341, 342, 342, 342, 342)
correct <- c(1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0,
1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1,
1, 0, 0, 1, 0, 0, 1, 1, 0, 0)
library(dplyr)
df <- data.frame(ms, correct)
"ms" are time points in milliseconds and "correct" indicates whether a specific action was performed correctly (1 = correct, 0 = not correct).
My goal is to calculate the percentage correct (or average) over windows of a set number of milliseconds. As you can see, certain time points are missing and certain time points occur multiple times. I therefore do not want to filter based on row number. I've looked into some packages such as "tidyquant", but it seems to me that these kinds of packages need a time/date variable instead of a numerical variable to determine the window over which values are averaged. Is there a way to specify the window on the numerical value of df$ms?
Many thanks!
r filter smoothing rolling-computation rolling-average
edited Feb 8 at 7:25 by zx8754
asked Nov 13 '18 at 20:51 by RmyjuloR
3 Answers
Try out:
library(dplyr)
# count the number of observations per ms value
df <- df %>%
  group_by(ms) %>%
  mutate(Nb.values = n()) %>%
  ungroup()
# bin at 1 ms resolution and count the correct responses per ms value
df2 <- setNames(aggregate(correct ~ factor(df$ms, levels = as.character(seq(min(df$ms), max(df$ms), 1))),
                          df, sum),
                c("ms", "Count.correct"))
# complete the data frame (including ms values with no observations)
df2 <- tidyr::complete(df2, ms)
df2$ms <- as.numeric(levels(df2$ms))[df2$ms]
df2 <- df2 %>% left_join(distinct(df[, c(1, 3)]), "ms")
# rolling proportion correct over a 5 ms window:
# (sum of correct responses) / (number of observations) in each window
df2 %>%
  mutate(Window = paste(ms, ms + 4, sep = "-"),  # label each window
         Rolling.correct = zoo::rollapply(Count.correct, 5, sum, na.rm = TRUE,
                                          partial = TRUE, fill = NA, align = "left") /
           zoo::rollapply(Nb.values, 5, sum, na.rm = TRUE, partial = TRUE,
                          fill = NA, align = "left"))
# A tibble: 43 x 5
ms Count.correct Nb.values Window Rolling.correct
<dbl> <dbl> <int> <chr> <dbl>
1 300 2 3 300-304 0.40
2 301 0 1 301-305 0.00
3 302 NA NA 302-306 0.25
4 303 0 1 303-307 0.25
5 304 NA NA 304-308 0.25
6 305 0 2 305-309 0.25
7 306 1 1 306-310 0.25
8 307 NA NA 307-311 0.00
9 308 0 1 308-312 0.20
10 309 NA NA 309-313 0.25
# ... with 33 more rows
This looks neat! Is this also possible with a sliding window? So, windows that go 300-304, 301-305, 302-306 etc?
– RmyjuloR
Nov 13 '18 at 22:21
Hmm, in this case it would be relevant to start with one value per ms and then use the window when computing the average. I edited my answer.
– ANG
Nov 13 '18 at 23:43
This is coming close to what I want, but if you look at the original data, the values for the first 3 windows should be: 300-304 --> 2/5 values = 0.4; 301-305 --> 0/4 values = 0; 302-306 --> 1/4 values = 0.25
– RmyjuloR
Nov 14 '18 at 1:08
Ah ok, this means that we have to also consider the number of values in each window. See my edit
– ANG
Nov 14 '18 at 10:49
Thank you for doing the math! I guess my brain was a bit cooked after exploring all the different packages :$
– RmyjuloR
Nov 14 '18 at 13:42
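The window values quoted in the comments above can be checked directly against the example data. The following is a minimal sketch, not part of any answer; the helper name `window_prop` is made up for illustration, and both window ends are assumed to be inclusive.

```r
# Example data from the question
ms <- c(300, 300, 300, 301, 303, 305, 305, 306, 308, 310, 310, 311, 312,
        314, 315, 315, 316, 316, 316, 317, 318, 320, 320, 321, 322, 324,
        328, 329, 330, 330, 330, 332, 332, 334, 334, 335, 335, 336, 336,
        337, 338, 338, 338, 340, 340, 341, 342, 342, 342, 342)
correct <- c(1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0,
             1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1,
             1, 0, 0, 1, 0, 0, 1, 1, 0, 0)

# Hypothetical helper: proportion correct among all observations
# with lo <= ms <= hi (both ends inclusive)
window_prop <- function(lo, hi) mean(correct[ms >= lo & ms <= hi])

window_prop(300, 304)  # 2 correct out of 5 observations
window_prop(301, 305)  # 0 correct out of 4 observations
window_prop(302, 306)  # 1 correct out of 4 observations
```

Because the selection is made on the values of ms rather than on row positions, duplicated and missing time points are handled without any extra bookkeeping.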
You can try 'cut'. For example, if you want to divide ms such that you have 5 groups overall then you can do:
df$ms_factor <- cut(df$ms, 5)
df_new <- df %>% group_by(ms_factor) %>% summarise(mean = mean(correct))
I'd actually like a rolling average for a predefined window. For example, a window of 5 ms: an average for the windows 300-304, 301-305, 302-306, etc., running until the max value of ms.
– RmyjuloR
Nov 13 '18 at 21:53
In that case you can try something like this: df$ms_factor <- cut(df$ms, seq(300, 345, by = 5))
– pooja p
Nov 13 '18 at 22:26
This gives me the windows 300-305, 305-310, 310-315, etc., right? Could the code be altered to calculate over a sliding window instead, so: 300-304, 301-305, 302-306, etc.?
– RmyjuloR
Nov 14 '18 at 1:34
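To make the difference between fixed bins and a sliding window concrete, here is a small sketch of the `cut` approach with explicit breaks. It is an illustration under assumptions, not the answerer's exact code: `right = FALSE` is added so that the bins are [300,305), [305,310), and so on. With the default `right = TRUE` the first bin would be (300,305], and the observations at exactly 300 would fall outside every bin and become NA. The names `ms_bin` and `bin_means` are made up for this example.

```r
# Example data from the question
ms <- c(300, 300, 300, 301, 303, 305, 305, 306, 308, 310, 310, 311, 312,
        314, 315, 315, 316, 316, 316, 317, 318, 320, 320, 321, 322, 324,
        328, 329, 330, 330, 330, 332, 332, 334, 334, 335, 335, 336, 336,
        337, 338, 338, 338, 340, 340, 341, 342, 342, 342, 342)
correct <- c(1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0,
             1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1,
             1, 0, 0, 1, 0, 0, 1, 1, 0, 0)
df <- data.frame(ms, correct)

# Left-closed bins of width 5: each observation lands in exactly one bin
df$ms_bin <- cut(df$ms, breaks = seq(300, 345, by = 5), right = FALSE)

# Mean of `correct` within each bin (one row per non-overlapping bin)
bin_means <- aggregate(correct ~ ms_bin, data = df, FUN = mean)
bin_means
```

Each observation contributes to exactly one bin here, so this gives non-overlapping windows rather than the overlapping 300-304, 301-305, 302-306 windows asked for.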
This could be done with base R:
calculate_irregular_ratio <- function(df, time_var = "ms", window_var = 5, calc_var = "correct")
sapply(df[[time_var]], function(x) round(mean(df[[calc_var]][df[[time_var]] >= (x - window_var) & df[[time_var]] <= x]), 2))
You can apply it as follows (the default window is 5 ms; you can change it via the window_var parameter):
df$window_5_ratio <- calculate_irregular_ratio(df, window_var = 5)
In your case, you would get (first 10 rows shown only):
ms correct window_5_ratio
1 300 1 0.67
2 300 1 0.67
3 300 0 0.67
4 301 0 0.50
5 303 0 0.40
6 305 0 0.29
7 305 0 0.29
8 306 1 0.20
9 308 0 0.20
10 310 0 0.17
It behaves like a rolling mean, but it does not rely on rows. Instead, it defines the window based on the values in a column.
For instance, at rows 6 and 7, it takes the value of the current row (305 ms) and calculates the ratio over all rows whose ms lies between 305 - 5 = 300 and 305 (both inclusive), yielding 2/7 ≈ 0.29.
You can of course modify the function yourself: e.g. if you'd like a window of 5 to actually mean 301-305 rather than 300-305, add + 1 after x - window_var.
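As one possible variation on the idea above, the sketch below evaluates a forward-looking window (start to start + 4) at every integer millisecond between the smallest and largest ms, which matches the 300-304, 301-305, 302-306 windows requested in the question. It is not the answerer's function: the name `rolling_prop_forward`, its arguments, and the choice to return NA for an empty window are assumptions made for illustration.

```r
# Example data from the question
ms <- c(300, 300, 300, 301, 303, 305, 305, 306, 308, 310, 310, 311, 312,
        314, 315, 315, 316, 316, 316, 317, 318, 320, 320, 321, 322, 324,
        328, 329, 330, 330, 330, 332, 332, 334, 334, 335, 335, 336, 336,
        337, 338, 338, 338, 340, 340, 341, 342, 342, 342, 342)
correct <- c(1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0,
             1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1,
             1, 0, 0, 1, 0, 0, 1, 1, 0, 0)

# Hypothetical helper: for every integer start value s, average `correct`
# over all observations with s <= ms <= s + width - 1
rolling_prop_forward <- function(ms, correct, width = 5) {
  starts <- seq(min(ms), max(ms), by = 1)
  prop <- vapply(starts, function(s) {
    in_window <- ms >= s & ms <= s + width - 1
    # an empty window has no defined mean, so return NA instead of NaN
    if (any(in_window)) mean(correct[in_window]) else NA_real_
  }, numeric(1))
  data.frame(start = starts, end = starts + width - 1, prop_correct = prop)
}

res <- rolling_prop_forward(ms, correct, width = 5)
head(res, 3)
```

Windows near the upper end extend past the largest ms and therefore cover fewer observations, much like the partial windows in a row-based rolling function.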
Answer using zoo::rollapply: answered Nov 13 '18 at 21:54 by ANG, edited Nov 14 '18 at 12:00
Answer using cut: answered Nov 13 '18 at 21:03 by pooja p
Answer using base R: answered Nov 13 '18 at 22:20 by arg0naut, edited Nov 13 '18 at 22:27