R: replacing missing values with the mean of the surrounding values

My dataset looks like the following (let's call it "a"):



date value
2013-01-01 12.2
2013-01-02 NA
2013-01-03 NA
2013-01-04 16.8
2013-01-05 10.1
2013-01-06 NA
2013-01-07 12.0


I would like to replace each NA with the mean of the closest surrounding values (the previous and the next non-missing values in the series).



I tried the following but I am not convinced by the output...



library(zoo)
miss.val <- which(is.na(a$value))
z <- zoo(a$value, a$date)
z.corr <- na.approx(z)
z.corr[(miss.val - 1):(miss.val + 1), ]

  • Have you thought about imputation?
    – cianius
    Sep 4 '13 at 11:36

r time-series zoo






asked Sep 4 '13 at 11:34 by user2165907

2 Answers

accepted

Using na.locf (Last Observation Carried Forward) from package zoo:



R> library("zoo")
R> x <- c(12.2, NA, NA, 16.8, 10.1, NA, 12.0)
R> (na.locf(x) + rev(na.locf(rev(x))))/2
[1] 12.20 14.50 14.50 16.80 10.10 11.05 12.00


(This does not work if the first or last element of x is NA: by default na.locf drops leading NAs, so the two vectors end up with different lengths.)
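
One way to cover that edge case is sketched below. It assumes that a leading or trailing NA should simply take the one neighbouring value that exists. The names fwd, bwd and filled are illustrative, and the sample vector is made up to have NAs at both ends.

```r
library(zoo)

# Illustrative vector with NAs at both ends
x <- c(NA, 12.2, NA, NA, 16.8, NA)

# na.rm = FALSE keeps leading/trailing NAs, so both fills have length(x)
fwd <- na.locf(x, na.rm = FALSE)                   # previous known value
bwd <- na.locf(x, na.rm = FALSE, fromLast = TRUE)  # next known value

# Average the two fills; where only one neighbour exists, use it alone
filled <- rowMeans(cbind(fwd, bwd), na.rm = TRUE)
filled
```

Interior gaps still get the mean of the previous and next known values, so on a vector without NAs at the ends this should agree with the one-liner above.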






  • OK, but I want to replace the NAs with these values inside the "a" dataset.
    – user2165907
    Sep 4 '13 at 12:01










  • @user2165907 All you have to do is take his final line and assign the result back, i.e. x <- (na.locf(x) + rev(na.locf(rev(x))))/2
    – Carl Witthoft
    Sep 4 '13 at 12:06










  • a$value <- (na.locf(x) + rev(na.locf(rev(x))))/2
    – user2165907
    Sep 4 '13 at 12:19
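
Putting the comments above together, here is a minimal sketch of writing the result back into the data frame. The data frame a is rebuilt from the sample in the question, and storing date as a Date column is an assumption.

```r
library(zoo)

# Sample data from the question; the Date type for `date` is an assumption
a <- data.frame(
  date  = as.Date("2013-01-01") + 0:6,
  value = c(12.2, NA, NA, 16.8, 10.1, NA, 12.0)
)

x <- a$value
# Mean of the forward fill and the backward fill, as in the answer above
a$value <- (na.locf(x) + rev(na.locf(rev(x)))) / 2
a
```

Because the first and last values of the sample are not NA, both fills keep the original length and the assignment lines up row by row.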

You can do this in one line of code with na.ma, the moving-average imputation function of the imputeTS package:

library(imputeTS)
na.ma(yourData, k = 1)

This replaces each missing value with the mean of its closest surrounding values.
You can also set additional parameters:

na.ma(yourData, k = 2, weighting = "simple")

In this case the algorithm takes the next 2 values in each direction. You can also choose a different weighting of the values (for example, so that closer values have more influence).
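
For comparison, the plain (unweighted) version of this idea can be sketched in base R. fill_neighbour_mean is a hypothetical helper written for illustration. It is not part of imputeTS and does not reproduce that package's windowing or weighting rules; it only averages up to k nearest non-missing values on each side of every NA.

```r
# Hypothetical helper, not part of imputeTS: replace each NA with the plain
# mean of up to k nearest non-missing values on each side.
fill_neighbour_mean <- function(x, k = 1) {
  known <- which(!is.na(x))              # positions holding real values
  out <- x
  for (i in which(is.na(x))) {
    left  <- tail(known[known < i], k)   # k closest known positions before i
    right <- head(known[known > i], k)   # k closest known positions after i
    out[i] <- mean(x[c(left, right)])
  }
  out
}

x <- c(12.2, NA, NA, 16.8, 10.1, NA, 12.0)
fill_neighbour_mean(x, k = 1)
```

With k = 1 this is the mean of the previous and next known values, which is what the question asks for. An NA at either end falls back to the neighbours available on the other side.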






answered Sep 4 '13 at 11:45 by rcs (accepted answer, using na.locf)

answered Nov 9 at 18:50 by stats0007 (answer using na.ma from imputeTS)
