Data Cleaning (Flagging) Dead Sensor



I have a large time series (a pandas DataFrame) of wind speed (10-minute averages) that contains erroneous data from a dead sensor. How can these readings be flagged automatically? I tried using a moving average.
An approach other than a moving average would be much appreciated. I have attached a sample data image below.



[Image: sample data]
You did not attach the data. Also share the work you have done (i.e. your code). – Jorge, Nov 15 '18 at 11:35

python pandas






asked Nov 15 '18 at 11:22 by Bhuvan Kumar

1 Answer
There are several ways to deal with this problem. I would start by taking first differences, since a sensor stuck on one value produces differences that are exactly zero:



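%matplotlib inline
import pandas as pd
import numpy as np

# Synthetic random-walk series standing in for the wind speed data
np.random.seed(0)
n = 200
y = np.cumsum(np.random.randn(n))

# Simulate two dead-sensor periods where the reading is stuck
y[100:120] = 2
y[150:160] = 0

ts = pd.Series(y)
# A stuck reading shows up as a run of zeros in the differenced series
ts.diff().plot();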


[Image: plot of the differenced series]



The next step is to find how long the streaks of consecutive zeros are.



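def getZeroStrikeLen(x):
    """Return the lengths of the runs of consecutive True values in x.

    Accepts a boolean array only.
    """
    # Mark index 0 if x starts with True, every index where the value
    # flips, and an end sentinel; every other gap is a run of True.
    res = np.diff(np.where(np.concatenate(([x[0]],
                                           x[:-1] != x[1:],
                                           [True])))[0])[::2]
    return res

# True wherever the series did not change from the previous sample
vec = ts.diff().values == 0
out = getZeroStrikeLen(vec)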
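
To see why the one-liner works, it may help to unpack it on a tiny hand-made array. The sketch below is only an illustration: the function name `run_lengths_of_true` and the intermediate variable names are hypothetical and not part of the code above, and it assumes a non-empty input.

```python
import numpy as np

def run_lengths_of_true(x):
    """Step-by-step equivalent of getZeroStrikeLen for a boolean array."""
    x = np.asarray(x, dtype=bool)
    # True at index 0 if x starts with a True run, True at every index where
    # the value flips, and a final True acting as an end-of-array sentinel.
    marks = np.concatenate(([x[0]], x[:-1] != x[1:], [True]))
    boundaries = np.where(marks)[0]
    # Gaps between consecutive boundaries are run lengths, alternating
    # True-run, False-run, True-run, ... so keep every other one.
    return np.diff(boundaries)[::2]

demo = np.array([False, True, True, False, True])
print(run_lengths_of_true(demo))  # [2 1]
```

Here the two runs of `True` have lengths 2 and 1, which is what the function reports.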


Now if len(out) > 0, the series contains at least one stretch of repeated values, which points to a problem. If you want to go one step further you can have a look at this. It is in R, but it is not that hard to replicate in Python.
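
Since the question asks for flagged samples rather than run lengths, one possible extension is to turn the same idea into a boolean mask aligned with the series. This is a minimal sketch, not part of the original answer: the function name `flag_stuck` and the `min_run` threshold are assumptions, and the threshold would need tuning on real data, where short repeats can occur legitimately if the logger resolution is coarse.

```python
import numpy as np
import pandas as pd

def flag_stuck(ts, min_run=3):
    """Return a boolean Series that is True where ts looks stuck.

    min_run is a hypothetical threshold: the minimum number of
    consecutive zero differences needed before a run is flagged.
    """
    same = ts.diff().eq(0)                   # unchanged vs. previous sample
    run_id = same.ne(same.shift()).cumsum()  # label each run of equal flags
    # Number of zero differences in the run each sample belongs to
    run_len = same.astype(int).groupby(run_id).transform("sum")
    flagged = same & (run_len >= min_run)
    # Also flag the first sample of each stuck run, whose own diff is nonzero
    return flagged | flagged.shift(-1, fill_value=False)

# Same synthetic series as in the answer
np.random.seed(0)
y = np.cumsum(np.random.randn(200))
y[100:120] = 2
y[150:160] = 0
ts = pd.Series(y)

flags = flag_stuck(ts)
print(flags.sum())         # number of flagged samples
cleaned = ts.mask(flags)   # flagged samples replaced by NaN
```

Keeping the mask separate from the data leaves the choice open between dropping, masking, or interpolating the flagged samples.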






answered Nov 15 '18 at 11:45 by user32185
