Find non-overlap instances (5-day stock returns) of rolling window product which meet condition

up vote
0
down vote

favorite

I have a time series of Amazon close price from 31 Dec 2009 to present.

I am trying to find the number of instances where 5-day returns of Amzn falls more than 15%

Date Open High Low Close Adj Close Volume
2009-12-30 138.399994 138.399994 135.279999 136.490005 136.490005 6913200
2009-12-31 137.089996 137.279999 134.520004 134.520004 134.520004 4523000



## to get the 1-day returns 
df['returns'] = df['Close'] / df['Close'].shift(1) 

## to get the rolling 5-day performance 
df['roll']= pd.rolling_apply(df.returns,5,lambda x : x.prod()) 


## filter returns -15% or more 
df2 = df[df['roll']<.85]

While the above outputs a table of rows where 5-day returns is -15% or more, the output of the df2 has overlapping 5-day windows.

1) How can I output in a list the rolling 5-day window as a column in the data frame => for example : ['2009-12-30',2009-12-31','2010-1-1','2010-1-2','2010-1-3']

2) How can I remove overlapping periods in df2. If there is any days in the 5-day window that overlaps with another row, only keep 1 of the row.

edited Nov 9 at 13:23

asked Nov 9 at 12:52

Lko

506

add a comment |

up vote
0
down vote

favorite

I have a time series of Amazon close price from 31 Dec 2009 to present.

I am trying to find the number of instances where 5-day returns of Amzn falls more than 15%

Date Open High Low Close Adj Close Volume
2009-12-30 138.399994 138.399994 135.279999 136.490005 136.490005 6913200
2009-12-31 137.089996 137.279999 134.520004 134.520004 134.520004 4523000



## to get the 1-day returns 
df['returns'] = df['Close'] / df['Close'].shift(1) 

## to get the rolling 5-day performance 
df['roll']= pd.rolling_apply(df.returns,5,lambda x : x.prod()) 


## filter returns -15% or more 
df2 = df[df['roll']<.85]

While the above outputs a table of rows where 5-day returns is -15% or more, the output of the df2 has overlapping 5-day windows.

1) How can I output in a list the rolling 5-day window as a column in the data frame => for example : ['2009-12-30',2009-12-31','2010-1-1','2010-1-2','2010-1-3']

2) How can I remove overlapping periods in df2. If there is any days in the 5-day window that overlaps with another row, only keep 1 of the row.

edited Nov 9 at 13:23

asked Nov 9 at 12:52

Lko

506

add a comment |

up vote
0
down vote

favorite

I have a time series of Amazon close price from 31 Dec 2009 to present.

I am trying to find the number of instances where 5-day returns of Amzn falls more than 15%

Date Open High Low Close Adj Close Volume
2009-12-30 138.399994 138.399994 135.279999 136.490005 136.490005 6913200
2009-12-31 137.089996 137.279999 134.520004 134.520004 134.520004 4523000



## to get the 1-day returns 
df['returns'] = df['Close'] / df['Close'].shift(1) 

## to get the rolling 5-day performance 
df['roll']= pd.rolling_apply(df.returns,5,lambda x : x.prod()) 


## filter returns -15% or more 
df2 = df[df['roll']<.85]

While the above outputs a table of rows where 5-day returns is -15% or more, the output of the df2 has overlapping 5-day windows.

1) How can I output in a list the rolling 5-day window as a column in the data frame => for example : ['2009-12-30',2009-12-31','2010-1-1','2010-1-2','2010-1-3']

2) How can I remove overlapping periods in df2. If there is any days in the 5-day window that overlaps with another row, only keep 1 of the row.

edited Nov 9 at 13:23

asked Nov 9 at 12:52

Lko

506

I have a time series of Amazon close price from 31 Dec 2009 to present.

I am trying to find the number of instances where 5-day returns of Amzn falls more than 15%

Date Open High Low Close Adj Close Volume
2009-12-30 138.399994 138.399994 135.279999 136.490005 136.490005 6913200
2009-12-31 137.089996 137.279999 134.520004 134.520004 134.520004 4523000



## to get the 1-day returns 
df['returns'] = df['Close'] / df['Close'].shift(1) 

## to get the rolling 5-day performance 
df['roll']= pd.rolling_apply(df.returns,5,lambda x : x.prod()) 


## filter returns -15% or more 
df2 = df[df['roll']<.85]

While the above outputs a table of rows where 5-day returns is -15% or more, the output of the df2 has overlapping 5-day windows.

1) How can I output in a list the rolling 5-day window as a column in the data frame => for example : ['2009-12-30',2009-12-31','2010-1-1','2010-1-2','2010-1-3']

2) How can I remove overlapping periods in df2. If there is any days in the 5-day window that overlaps with another row, only keep 1 of the row.

python pandas dataframe machine-learning quantitative-finance

edited Nov 9 at 13:23

asked Nov 9 at 12:52

Lko

506

edited Nov 9 at 13:23

asked Nov 9 at 12:52

Lko

506

edited Nov 9 at 13:23

asked Nov 9 at 12:52

Lko

506

asked Nov 9 at 12:52

Lko

506

asked Nov 9 at 12:52

Lko

506

add a comment |

1 Answer
1

active

oldest

votes

up vote
0
down vote

I'm not 100% sure what you mean in question 1, but if you want to extract all the dates from the dataframe index and put them into a list you can use list(df.index). If you meant that you would like to have a column which contains the dates used in each 5-day rolling window then this is likely to be non-trivial and a bit hacky. Pandas won't allow you just to calculate this with a roll. However here is a working solution

def find_dates(df, center=False):

 dates = df.index.values
 dates_list = np.zeros((dates.shape[0],5), dtype=object)
 if center:
 for i,r in enumerate([2,1,0,-1,-2]):
 dates_list[:,i]=np.roll(dates,r)
 nul_dates_numbers = [0,1,len(dates)-1,len(dates)-2]

 else:
 for i,r in enumerate([4,3,2,1,0]):
 dates_list[:,i]=np.roll(dates,r)
 nul_dates_numbers = [0,1,2,3]

 dates_list = [list(d) if j not in nul_dates_numbers else None for j,d in enumerate(dates_list)]
 return dates_list


# make a quick dataframe
index_leters = 'a b c d e f g h i j'
indexes = index_leters.split(' ')
df = pd.DataFrame('B': list(np.arange(len(indexes))), index=indexes)

center = False # can set to False
#apply rolling function 
df['roll']= pd.rolling_apply(df.B,5,lambda x : x.prod(), center=center) 

# extract index windows (will work on dates)
df['dates'] = find_dates(df, center=center)

2) Assuming that you have data for each date then a simple way of removing overlapping periods is just to slice the dataframe to only keep every 5 rows

## to get the rolling 5-day performance 
df['roll']= pd.rolling_apply(df.returns,5,lambda x : x.prod())
df_nonoverlapping = df[::5]

Also if you would like to have the date assigned to each window as the middle element in your window. ie in the window ['2009-12-30',2009-12-31','2010-1-1','2010-1-2','2010-1-3'] you want the date to be '2010-1-1' then you should use

df['roll']= pd.rolling_apply(df.returns,5,lambda x : x.prod(), center=True)

The way you have it now then date index for each 5 day window will be set as the last one, ie '2010-1-3' in this case.

Finally you should note that you will have a few NaN values in your 'roll' column because you can't calculate the 5 day rolling average at both the start and end of your dataframe. SO expect to have 4 NaN values in your 'roll' column.

edited Nov 9 at 15:36

answered Nov 9 at 13:52

James Fulton

212

The removal of overlapping windows comes after the filtering condition. So after filtering for rolling windows where returns is -10% or more, there could be rows where the periods overlap. I want to remove those periods. df[::5] does not work in this case as it comes before the filtering condition
– Lko
Nov 9 at 13:55

add a comment |

Your Answer

StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53226063%2ffind-non-overlap-instances-5-day-stock-returns-of-rolling-window-product-which%23new-answer', 'question_page');

);

Post as a guest

Name

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

up vote
0
down vote

def find_dates(df, center=False):

 dates = df.index.values
 dates_list = np.zeros((dates.shape[0],5), dtype=object)
 if center:
 for i,r in enumerate([2,1,0,-1,-2]):
 dates_list[:,i]=np.roll(dates,r)
 nul_dates_numbers = [0,1,len(dates)-1,len(dates)-2]

 else:
 for i,r in enumerate([4,3,2,1,0]):
 dates_list[:,i]=np.roll(dates,r)
 nul_dates_numbers = [0,1,2,3]

 dates_list = [list(d) if j not in nul_dates_numbers else None for j,d in enumerate(dates_list)]
 return dates_list


# make a quick dataframe
index_leters = 'a b c d e f g h i j'
indexes = index_leters.split(' ')
df = pd.DataFrame('B': list(np.arange(len(indexes))), index=indexes)

center = False # can set to False
#apply rolling function 
df['roll']= pd.rolling_apply(df.B,5,lambda x : x.prod(), center=center) 

# extract index windows (will work on dates)
df['dates'] = find_dates(df, center=center)

2) Assuming that you have data for each date then a simple way of removing overlapping periods is just to slice the dataframe to only keep every 5 rows

## to get the rolling 5-day performance 
df['roll']= pd.rolling_apply(df.returns,5,lambda x : x.prod())
df_nonoverlapping = df[::5]

df['roll']= pd.rolling_apply(df.returns,5,lambda x : x.prod(), center=True)

The way you have it now then date index for each 5 day window will be set as the last one, ie '2010-1-3' in this case.

edited Nov 9 at 15:36

answered Nov 9 at 13:52

James Fulton

212

The removal of overlapping windows comes after the filtering condition. So after filtering for rolling windows where returns is -10% or more, there could be rows where the periods overlap. I want to remove those periods. df[::5] does not work in this case as it comes before the filtering condition
– Lko
Nov 9 at 13:55

add a comment |

up vote
0
down vote

def find_dates(df, center=False):

 dates = df.index.values
 dates_list = np.zeros((dates.shape[0],5), dtype=object)
 if center:
 for i,r in enumerate([2,1,0,-1,-2]):
 dates_list[:,i]=np.roll(dates,r)
 nul_dates_numbers = [0,1,len(dates)-1,len(dates)-2]

 else:
 for i,r in enumerate([4,3,2,1,0]):
 dates_list[:,i]=np.roll(dates,r)
 nul_dates_numbers = [0,1,2,3]

 dates_list = [list(d) if j not in nul_dates_numbers else None for j,d in enumerate(dates_list)]
 return dates_list


# make a quick dataframe
index_leters = 'a b c d e f g h i j'
indexes = index_leters.split(' ')
df = pd.DataFrame('B': list(np.arange(len(indexes))), index=indexes)

center = False # can set to False
#apply rolling function 
df['roll']= pd.rolling_apply(df.B,5,lambda x : x.prod(), center=center) 

# extract index windows (will work on dates)
df['dates'] = find_dates(df, center=center)

2) Assuming that you have data for each date then a simple way of removing overlapping periods is just to slice the dataframe to only keep every 5 rows

## to get the rolling 5-day performance 
df['roll']= pd.rolling_apply(df.returns,5,lambda x : x.prod())
df_nonoverlapping = df[::5]

df['roll']= pd.rolling_apply(df.returns,5,lambda x : x.prod(), center=True)

The way you have it now then date index for each 5 day window will be set as the last one, ie '2010-1-3' in this case.

edited Nov 9 at 15:36

answered Nov 9 at 13:52

James Fulton

212

The removal of overlapping windows comes after the filtering condition. So after filtering for rolling windows where returns is -10% or more, there could be rows where the periods overlap. I want to remove those periods. df[::5] does not work in this case as it comes before the filtering condition
– Lko
Nov 9 at 13:55

add a comment |

up vote
0
down vote

def find_dates(df, center=False):

 dates = df.index.values
 dates_list = np.zeros((dates.shape[0],5), dtype=object)
 if center:
 for i,r in enumerate([2,1,0,-1,-2]):
 dates_list[:,i]=np.roll(dates,r)
 nul_dates_numbers = [0,1,len(dates)-1,len(dates)-2]

 else:
 for i,r in enumerate([4,3,2,1,0]):
 dates_list[:,i]=np.roll(dates,r)
 nul_dates_numbers = [0,1,2,3]

 dates_list = [list(d) if j not in nul_dates_numbers else None for j,d in enumerate(dates_list)]
 return dates_list


# make a quick dataframe
index_leters = 'a b c d e f g h i j'
indexes = index_leters.split(' ')
df = pd.DataFrame('B': list(np.arange(len(indexes))), index=indexes)

center = False # can set to False
#apply rolling function 
df['roll']= pd.rolling_apply(df.B,5,lambda x : x.prod(), center=center) 

# extract index windows (will work on dates)
df['dates'] = find_dates(df, center=center)

2) Assuming that you have data for each date then a simple way of removing overlapping periods is just to slice the dataframe to only keep every 5 rows

## to get the rolling 5-day performance 
df['roll']= pd.rolling_apply(df.returns,5,lambda x : x.prod())
df_nonoverlapping = df[::5]

df['roll']= pd.rolling_apply(df.returns,5,lambda x : x.prod(), center=True)

The way you have it now then date index for each 5 day window will be set as the last one, ie '2010-1-3' in this case.

edited Nov 9 at 15:36

answered Nov 9 at 13:52

James Fulton

212

def find_dates(df, center=False):

 dates = df.index.values
 dates_list = np.zeros((dates.shape[0],5), dtype=object)
 if center:
 for i,r in enumerate([2,1,0,-1,-2]):
 dates_list[:,i]=np.roll(dates,r)
 nul_dates_numbers = [0,1,len(dates)-1,len(dates)-2]

 else:
 for i,r in enumerate([4,3,2,1,0]):
 dates_list[:,i]=np.roll(dates,r)
 nul_dates_numbers = [0,1,2,3]

 dates_list = [list(d) if j not in nul_dates_numbers else None for j,d in enumerate(dates_list)]
 return dates_list


# make a quick dataframe
index_leters = 'a b c d e f g h i j'
indexes = index_leters.split(' ')
df = pd.DataFrame('B': list(np.arange(len(indexes))), index=indexes)

center = False # can set to False
#apply rolling function 
df['roll']= pd.rolling_apply(df.B,5,lambda x : x.prod(), center=center) 

# extract index windows (will work on dates)
df['dates'] = find_dates(df, center=center)

2) Assuming that you have data for each date then a simple way of removing overlapping periods is just to slice the dataframe to only keep every 5 rows

## to get the rolling 5-day performance 
df['roll']= pd.rolling_apply(df.returns,5,lambda x : x.prod())
df_nonoverlapping = df[::5]

df['roll']= pd.rolling_apply(df.returns,5,lambda x : x.prod(), center=True)

The way you have it now then date index for each 5 day window will be set as the last one, ie '2010-1-3' in this case.

edited Nov 9 at 15:36

answered Nov 9 at 13:52

James Fulton

212

edited Nov 9 at 15:36

answered Nov 9 at 13:52

James Fulton

212

answered Nov 9 at 13:52

James Fulton

212

answered Nov 9 at 13:52

James Fulton

212

The removal of overlapping windows comes after the filtering condition. So after filtering for rolling windows where returns is -10% or more, there could be rows where the periods overlap. I want to remove those periods. df[::5] does not work in this case as it comes before the filtering condition
– Lko
Nov 9 at 13:55

add a comment |

The removal of overlapping windows comes after the filtering condition. So after filtering for rolling windows where returns is -10% or more, there could be rows where the periods overlap. I want to remove those periods. df[::5] does not work in this case as it comes before the filtering condition
– Lko
Nov 9 at 13:55

The removal of overlapping windows comes after the filtering condition. So after filtering for rolling windows where returns is -10% or more, there could be rows where the periods overlap. I want to remove those periods. df[::5] does not work in this case as it comes before the filtering condition
– Lko
Nov 9 at 13:55

add a comment |

draft saved

draft discarded

draft saved

draft discarded

Post as a guest

Name

This page is only for reference, If you need detailed information, please check here

3PdXQ9Ugvz7Z8ZFjveVf,k4 iWS5Ol,YdqYDP7TZ4PJEEqW HL,6KQlBWv,YSzjQ5,UGvJfYimLB7AI

搜尋此網誌

Pfthb