Find non-overlap instances (5-day stock returns) of rolling window product which meet condition
up vote
0
down vote
favorite
I have a time series of Amazon close price from 31 Dec 2009 to present.
I am trying to find the number of instances where 5-day returns of Amzn falls more than 15%
Date Open High Low Close Adj Close Volume
2009-12-30 138.399994 138.399994 135.279999 136.490005 136.490005 6913200
2009-12-31 137.089996 137.279999 134.520004 134.520004 134.520004 4523000
## to get the 1-day returns
df['returns'] = df['Close'] / df['Close'].shift(1)
## to get the rolling 5-day performance
df['roll']= pd.rolling_apply(df.returns,5,lambda x : x.prod())
## filter returns -15% or more
df2 = df[df['roll']<.85]
While the above outputs a table of rows where 5-day returns is -15% or more, the output of the df2 has overlapping 5-day windows.
1) How can I output in a list the rolling 5-day window as a column in the data frame => for example : ['2009-12-30',2009-12-31','2010-1-1','2010-1-2','2010-1-3']
2) How can I remove overlapping periods in df2. If there is any days in the 5-day window that overlaps with another row, only keep 1 of the row.
python pandas dataframe machine-learning quantitative-finance
add a comment |
up vote
0
down vote
favorite
I have a time series of Amazon close price from 31 Dec 2009 to present.
I am trying to find the number of instances where 5-day returns of Amzn falls more than 15%
Date Open High Low Close Adj Close Volume
2009-12-30 138.399994 138.399994 135.279999 136.490005 136.490005 6913200
2009-12-31 137.089996 137.279999 134.520004 134.520004 134.520004 4523000
## to get the 1-day returns
df['returns'] = df['Close'] / df['Close'].shift(1)
## to get the rolling 5-day performance
df['roll']= pd.rolling_apply(df.returns,5,lambda x : x.prod())
## filter returns -15% or more
df2 = df[df['roll']<.85]
While the above outputs a table of rows where 5-day returns is -15% or more, the output of the df2 has overlapping 5-day windows.
1) How can I output in a list the rolling 5-day window as a column in the data frame => for example : ['2009-12-30',2009-12-31','2010-1-1','2010-1-2','2010-1-3']
2) How can I remove overlapping periods in df2. If there is any days in the 5-day window that overlaps with another row, only keep 1 of the row.
python pandas dataframe machine-learning quantitative-finance
add a comment |
up vote
0
down vote
favorite
up vote
0
down vote
favorite
I have a time series of Amazon close price from 31 Dec 2009 to present.
I am trying to find the number of instances where 5-day returns of Amzn falls more than 15%
Date Open High Low Close Adj Close Volume
2009-12-30 138.399994 138.399994 135.279999 136.490005 136.490005 6913200
2009-12-31 137.089996 137.279999 134.520004 134.520004 134.520004 4523000
## to get the 1-day returns
df['returns'] = df['Close'] / df['Close'].shift(1)
## to get the rolling 5-day performance
df['roll']= pd.rolling_apply(df.returns,5,lambda x : x.prod())
## filter returns -15% or more
df2 = df[df['roll']<.85]
While the above outputs a table of rows where 5-day returns is -15% or more, the output of the df2 has overlapping 5-day windows.
1) How can I output in a list the rolling 5-day window as a column in the data frame => for example : ['2009-12-30',2009-12-31','2010-1-1','2010-1-2','2010-1-3']
2) How can I remove overlapping periods in df2. If there is any days in the 5-day window that overlaps with another row, only keep 1 of the row.
python pandas dataframe machine-learning quantitative-finance
I have a time series of Amazon close price from 31 Dec 2009 to present.
I am trying to find the number of instances where 5-day returns of Amzn falls more than 15%
Date Open High Low Close Adj Close Volume
2009-12-30 138.399994 138.399994 135.279999 136.490005 136.490005 6913200
2009-12-31 137.089996 137.279999 134.520004 134.520004 134.520004 4523000
## to get the 1-day returns
df['returns'] = df['Close'] / df['Close'].shift(1)
## to get the rolling 5-day performance
df['roll']= pd.rolling_apply(df.returns,5,lambda x : x.prod())
## filter returns -15% or more
df2 = df[df['roll']<.85]
While the above outputs a table of rows where 5-day returns is -15% or more, the output of the df2 has overlapping 5-day windows.
1) How can I output in a list the rolling 5-day window as a column in the data frame => for example : ['2009-12-30',2009-12-31','2010-1-1','2010-1-2','2010-1-3']
2) How can I remove overlapping periods in df2. If there is any days in the 5-day window that overlaps with another row, only keep 1 of the row.
python pandas dataframe machine-learning quantitative-finance
python pandas dataframe machine-learning quantitative-finance
edited Nov 9 at 13:23
asked Nov 9 at 12:52
Lko
506
506
add a comment |
add a comment |
1 Answer
1
active
oldest
votes
up vote
0
down vote
I'm not 100% sure what you mean in question 1, but if you want to extract all the dates from the dataframe index and put them into a list you can use list(df.index)
. If you meant that you would like to have a column which contains the dates used in each 5-day rolling window then this is likely to be non-trivial and a bit hacky. Pandas won't allow you just to calculate this with a roll. However here is a working solution
def find_dates(df, center=False):
dates = df.index.values
dates_list = np.zeros((dates.shape[0],5), dtype=object)
if center:
for i,r in enumerate([2,1,0,-1,-2]):
dates_list[:,i]=np.roll(dates,r)
nul_dates_numbers = [0,1,len(dates)-1,len(dates)-2]
else:
for i,r in enumerate([4,3,2,1,0]):
dates_list[:,i]=np.roll(dates,r)
nul_dates_numbers = [0,1,2,3]
dates_list = [list(d) if j not in nul_dates_numbers else None for j,d in enumerate(dates_list)]
return dates_list
# make a quick dataframe
index_leters = 'a b c d e f g h i j'
indexes = index_leters.split(' ')
df = pd.DataFrame('B': list(np.arange(len(indexes))), index=indexes)
center = False # can set to False
#apply rolling function
df['roll']= pd.rolling_apply(df.B,5,lambda x : x.prod(), center=center)
# extract index windows (will work on dates)
df['dates'] = find_dates(df, center=center)
2) Assuming that you have data for each date then a simple way of removing overlapping periods is just to slice the dataframe to only keep every 5 rows
## to get the rolling 5-day performance
df['roll']= pd.rolling_apply(df.returns,5,lambda x : x.prod())
df_nonoverlapping = df[::5]
Also if you would like to have the date assigned to each window as the middle element in your window. ie in the window ['2009-12-30',2009-12-31','2010-1-1','2010-1-2','2010-1-3']
you want the date to be '2010-1-1'
then you should use
df['roll']= pd.rolling_apply(df.returns,5,lambda x : x.prod(), center=True)
The way you have it now then date index for each 5 day window will be set as the last one, ie '2010-1-3'
in this case.
Finally you should note that you will have a few NaN values in your 'roll' column because you can't calculate the 5 day rolling average at both the start and end of your dataframe. SO expect to have 4 NaN values in your 'roll' column.
The removal of overlapping windows comes after the filtering condition. So after filtering for rolling windows where returns is -10% or more, there could be rows where the periods overlap. I want to remove those periods. df[::5] does not work in this case as it comes before the filtering condition
– Lko
Nov 9 at 13:55
add a comment |
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
0
down vote
I'm not 100% sure what you mean in question 1, but if you want to extract all the dates from the dataframe index and put them into a list you can use list(df.index)
. If you meant that you would like to have a column which contains the dates used in each 5-day rolling window then this is likely to be non-trivial and a bit hacky. Pandas won't allow you just to calculate this with a roll. However here is a working solution
def find_dates(df, center=False):
dates = df.index.values
dates_list = np.zeros((dates.shape[0],5), dtype=object)
if center:
for i,r in enumerate([2,1,0,-1,-2]):
dates_list[:,i]=np.roll(dates,r)
nul_dates_numbers = [0,1,len(dates)-1,len(dates)-2]
else:
for i,r in enumerate([4,3,2,1,0]):
dates_list[:,i]=np.roll(dates,r)
nul_dates_numbers = [0,1,2,3]
dates_list = [list(d) if j not in nul_dates_numbers else None for j,d in enumerate(dates_list)]
return dates_list
# make a quick dataframe
index_leters = 'a b c d e f g h i j'
indexes = index_leters.split(' ')
df = pd.DataFrame('B': list(np.arange(len(indexes))), index=indexes)
center = False # can set to False
#apply rolling function
df['roll']= pd.rolling_apply(df.B,5,lambda x : x.prod(), center=center)
# extract index windows (will work on dates)
df['dates'] = find_dates(df, center=center)
2) Assuming that you have data for each date then a simple way of removing overlapping periods is just to slice the dataframe to only keep every 5 rows
## to get the rolling 5-day performance
df['roll']= pd.rolling_apply(df.returns,5,lambda x : x.prod())
df_nonoverlapping = df[::5]
Also if you would like to have the date assigned to each window as the middle element in your window. ie in the window ['2009-12-30',2009-12-31','2010-1-1','2010-1-2','2010-1-3']
you want the date to be '2010-1-1'
then you should use
df['roll']= pd.rolling_apply(df.returns,5,lambda x : x.prod(), center=True)
The way you have it now then date index for each 5 day window will be set as the last one, ie '2010-1-3'
in this case.
Finally you should note that you will have a few NaN values in your 'roll' column because you can't calculate the 5 day rolling average at both the start and end of your dataframe. SO expect to have 4 NaN values in your 'roll' column.
The removal of overlapping windows comes after the filtering condition. So after filtering for rolling windows where returns is -10% or more, there could be rows where the periods overlap. I want to remove those periods. df[::5] does not work in this case as it comes before the filtering condition
– Lko
Nov 9 at 13:55
add a comment |
up vote
0
down vote
I'm not 100% sure what you mean in question 1, but if you want to extract all the dates from the dataframe index and put them into a list you can use list(df.index)
. If you meant that you would like to have a column which contains the dates used in each 5-day rolling window then this is likely to be non-trivial and a bit hacky. Pandas won't allow you just to calculate this with a roll. However here is a working solution
def find_dates(df, center=False):
dates = df.index.values
dates_list = np.zeros((dates.shape[0],5), dtype=object)
if center:
for i,r in enumerate([2,1,0,-1,-2]):
dates_list[:,i]=np.roll(dates,r)
nul_dates_numbers = [0,1,len(dates)-1,len(dates)-2]
else:
for i,r in enumerate([4,3,2,1,0]):
dates_list[:,i]=np.roll(dates,r)
nul_dates_numbers = [0,1,2,3]
dates_list = [list(d) if j not in nul_dates_numbers else None for j,d in enumerate(dates_list)]
return dates_list
# make a quick dataframe
index_leters = 'a b c d e f g h i j'
indexes = index_leters.split(' ')
df = pd.DataFrame('B': list(np.arange(len(indexes))), index=indexes)
center = False # can set to False
#apply rolling function
df['roll']= pd.rolling_apply(df.B,5,lambda x : x.prod(), center=center)
# extract index windows (will work on dates)
df['dates'] = find_dates(df, center=center)
2) Assuming that you have data for each date then a simple way of removing overlapping periods is just to slice the dataframe to only keep every 5 rows
## to get the rolling 5-day performance
df['roll']= pd.rolling_apply(df.returns,5,lambda x : x.prod())
df_nonoverlapping = df[::5]
Also if you would like to have the date assigned to each window as the middle element in your window. ie in the window ['2009-12-30',2009-12-31','2010-1-1','2010-1-2','2010-1-3']
you want the date to be '2010-1-1'
then you should use
df['roll']= pd.rolling_apply(df.returns,5,lambda x : x.prod(), center=True)
The way you have it now then date index for each 5 day window will be set as the last one, ie '2010-1-3'
in this case.
Finally you should note that you will have a few NaN values in your 'roll' column because you can't calculate the 5 day rolling average at both the start and end of your dataframe. SO expect to have 4 NaN values in your 'roll' column.
The removal of overlapping windows comes after the filtering condition. So after filtering for rolling windows where returns is -10% or more, there could be rows where the periods overlap. I want to remove those periods. df[::5] does not work in this case as it comes before the filtering condition
– Lko
Nov 9 at 13:55
add a comment |
up vote
0
down vote
up vote
0
down vote
I'm not 100% sure what you mean in question 1, but if you want to extract all the dates from the dataframe index and put them into a list you can use list(df.index)
. If you meant that you would like to have a column which contains the dates used in each 5-day rolling window then this is likely to be non-trivial and a bit hacky. Pandas won't allow you just to calculate this with a roll. However here is a working solution
def find_dates(df, center=False):
dates = df.index.values
dates_list = np.zeros((dates.shape[0],5), dtype=object)
if center:
for i,r in enumerate([2,1,0,-1,-2]):
dates_list[:,i]=np.roll(dates,r)
nul_dates_numbers = [0,1,len(dates)-1,len(dates)-2]
else:
for i,r in enumerate([4,3,2,1,0]):
dates_list[:,i]=np.roll(dates,r)
nul_dates_numbers = [0,1,2,3]
dates_list = [list(d) if j not in nul_dates_numbers else None for j,d in enumerate(dates_list)]
return dates_list
# make a quick dataframe
index_leters = 'a b c d e f g h i j'
indexes = index_leters.split(' ')
df = pd.DataFrame('B': list(np.arange(len(indexes))), index=indexes)
center = False # can set to False
#apply rolling function
df['roll']= pd.rolling_apply(df.B,5,lambda x : x.prod(), center=center)
# extract index windows (will work on dates)
df['dates'] = find_dates(df, center=center)
2) Assuming that you have data for each date then a simple way of removing overlapping periods is just to slice the dataframe to only keep every 5 rows
## to get the rolling 5-day performance
df['roll']= pd.rolling_apply(df.returns,5,lambda x : x.prod())
df_nonoverlapping = df[::5]
Also if you would like to have the date assigned to each window as the middle element in your window. ie in the window ['2009-12-30',2009-12-31','2010-1-1','2010-1-2','2010-1-3']
you want the date to be '2010-1-1'
then you should use
df['roll']= pd.rolling_apply(df.returns,5,lambda x : x.prod(), center=True)
The way you have it now then date index for each 5 day window will be set as the last one, ie '2010-1-3'
in this case.
Finally you should note that you will have a few NaN values in your 'roll' column because you can't calculate the 5 day rolling average at both the start and end of your dataframe. SO expect to have 4 NaN values in your 'roll' column.
I'm not 100% sure what you mean in question 1, but if you want to extract all the dates from the dataframe index and put them into a list you can use list(df.index)
. If you meant that you would like to have a column which contains the dates used in each 5-day rolling window then this is likely to be non-trivial and a bit hacky. Pandas won't allow you just to calculate this with a roll. However here is a working solution
def find_dates(df, center=False):
dates = df.index.values
dates_list = np.zeros((dates.shape[0],5), dtype=object)
if center:
for i,r in enumerate([2,1,0,-1,-2]):
dates_list[:,i]=np.roll(dates,r)
nul_dates_numbers = [0,1,len(dates)-1,len(dates)-2]
else:
for i,r in enumerate([4,3,2,1,0]):
dates_list[:,i]=np.roll(dates,r)
nul_dates_numbers = [0,1,2,3]
dates_list = [list(d) if j not in nul_dates_numbers else None for j,d in enumerate(dates_list)]
return dates_list
# make a quick dataframe
index_leters = 'a b c d e f g h i j'
indexes = index_leters.split(' ')
df = pd.DataFrame('B': list(np.arange(len(indexes))), index=indexes)
center = False # can set to False
#apply rolling function
df['roll']= pd.rolling_apply(df.B,5,lambda x : x.prod(), center=center)
# extract index windows (will work on dates)
df['dates'] = find_dates(df, center=center)
2) Assuming that you have data for each date then a simple way of removing overlapping periods is just to slice the dataframe to only keep every 5 rows
## to get the rolling 5-day performance
df['roll']= pd.rolling_apply(df.returns,5,lambda x : x.prod())
df_nonoverlapping = df[::5]
Also if you would like to have the date assigned to each window as the middle element in your window. ie in the window ['2009-12-30',2009-12-31','2010-1-1','2010-1-2','2010-1-3']
you want the date to be '2010-1-1'
then you should use
df['roll']= pd.rolling_apply(df.returns,5,lambda x : x.prod(), center=True)
The way you have it now then date index for each 5 day window will be set as the last one, ie '2010-1-3'
in this case.
Finally you should note that you will have a few NaN values in your 'roll' column because you can't calculate the 5 day rolling average at both the start and end of your dataframe. SO expect to have 4 NaN values in your 'roll' column.
edited Nov 9 at 15:36
answered Nov 9 at 13:52
James Fulton
212
212
The removal of overlapping windows comes after the filtering condition. So after filtering for rolling windows where returns is -10% or more, there could be rows where the periods overlap. I want to remove those periods. df[::5] does not work in this case as it comes before the filtering condition
– Lko
Nov 9 at 13:55
add a comment |
The removal of overlapping windows comes after the filtering condition. So after filtering for rolling windows where returns is -10% or more, there could be rows where the periods overlap. I want to remove those periods. df[::5] does not work in this case as it comes before the filtering condition
– Lko
Nov 9 at 13:55
The removal of overlapping windows comes after the filtering condition. So after filtering for rolling windows where returns is -10% or more, there could be rows where the periods overlap. I want to remove those periods. df[::5] does not work in this case as it comes before the filtering condition
– Lko
Nov 9 at 13:55
The removal of overlapping windows comes after the filtering condition. So after filtering for rolling windows where returns is -10% or more, there could be rows where the periods overlap. I want to remove those periods. df[::5] does not work in this case as it comes before the filtering condition
– Lko
Nov 9 at 13:55
add a comment |
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53226063%2ffind-non-overlap-instances-5-day-stock-returns-of-rolling-window-product-which%23new-answer', 'question_page');
);
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password