How to perform rolling for loop in a Pandas Dataframe?
I have a pandas df with 3 columns:
Close Top_Barrier Bottom_Barrier
0 441.86 441.964112 426.369888
1 448.95 444.162225 425.227108
2 449.99 446.222271 424.285063
3 449.74 447.947051 423.678282
4 451.97 449.879254 423.029413
...
996 436.97 446.468790 426.600543
997 438.16 446.461401 426.599265
998 437.00 446.093899 426.641434
999 437.52 446.024365 426.631635
1000 437.75 446.114093 426.715907
Objective:
For every row, I need to test if any of the next 30 rows Close price touches the top or bottom barrier (from row 0), eg, start from row index 0, test if Close price (441.86) is greater than Top_Barrier (441.96) or lower than Bottom_Barrier (426.36), if it is greater than Top_Barrier, return 1, if it is lower than Bottom_Barrier, return -1. Else, loop to the next row, eg, at index 1, Close price is 448.95, but it is still being tested against barrier price from index 0, ie, Top_Barrier of 441.96, Bottom_Barrier of 426.36. This loop continue until index 29 if Close price never touches the barriers - return 0 if that's the case. Next rolling loop start from index 1 until 30, etc.
Attempts:
I tried using .rolling.apply with the following function but I just could not resolve the errors. Happy to explore any other methods as long as it achieve my objective stated above. Thanks!
def tbl_rolling(x):
start_i = x.index[0]
for i in range(len(x)):
# the barrier freeze at index 0
if x.loc[i, 'Close'] > x.loc[start_i, 'Top_Barrier']:
return 1
elif x.loc[i, 'Close'] < x.loc[start_i, 'Bottom_Barrier']:
return -1
return 0
The following then throws IndexingError: Too many indexers
test = df.rolling(30).apply(tbl_rolling, raw=False)
python pandas for-loop apply rolling
add a comment |
I have a pandas df with 3 columns:
Close Top_Barrier Bottom_Barrier
0 441.86 441.964112 426.369888
1 448.95 444.162225 425.227108
2 449.99 446.222271 424.285063
3 449.74 447.947051 423.678282
4 451.97 449.879254 423.029413
...
996 436.97 446.468790 426.600543
997 438.16 446.461401 426.599265
998 437.00 446.093899 426.641434
999 437.52 446.024365 426.631635
1000 437.75 446.114093 426.715907
Objective:
For every row, I need to test if any of the next 30 rows Close price touches the top or bottom barrier (from row 0), eg, start from row index 0, test if Close price (441.86) is greater than Top_Barrier (441.96) or lower than Bottom_Barrier (426.36), if it is greater than Top_Barrier, return 1, if it is lower than Bottom_Barrier, return -1. Else, loop to the next row, eg, at index 1, Close price is 448.95, but it is still being tested against barrier price from index 0, ie, Top_Barrier of 441.96, Bottom_Barrier of 426.36. This loop continue until index 29 if Close price never touches the barriers - return 0 if that's the case. Next rolling loop start from index 1 until 30, etc.
Attempts:
I tried using .rolling.apply with the following function but I just could not resolve the errors. Happy to explore any other methods as long as it achieve my objective stated above. Thanks!
def tbl_rolling(x):
start_i = x.index[0]
for i in range(len(x)):
# the barrier freeze at index 0
if x.loc[i, 'Close'] > x.loc[start_i, 'Top_Barrier']:
return 1
elif x.loc[i, 'Close'] < x.loc[start_i, 'Bottom_Barrier']:
return -1
return 0
The following then throws IndexingError: Too many indexers
test = df.rolling(30).apply(tbl_rolling, raw=False)
python pandas for-loop apply rolling
What happens if a close prices is Above the Top_Barrier for one records and below the bottom barrier for another record within the thirty days? What do you want to return?
– Scott Boston
Nov 13 '18 at 14:42
Sorry, I should have mentioned this - the for loop shall terminate when it touches any of the top or bottom barrier.
– Kok Wei Hoo
Nov 13 '18 at 15:01
Ultimately the expected outcome is to have a column of labels (1, -1, or 0) indicating that for each row, whether the Close price within the next 30 rows, first touches the top (1), or the bottom (-1), or it stays within the barriers throughout (0).
– Kok Wei Hoo
Nov 13 '18 at 15:15
add a comment |
I have a pandas df with 3 columns:
Close Top_Barrier Bottom_Barrier
0 441.86 441.964112 426.369888
1 448.95 444.162225 425.227108
2 449.99 446.222271 424.285063
3 449.74 447.947051 423.678282
4 451.97 449.879254 423.029413
...
996 436.97 446.468790 426.600543
997 438.16 446.461401 426.599265
998 437.00 446.093899 426.641434
999 437.52 446.024365 426.631635
1000 437.75 446.114093 426.715907
Objective:
For every row, I need to test if any of the next 30 rows Close price touches the top or bottom barrier (from row 0), eg, start from row index 0, test if Close price (441.86) is greater than Top_Barrier (441.96) or lower than Bottom_Barrier (426.36), if it is greater than Top_Barrier, return 1, if it is lower than Bottom_Barrier, return -1. Else, loop to the next row, eg, at index 1, Close price is 448.95, but it is still being tested against barrier price from index 0, ie, Top_Barrier of 441.96, Bottom_Barrier of 426.36. This loop continue until index 29 if Close price never touches the barriers - return 0 if that's the case. Next rolling loop start from index 1 until 30, etc.
Attempts:
I tried using .rolling.apply with the following function but I just could not resolve the errors. Happy to explore any other methods as long as it achieve my objective stated above. Thanks!
def tbl_rolling(x):
start_i = x.index[0]
for i in range(len(x)):
# the barrier freeze at index 0
if x.loc[i, 'Close'] > x.loc[start_i, 'Top_Barrier']:
return 1
elif x.loc[i, 'Close'] < x.loc[start_i, 'Bottom_Barrier']:
return -1
return 0
The following then throws IndexingError: Too many indexers
test = df.rolling(30).apply(tbl_rolling, raw=False)
python pandas for-loop apply rolling
I have a pandas df with 3 columns:
Close Top_Barrier Bottom_Barrier
0 441.86 441.964112 426.369888
1 448.95 444.162225 425.227108
2 449.99 446.222271 424.285063
3 449.74 447.947051 423.678282
4 451.97 449.879254 423.029413
...
996 436.97 446.468790 426.600543
997 438.16 446.461401 426.599265
998 437.00 446.093899 426.641434
999 437.52 446.024365 426.631635
1000 437.75 446.114093 426.715907
Objective:
For every row, I need to test if any of the next 30 rows Close price touches the top or bottom barrier (from row 0), eg, start from row index 0, test if Close price (441.86) is greater than Top_Barrier (441.96) or lower than Bottom_Barrier (426.36), if it is greater than Top_Barrier, return 1, if it is lower than Bottom_Barrier, return -1. Else, loop to the next row, eg, at index 1, Close price is 448.95, but it is still being tested against barrier price from index 0, ie, Top_Barrier of 441.96, Bottom_Barrier of 426.36. This loop continue until index 29 if Close price never touches the barriers - return 0 if that's the case. Next rolling loop start from index 1 until 30, etc.
Attempts:
I tried using .rolling.apply with the following function but I just could not resolve the errors. Happy to explore any other methods as long as it achieve my objective stated above. Thanks!
def tbl_rolling(x):
start_i = x.index[0]
for i in range(len(x)):
# the barrier freeze at index 0
if x.loc[i, 'Close'] > x.loc[start_i, 'Top_Barrier']:
return 1
elif x.loc[i, 'Close'] < x.loc[start_i, 'Bottom_Barrier']:
return -1
return 0
The following then throws IndexingError: Too many indexers
test = df.rolling(30).apply(tbl_rolling, raw=False)
python pandas for-loop apply rolling
python pandas for-loop apply rolling
asked Nov 13 '18 at 13:52
Kok Wei HooKok Wei Hoo
11
11
What happens if a close prices is Above the Top_Barrier for one records and below the bottom barrier for another record within the thirty days? What do you want to return?
– Scott Boston
Nov 13 '18 at 14:42
Sorry, I should have mentioned this - the for loop shall terminate when it touches any of the top or bottom barrier.
– Kok Wei Hoo
Nov 13 '18 at 15:01
Ultimately the expected outcome is to have a column of labels (1, -1, or 0) indicating that for each row, whether the Close price within the next 30 rows, first touches the top (1), or the bottom (-1), or it stays within the barriers throughout (0).
– Kok Wei Hoo
Nov 13 '18 at 15:15
add a comment |
What happens if a close prices is Above the Top_Barrier for one records and below the bottom barrier for another record within the thirty days? What do you want to return?
– Scott Boston
Nov 13 '18 at 14:42
Sorry, I should have mentioned this - the for loop shall terminate when it touches any of the top or bottom barrier.
– Kok Wei Hoo
Nov 13 '18 at 15:01
Ultimately the expected outcome is to have a column of labels (1, -1, or 0) indicating that for each row, whether the Close price within the next 30 rows, first touches the top (1), or the bottom (-1), or it stays within the barriers throughout (0).
– Kok Wei Hoo
Nov 13 '18 at 15:15
What happens if a close prices is Above the Top_Barrier for one records and below the bottom barrier for another record within the thirty days? What do you want to return?
– Scott Boston
Nov 13 '18 at 14:42
What happens if a close prices is Above the Top_Barrier for one records and below the bottom barrier for another record within the thirty days? What do you want to return?
– Scott Boston
Nov 13 '18 at 14:42
Sorry, I should have mentioned this - the for loop shall terminate when it touches any of the top or bottom barrier.
– Kok Wei Hoo
Nov 13 '18 at 15:01
Sorry, I should have mentioned this - the for loop shall terminate when it touches any of the top or bottom barrier.
– Kok Wei Hoo
Nov 13 '18 at 15:01
Ultimately the expected outcome is to have a column of labels (1, -1, or 0) indicating that for each row, whether the Close price within the next 30 rows, first touches the top (1), or the bottom (-1), or it stays within the barriers throughout (0).
– Kok Wei Hoo
Nov 13 '18 at 15:15
Ultimately the expected outcome is to have a column of labels (1, -1, or 0) indicating that for each row, whether the Close price within the next 30 rows, first touches the top (1), or the bottom (-1), or it stays within the barriers throughout (0).
– Kok Wei Hoo
Nov 13 '18 at 15:15
add a comment |
1 Answer
1
active
oldest
votes
You can try something like this if your dataset isn't very big:
df = df.reset_index().assign(key=1)
def f(x):
cond1 = x['Close_x'] > x['Top_Barrier_y'].max()
cond2 = x['Close_x'] < x['Bottom_Barrier_y'].min()
return np.select([cond1,cond2],[1,-1], default=0)[0]
df.merge(df, on='key').query('index_y <= index_x').groupby('index_x').apply(f)
Output:
index_x
0 0
1 1
2 1
3 1
4 1
996 0
997 0
998 0
999 0
1000 0
dtype: int64
The full df has about 24k rows. While I am still trying to understand how your method works, may I know if your method has the "rolling" effect? ie, .rolling(30)...
– Kok Wei Hoo
Nov 13 '18 at 15:50
add a comment |
Your Answer
StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53282558%2fhow-to-perform-rolling-for-loop-in-a-pandas-dataframe%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
You can try something like this if your dataset isn't very big:
df = df.reset_index().assign(key=1)
def f(x):
cond1 = x['Close_x'] > x['Top_Barrier_y'].max()
cond2 = x['Close_x'] < x['Bottom_Barrier_y'].min()
return np.select([cond1,cond2],[1,-1], default=0)[0]
df.merge(df, on='key').query('index_y <= index_x').groupby('index_x').apply(f)
Output:
index_x
0 0
1 1
2 1
3 1
4 1
996 0
997 0
998 0
999 0
1000 0
dtype: int64
The full df has about 24k rows. While I am still trying to understand how your method works, may I know if your method has the "rolling" effect? ie, .rolling(30)...
– Kok Wei Hoo
Nov 13 '18 at 15:50
add a comment |
You can try something like this if your dataset isn't very big:
df = df.reset_index().assign(key=1)
def f(x):
cond1 = x['Close_x'] > x['Top_Barrier_y'].max()
cond2 = x['Close_x'] < x['Bottom_Barrier_y'].min()
return np.select([cond1,cond2],[1,-1], default=0)[0]
df.merge(df, on='key').query('index_y <= index_x').groupby('index_x').apply(f)
Output:
index_x
0 0
1 1
2 1
3 1
4 1
996 0
997 0
998 0
999 0
1000 0
dtype: int64
The full df has about 24k rows. While I am still trying to understand how your method works, may I know if your method has the "rolling" effect? ie, .rolling(30)...
– Kok Wei Hoo
Nov 13 '18 at 15:50
add a comment |
You can try something like this if your dataset isn't very big:
df = df.reset_index().assign(key=1)
def f(x):
cond1 = x['Close_x'] > x['Top_Barrier_y'].max()
cond2 = x['Close_x'] < x['Bottom_Barrier_y'].min()
return np.select([cond1,cond2],[1,-1], default=0)[0]
df.merge(df, on='key').query('index_y <= index_x').groupby('index_x').apply(f)
Output:
index_x
0 0
1 1
2 1
3 1
4 1
996 0
997 0
998 0
999 0
1000 0
dtype: int64
You can try something like this if your dataset isn't very big:
df = df.reset_index().assign(key=1)
def f(x):
cond1 = x['Close_x'] > x['Top_Barrier_y'].max()
cond2 = x['Close_x'] < x['Bottom_Barrier_y'].min()
return np.select([cond1,cond2],[1,-1], default=0)[0]
df.merge(df, on='key').query('index_y <= index_x').groupby('index_x').apply(f)
Output:
index_x
0 0
1 1
2 1
3 1
4 1
996 0
997 0
998 0
999 0
1000 0
dtype: int64
answered Nov 13 '18 at 15:21
Scott BostonScott Boston
54.7k73056
54.7k73056
The full df has about 24k rows. While I am still trying to understand how your method works, may I know if your method has the "rolling" effect? ie, .rolling(30)...
– Kok Wei Hoo
Nov 13 '18 at 15:50
add a comment |
The full df has about 24k rows. While I am still trying to understand how your method works, may I know if your method has the "rolling" effect? ie, .rolling(30)...
– Kok Wei Hoo
Nov 13 '18 at 15:50
The full df has about 24k rows. While I am still trying to understand how your method works, may I know if your method has the "rolling" effect? ie, .rolling(30)...
– Kok Wei Hoo
Nov 13 '18 at 15:50
The full df has about 24k rows. While I am still trying to understand how your method works, may I know if your method has the "rolling" effect? ie, .rolling(30)...
– Kok Wei Hoo
Nov 13 '18 at 15:50
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53282558%2fhow-to-perform-rolling-for-loop-in-a-pandas-dataframe%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
What happens if a close prices is Above the Top_Barrier for one records and below the bottom barrier for another record within the thirty days? What do you want to return?
– Scott Boston
Nov 13 '18 at 14:42
Sorry, I should have mentioned this - the for loop shall terminate when it touches any of the top or bottom barrier.
– Kok Wei Hoo
Nov 13 '18 at 15:01
Ultimately the expected outcome is to have a column of labels (1, -1, or 0) indicating that for each row, whether the Close price within the next 30 rows, first touches the top (1), or the bottom (-1), or it stays within the barriers throughout (0).
– Kok Wei Hoo
Nov 13 '18 at 15:15