How to perform rolling for loop in a Pandas Dataframe?

I have a pandas df with 3 columns:

 Close Top_Barrier Bottom_Barrier
0 441.86 441.964112 426.369888
1 448.95 444.162225 425.227108
2 449.99 446.222271 424.285063
3 449.74 447.947051 423.678282
4 451.97 449.879254 423.029413
...
996 436.97 446.468790 426.600543
997 438.16 446.461401 426.599265
998 437.00 446.093899 426.641434
999 437.52 446.024365 426.631635
1000 437.75 446.114093 426.715907

Objective:
For every row, I need to test if any of the next 30 rows Close price touches the top or bottom barrier (from row 0), eg, start from row index 0, test if Close price (441.86) is greater than Top_Barrier (441.96) or lower than Bottom_Barrier (426.36), if it is greater than Top_Barrier, return 1, if it is lower than Bottom_Barrier, return -1. Else, loop to the next row, eg, at index 1, Close price is 448.95, but it is still being tested against barrier price from index 0, ie, Top_Barrier of 441.96, Bottom_Barrier of 426.36. This loop continue until index 29 if Close price never touches the barriers - return 0 if that's the case. Next rolling loop start from index 1 until 30, etc.

Attempts:
I tried using .rolling.apply with the following function but I just could not resolve the errors. Happy to explore any other methods as long as it achieve my objective stated above. Thanks!

def tbl_rolling(x):
 start_i = x.index[0]
 for i in range(len(x)):
 # the barrier freeze at index 0
 if x.loc[i, 'Close'] > x.loc[start_i, 'Top_Barrier']:
 return 1
 elif x.loc[i, 'Close'] < x.loc[start_i, 'Bottom_Barrier']:
 return -1
 return 0

The following then throws IndexingError: Too many indexers

test = df.rolling(30).apply(tbl_rolling, raw=False)

asked Nov 13 '18 at 13:52

Kok Wei Hoo

What happens if a close prices is Above the Top_Barrier for one records and below the bottom barrier for another record within the thirty days? What do you want to return?

– Scott Boston
Nov 13 '18 at 14:42

Sorry, I should have mentioned this - the for loop shall terminate when it touches any of the top or bottom barrier.

– Kok Wei Hoo
Nov 13 '18 at 15:01

Ultimately the expected outcome is to have a column of labels (1, -1, or 0) indicating that for each row, whether the Close price within the next 30 rows, first touches the top (1), or the bottom (-1), or it stays within the barriers throughout (0).

– Kok Wei Hoo
Nov 13 '18 at 15:15

add a comment |

I have a pandas df with 3 columns:

 Close Top_Barrier Bottom_Barrier
0 441.86 441.964112 426.369888
1 448.95 444.162225 425.227108
2 449.99 446.222271 424.285063
3 449.74 447.947051 423.678282
4 451.97 449.879254 423.029413
...
996 436.97 446.468790 426.600543
997 438.16 446.461401 426.599265
998 437.00 446.093899 426.641434
999 437.52 446.024365 426.631635
1000 437.75 446.114093 426.715907

def tbl_rolling(x):
 start_i = x.index[0]
 for i in range(len(x)):
 # the barrier freeze at index 0
 if x.loc[i, 'Close'] > x.loc[start_i, 'Top_Barrier']:
 return 1
 elif x.loc[i, 'Close'] < x.loc[start_i, 'Bottom_Barrier']:
 return -1
 return 0

The following then throws IndexingError: Too many indexers

test = df.rolling(30).apply(tbl_rolling, raw=False)

asked Nov 13 '18 at 13:52

Kok Wei Hoo

What happens if a close prices is Above the Top_Barrier for one records and below the bottom barrier for another record within the thirty days? What do you want to return?

– Scott Boston
Nov 13 '18 at 14:42

Sorry, I should have mentioned this - the for loop shall terminate when it touches any of the top or bottom barrier.

– Kok Wei Hoo
Nov 13 '18 at 15:01

Ultimately the expected outcome is to have a column of labels (1, -1, or 0) indicating that for each row, whether the Close price within the next 30 rows, first touches the top (1), or the bottom (-1), or it stays within the barriers throughout (0).

– Kok Wei Hoo
Nov 13 '18 at 15:15

add a comment |

I have a pandas df with 3 columns:

 Close Top_Barrier Bottom_Barrier
0 441.86 441.964112 426.369888
1 448.95 444.162225 425.227108
2 449.99 446.222271 424.285063
3 449.74 447.947051 423.678282
4 451.97 449.879254 423.029413
...
996 436.97 446.468790 426.600543
997 438.16 446.461401 426.599265
998 437.00 446.093899 426.641434
999 437.52 446.024365 426.631635
1000 437.75 446.114093 426.715907

def tbl_rolling(x):
 start_i = x.index[0]
 for i in range(len(x)):
 # the barrier freeze at index 0
 if x.loc[i, 'Close'] > x.loc[start_i, 'Top_Barrier']:
 return 1
 elif x.loc[i, 'Close'] < x.loc[start_i, 'Bottom_Barrier']:
 return -1
 return 0

The following then throws IndexingError: Too many indexers

test = df.rolling(30).apply(tbl_rolling, raw=False)

asked Nov 13 '18 at 13:52

Kok Wei Hoo

I have a pandas df with 3 columns:

 Close Top_Barrier Bottom_Barrier
0 441.86 441.964112 426.369888
1 448.95 444.162225 425.227108
2 449.99 446.222271 424.285063
3 449.74 447.947051 423.678282
4 451.97 449.879254 423.029413
...
996 436.97 446.468790 426.600543
997 438.16 446.461401 426.599265
998 437.00 446.093899 426.641434
999 437.52 446.024365 426.631635
1000 437.75 446.114093 426.715907

def tbl_rolling(x):
 start_i = x.index[0]
 for i in range(len(x)):
 # the barrier freeze at index 0
 if x.loc[i, 'Close'] > x.loc[start_i, 'Top_Barrier']:
 return 1
 elif x.loc[i, 'Close'] < x.loc[start_i, 'Bottom_Barrier']:
 return -1
 return 0

The following then throws IndexingError: Too many indexers

test = df.rolling(30).apply(tbl_rolling, raw=False)

python pandas for-loop apply rolling

asked Nov 13 '18 at 13:52

Kok Wei Hoo

asked Nov 13 '18 at 13:52

Kok Wei Hoo

asked Nov 13 '18 at 13:52

Kok Wei Hoo

asked Nov 13 '18 at 13:52

Kok Wei Hoo

asked Nov 13 '18 at 13:52

Kok Wei Hoo

What happens if a close prices is Above the Top_Barrier for one records and below the bottom barrier for another record within the thirty days? What do you want to return?

– Scott Boston
Nov 13 '18 at 14:42

Sorry, I should have mentioned this - the for loop shall terminate when it touches any of the top or bottom barrier.

– Kok Wei Hoo
Nov 13 '18 at 15:01

Ultimately the expected outcome is to have a column of labels (1, -1, or 0) indicating that for each row, whether the Close price within the next 30 rows, first touches the top (1), or the bottom (-1), or it stays within the barriers throughout (0).

– Kok Wei Hoo
Nov 13 '18 at 15:15

add a comment |

What happens if a close prices is Above the Top_Barrier for one records and below the bottom barrier for another record within the thirty days? What do you want to return?

– Scott Boston
Nov 13 '18 at 14:42

Sorry, I should have mentioned this - the for loop shall terminate when it touches any of the top or bottom barrier.

– Kok Wei Hoo
Nov 13 '18 at 15:01

Ultimately the expected outcome is to have a column of labels (1, -1, or 0) indicating that for each row, whether the Close price within the next 30 rows, first touches the top (1), or the bottom (-1), or it stays within the barriers throughout (0).

– Kok Wei Hoo
Nov 13 '18 at 15:15

What happens if a close prices is Above the Top_Barrier for one records and below the bottom barrier for another record within the thirty days? What do you want to return?

– Scott Boston
Nov 13 '18 at 14:42

Sorry, I should have mentioned this - the for loop shall terminate when it touches any of the top or bottom barrier.

– Kok Wei Hoo
Nov 13 '18 at 15:01

Ultimately the expected outcome is to have a column of labels (1, -1, or 0) indicating that for each row, whether the Close price within the next 30 rows, first touches the top (1), or the bottom (-1), or it stays within the barriers throughout (0).

– Kok Wei Hoo
Nov 13 '18 at 15:15

add a comment |

1 Answer
1

active

oldest

votes

You can try something like this if your dataset isn't very big:

df = df.reset_index().assign(key=1)

def f(x):
 cond1 = x['Close_x'] > x['Top_Barrier_y'].max()
 cond2 = x['Close_x'] < x['Bottom_Barrier_y'].min()
 return np.select([cond1,cond2],[1,-1], default=0)[0]

df.merge(df, on='key').query('index_y <= index_x').groupby('index_x').apply(f)

Output:

index_x
0 0
1 1
2 1
3 1
4 1
996 0
997 0
998 0
999 0
1000 0
dtype: int64

answered Nov 13 '18 at 15:21

Scott Boston

54.7k73056

The full df has about 24k rows. While I am still trying to understand how your method works, may I know if your method has the "rolling" effect? ie, .rolling(30)...

– Kok Wei Hoo
Nov 13 '18 at 15:50

add a comment |

Your Answer

StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53282558%2fhow-to-perform-rolling-for-loop-in-a-pandas-dataframe%23new-answer', 'question_page');

);

Post as a guest

Name

Required, but never shown

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

You can try something like this if your dataset isn't very big:

df = df.reset_index().assign(key=1)

def f(x):
 cond1 = x['Close_x'] > x['Top_Barrier_y'].max()
 cond2 = x['Close_x'] < x['Bottom_Barrier_y'].min()
 return np.select([cond1,cond2],[1,-1], default=0)[0]

df.merge(df, on='key').query('index_y <= index_x').groupby('index_x').apply(f)

Output:

index_x
0 0
1 1
2 1
3 1
4 1
996 0
997 0
998 0
999 0
1000 0
dtype: int64

answered Nov 13 '18 at 15:21

Scott Boston

54.7k73056

The full df has about 24k rows. While I am still trying to understand how your method works, may I know if your method has the "rolling" effect? ie, .rolling(30)...

– Kok Wei Hoo
Nov 13 '18 at 15:50

add a comment |

You can try something like this if your dataset isn't very big:

df = df.reset_index().assign(key=1)

def f(x):
 cond1 = x['Close_x'] > x['Top_Barrier_y'].max()
 cond2 = x['Close_x'] < x['Bottom_Barrier_y'].min()
 return np.select([cond1,cond2],[1,-1], default=0)[0]

df.merge(df, on='key').query('index_y <= index_x').groupby('index_x').apply(f)

Output:

index_x
0 0
1 1
2 1
3 1
4 1
996 0
997 0
998 0
999 0
1000 0
dtype: int64

answered Nov 13 '18 at 15:21

Scott Boston

54.7k73056

The full df has about 24k rows. While I am still trying to understand how your method works, may I know if your method has the "rolling" effect? ie, .rolling(30)...

– Kok Wei Hoo
Nov 13 '18 at 15:50

add a comment |

You can try something like this if your dataset isn't very big:

df = df.reset_index().assign(key=1)

def f(x):
 cond1 = x['Close_x'] > x['Top_Barrier_y'].max()
 cond2 = x['Close_x'] < x['Bottom_Barrier_y'].min()
 return np.select([cond1,cond2],[1,-1], default=0)[0]

df.merge(df, on='key').query('index_y <= index_x').groupby('index_x').apply(f)

Output:

index_x
0 0
1 1
2 1
3 1
4 1
996 0
997 0
998 0
999 0
1000 0
dtype: int64

answered Nov 13 '18 at 15:21

Scott Boston

54.7k73056

You can try something like this if your dataset isn't very big:

df = df.reset_index().assign(key=1)

def f(x):
 cond1 = x['Close_x'] > x['Top_Barrier_y'].max()
 cond2 = x['Close_x'] < x['Bottom_Barrier_y'].min()
 return np.select([cond1,cond2],[1,-1], default=0)[0]

df.merge(df, on='key').query('index_y <= index_x').groupby('index_x').apply(f)

Output:

index_x
0 0
1 1
2 1
3 1
4 1
996 0
997 0
998 0
999 0
1000 0
dtype: int64

answered Nov 13 '18 at 15:21

Scott Boston

54.7k73056

answered Nov 13 '18 at 15:21

Scott Boston

54.7k73056

answered Nov 13 '18 at 15:21

Scott Boston

54.7k73056

answered Nov 13 '18 at 15:21

Scott Boston

54.7k73056

The full df has about 24k rows. While I am still trying to understand how your method works, may I know if your method has the "rolling" effect? ie, .rolling(30)...

– Kok Wei Hoo
Nov 13 '18 at 15:50

add a comment |

The full df has about 24k rows. While I am still trying to understand how your method works, may I know if your method has the "rolling" effect? ie, .rolling(30)...

– Kok Wei Hoo
Nov 13 '18 at 15:50

The full df has about 24k rows. While I am still trying to understand how your method works, may I know if your method has the "rolling" effect? ie, .rolling(30)...

– Kok Wei Hoo
Nov 13 '18 at 15:50

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Pfthb