how to merge data frames generated in a while loop on a key
I have a while loop, and a DataFrame is generated on each iteration.
I want to merge the DataFrames after every iteration on a key (let's say column id):
while i < 600:
    try:
        player_html = urlopen("https://fantasy.premierleague.com/drf/element-summary/" + str(i))
        player_raw = json.load(player_html)
        fixture = player_raw['fixtures']
        data_df = pd.DataFrame(fixture)
        new_column = data_df.columns
        new_df = pd.DataFrame(columns=new_column)
        new_df = new_df.merge(data_df, on='id')
    except:
        # Write all of the numbers for which there were errors to a file
        errfile = open(player_error, "a")
        errfile.write(str(i) + "\n")
        pass
    print(i)
    i += 1
return new_df
This was my logic, but it is not working. How can I fix this? Thanks.
python dataframe while-loop merge
edited Nov 14 '18 at 5:34
Joel
asked Nov 14 '18 at 3:41
Yun Tae Hwang
1 Answer
My guess is that data_df should be the initial DataFrame to which every subsequent new_df is appended.
However, because it is created inside the while loop, it keeps getting reset.
That being said, assigning data_df before the loop should do the job.
data = pandas.read_excel(*******)
data_df = pd.DataFrame(data)
while i < 600:
    try:
        new_column = data_df.columns
        new_df = pd.DataFrame(columns=new_column)
        new_df = new_df.merge(data_df, on='id')
    except:
        # Write all of the numbers for which there were errors to a file
        errfile = open(player_error, "a")
        errfile.write(str(i) + "\n")
        pass
    print(i)
    i += 1
- In addition, pandas.read_excel returns a DataFrame, so the second line could be redundant.
- Not sure how pandas is imported, but if it was import pandas as pd, use pd only (i.e. on the first line).
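One way to read the suggestion above, as a minimal sketch: keep the accumulating frame outside the loop and merge each new frame into it on id. The fetch_frame helper, the score column, and the per-iteration renaming are assumptions made for illustration and are not part of the original code; the renaming is there only so that repeated merges do not produce clashing column names.

```python
import pandas as pd


def fetch_frame(i):
    """Hypothetical stand-in for the per-iteration read; returns a frame keyed by id."""
    return pd.DataFrame({"id": [1, 2, 3], "score": [i * 10 + k for k in range(3)]})


merged = None  # the accumulator is created once, outside the loop
i = 1
while i < 4:
    frame = fetch_frame(i)
    # Rename the non-key column per iteration so merged columns stay distinct.
    frame = frame.rename(columns={"score": f"score_{i}"})
    # First iteration: nothing to merge with yet, so just keep the frame.
    merged = frame if merged is None else merged.merge(frame, on="id", how="outer")
    i += 1

print(merged)
```

Whether an outer merge is the right join depends on the data; it is used here only so that no id is dropped if one frame lacks it.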
Actually, I edited my question. Since I am doing web scraping (I used pd.read_excel only to make the question simpler), I have to keep the read inside the loop to go through each i.
– Yun Tae Hwang
Nov 14 '18 at 4:52
Regardless of what you try to read, it still doesn't change that the initial pd.DataFrame must be assigned outside the loop. You are merging data_df into new_df, but new_df = pd.DataFrame(columns=new_column) is replacing the merged df with a new, empty (header-only) DataFrame.
– Chris
Nov 14 '18 at 5:11
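The effect described in this comment can be reproduced in isolation. The sketch below uses made-up data (the opponent column and its values are assumptions) and shows that merging a header-only frame with a populated one, using the default inner join, leaves no rows.

```python
import pandas as pd

# Made-up stand-in for one iteration's data.
data_df = pd.DataFrame({"id": [1, 2], "opponent": ["A", "B"]})

# Same step as in the question: a fresh frame with only the column headers.
new_df = pd.DataFrame(columns=data_df.columns)

# merge defaults to how="inner", so only ids present in BOTH frames survive.
# new_df has no rows, so nothing matches.
result = new_df.merge(data_df, on="id")

print(len(result))
```

Because new_df is rebuilt on every pass through the loop, anything merged in an earlier iteration is discarded as well.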
I understand that, but reading or getting the information is itself a looping process. The while loop is needed for scraping the info too.
– Yun Tae Hwang
Nov 14 '18 at 12:43
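If the fetch has to stay inside the loop, one possible pattern is to fetch inside the loop but keep the container outside it, then combine once at the end. This is only a sketch under assumptions: fetch_fixtures is a hypothetical stand-in for the urlopen and json.load step, the data is invented, and pd.concat (stacking rows) is swapped in for merge on the guess that each iteration returns rows with the same columns.

```python
import pandas as pd


def fetch_fixtures(i):
    """Hypothetical stand-in for the scrape; fails for one id to mimic a bad page."""
    if i == 2:
        raise ValueError("no data for this id")
    return [{"id": i * 100 + k, "player": i, "event": k} for k in range(2)]


frames = []  # collected across iterations, so it must live outside the loop
failed = []  # ids that raised, instead of writing to an error file
i = 1
while i < 4:
    try:
        frames.append(pd.DataFrame(fetch_fixtures(i)))
    except ValueError:
        failed.append(i)
    i += 1

# Combine once after the loop rather than rebuilding the result every pass.
all_fixtures = pd.concat(frames, ignore_index=True)
print(all_fixtures)
print(failed)
```

If the frames really carry different columns for the same ids, the collected list can be merged on id after the loop instead of concatenated.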
answered Nov 14 '18 at 3:57
Chris