Looping over an error to scrape a page in Python
I'd like to run a loop in Python that does one thing if an error occurs and something else otherwise.
This is a general question, but in my particular application, I'm scraping information from a web page; navigating to the next page; and repeating the loop until there are no more pages left to scrape. So the terminating condition is an error telling me there are no more pages left.
For example:
no_more_pages = False
while no_more_pages == False:
if link[-1].find('a')['href'] is False:
no_more_pages = True
else:
current_link = link[-1].find('a')['href']
Obviously the syntax here is wrong. If someone could point me in the right direction, that'd be very helpful.
Tags: python, loops
asked Nov 15 '18 at 12:48 by Barton, edited Nov 15 '18 at 13:18
The syntax is valid, there is no loop and no error. What problem are you trying to solve?
– MisterMiyagi
Nov 15 '18 at 12:52
@MisterMiyagi hopefully my edits have clarified the issue
– Barton
Nov 15 '18 at 13:19
2 Answers
Not sure I completely get your question, but you could try a try/except block. If evaluating the condition of the if statement can raise an error, put that code inside a try block; when an error is raised there, the except clause runs instead:
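    try:
        # Assuming link holds Beautiful Soup tags: find('a') returns None when
        # there is no <a> (TypeError on ['href']), a missing href raises KeyError,
        # and an empty link list raises IndexError.
        current_link = link[-1].find('a')['href']
    except (IndexError, TypeError, KeyError):
        no_more_pages = True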
answered Nov 15 '18 at 13:01 by specbug
If you're not completely sure, you shouldn't give an answer. Instead, you should first ask OP questions to clarify what's bothering you. And then answer.
– Muhammad Ahmad
Nov 15 '18 at 13:06
@specbug this is exactly what I was after. Many thanks and sorry for my confusing question
– Barton
Nov 15 '18 at 13:12
@Barton sure just mark the answer correct so others won't think I was just spitballing here!
– specbug
Nov 15 '18 at 13:14
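For the general "loop until an error says there is nothing left" pattern asked about in the question, a minimal sketch might look like the following. It is not the asker's scraper: the `PAGES` dictionary and the `scrape_all` function are hypothetical stand-ins for fetching and parsing real pages, and a missing `"next"` key plays the role of the error that ends the loop.

```python
# Hypothetical in-memory "site": each page lists items and may link to a next page.
PAGES = {
    "/page1": {"items": ["a", "b"], "next": "/page2"},
    "/page2": {"items": ["c"], "next": "/page3"},
    "/page3": {"items": ["d", "e"]},  # last page: no "next" key
}


def scrape_all(start_url):
    """Collect items from every page, following "next" links until one is missing."""
    items = []
    current_link = start_url
    while True:
        page = PAGES[current_link]  # stand-in for fetching and parsing the page
        items.extend(page["items"])
        try:
            # Raises KeyError on the last page, which has no "next" entry.
            current_link = page["next"]
        except KeyError:
            break  # the error is the terminating condition
    return items


print(scrape_all("/page1"))
```

Catching only the exception that signals "no more pages" (here `KeyError`) means unrelated failures still surface instead of silently ending the loop.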
Looks like you're using Beautiful Soup. In that case, you can use .has_attr('href') if you're trying to check for the href attribute, which seems odd. If you just want to check whether a link is present, check whether the child tag exists on the "link" you have defined.
answered Nov 15 '18 at 12:53 by Pythonista
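The "check whether the child tag exists" idea can be sketched without relying on an exception. Beautiful Soup is a third-party package, so this example swaps in the standard library's `xml.etree.ElementTree`, whose `Element.find` likewise returns `None` when no child matches; with Beautiful Soup the analogous checks would be `tag.find('a') is None` and `tag.has_attr('href')`. The markup fragments and the `next_link` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical pagination fragments (well-formed, so ElementTree can parse them).
WITH_NEXT = "<ul><li>1</li><li><a href='/page2'>next</a></li></ul>"
WITHOUT_NEXT = "<ul><li>1</li><li>2</li></ul>"  # last item has no <a>
NO_HREF = "<ul><li><a>next</a></li></ul>"       # <a> has no href attribute


def next_link(fragment):
    """Return the href of the <a> inside the last <li>, or None if there is none."""
    items = ET.fromstring(fragment).findall("li")
    if not items:
        return None
    anchor = items[-1].find("a")  # None when the last item has no <a> child
    if anchor is None:
        return None
    return anchor.get("href")  # None when the <a> has no href attribute


for fragment in (WITH_NEXT, WITHOUT_NEXT, NO_HREF):
    print(next_link(fragment))
```

A scraping loop could then keep going while `next_link(...)` returns a value and stop as soon as it returns `None`.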