Looping over an error to scrape a page in Python


















I'd like to write a loop in Python that does one thing if an error occurs and something else otherwise.



This is a general question, but in my particular application, I'm scraping information from a web page; navigating to the next page; and repeating the loop until there are no more pages left to scrape. So the terminating condition is an error telling me there are no more pages left.



For example:



no_more_pages = False
while no_more_pages == False:
    if link[-1].find('a')['href'] is False:
        no_more_pages = True
    else:
        current_link = link[-1].find('a')['href']


Obviously the syntax here is wrong. If someone could point me in the right direction, that'd be very helpful.


































  • The syntax is valid, there is no loop and no error. What problem are you trying to solve?

    – MisterMiyagi
    Nov 15 '18 at 12:52












  • @MisterMiyagi hopefully my edits have clarified the issue

    – Barton
    Nov 15 '18 at 13:19

















Tags: python, loops














asked Nov 15 '18 at 12:48 by Barton; edited Nov 15 '18 at 13:18 by Barton












2 Answers














I'm not sure I completely understand your question, but you could try a try/except block. Place the code that may fail inside the try block; if it raises an error, the except clause runs instead:



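try:
    # Assuming link is a list of Beautiful Soup tags, this raises IndexError (empty list),
    # TypeError (find() returned None) or KeyError (no href attribute) when there is no next link.
    current_link = link[-1].find('a')['href']
except (IndexError, TypeError, KeyError):
    no_more_pages = True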
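
To show how the try/except fits into the whole loop, here is a minimal sketch. It does not touch the network: `PAGES` and `scrape_all` are hypothetical stand-ins for the real fetching and parsing, and the "no more pages" error is simulated by a missing dictionary key.

```python
# Hypothetical stand-in for a site: each page maps to its "next" link, if any.
PAGES = {
    "/page/1": {"next": "/page/2"},
    "/page/2": {"next": "/page/3"},
    "/page/3": {},  # last page: there is no "next" entry
}


def scrape_all(start):
    """Follow "next" links from start until looking one up raises an error."""
    visited = []
    current_link = start
    while True:
        page = PAGES[current_link]    # stand-in for fetching and parsing the page
        visited.append(current_link)  # stand-in for scraping information from it
        try:
            current_link = page["next"]
        except KeyError:
            # The error is the terminating condition: no more pages left.
            break
    return visited


print(scrape_all("/page/1"))  # ['/page/1', '/page/2', '/page/3']
```

Using `while True` with `break` avoids the separate `no_more_pages` flag, but keeping the flag works just as well. Catching only the exception you expect (here `KeyError`) means unrelated bugs still surface instead of silently ending the loop.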




























  • If you're not completely sure, you shouldn't give an answer. Instead, you should first ask OP questions to clarify what's bothering you. And then answer.

    – Muhammad Ahmad
    Nov 15 '18 at 13:06











  • @specbug this is exactly what I was after. Many thanks and sorry for my confusing question

    – Barton
    Nov 15 '18 at 13:12











  • @Barton sure, just mark the answer as correct so others won't think I was just spitballing here!

    – specbug
    Nov 15 '18 at 13:14
































It looks like you're using Beautiful Soup. In that case, you can use .has_attr('href') to check for the href attribute, although checking for that seems odd. If you just want to know whether a link is present, check whether the child tag exists on the "link" you have defined.
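
A rough sketch of both checks, assuming Beautiful Soup 4 is installed. The `li class="next"` markup and the `next_href` helper are made up for illustration, since the question doesn't show how `link` was built.

```python
from bs4 import BeautifulSoup


def next_href(html):
    """Return the href of the "next" link, or None if there isn't one."""
    soup = BeautifulSoup(html, "html.parser")
    # Hypothetical markup: the pager link lives in <li class="next">.
    link = soup.find_all("li", class_="next")
    if not link:
        return None  # no pager element at all
    anchor = link[-1].find("a")
    if anchor is None:
        return None  # the child <a> tag does not exist
    if not anchor.has_attr("href"):
        return None  # the <a> tag exists but has no href attribute
    return anchor["href"]


print(next_href('<ul><li class="next"><a href="/page/2">Next</a></li></ul>'))
print(next_href('<ul><li class="next">Next</li></ul>'))
```

The first call prints `/page/2` and the second prints `None`. Because `find()` returns `None` when nothing matches, the loop can stop on that value without relying on an exception.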





























answered Nov 15 '18 at 13:01 by specbug (the try/except answer)












answered Nov 15 '18 at 12:53 by Pythonista (the .has_attr answer)


























