find_elements_by_xpath() not producing the desired output python selenium scraping










I'm trying to find the tr elements with the class tableone. Here is my code:



browser = webdriver.Chrome(executable_path=path, options=options)
cells = browser.find_elements_by_xpath('//*[@class="tableone"]')


But the cells variable comes back as [], an empty list.



Here is the html of the page:



<tbody class="tableUpper">
<tr class="tableone">
<td><a class="studentName" href="//www.abc.com"> student one</a></td>
<td><a href="//www.abc.com/overview"> <span class="id_one"></span> <span class="long">Place</span> <span class="short">Place</span></a></td>
<td class="hide-s">
<span class="state"></span> <span class="studentState">student_state</span>
</td>
</tr>
<tr class="tableone">..</tr>
<tr class="tableone">..</tr>
<tr class="tableone">..</tr>
<tr class="tableone">..</tr>
</tbody>
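
As a sanity check on the expression itself, here is a minimal sketch that runs the same class test against the markup above with Python's standard-library xml.etree.ElementTree instead of Selenium. ElementTree supports only a subset of XPath and is not the engine the browser uses, so this only shows whether the expression and the posted markup agree. The SAMPLE string is the markup copied from this post, with the elided rows left as "..".

```python
import xml.etree.ElementTree as ET

# Markup copied from the question; the ".." rows are kept exactly as posted.
SAMPLE = """
<tbody class="tableUpper">
<tr class="tableone">
<td><a class="studentName" href="//www.abc.com"> student one</a></td>
<td><a href="//www.abc.com/overview"> <span class="id_one"></span> <span class="long">Place</span> <span class="short">Place</span></a></td>
<td class="hide-s">
<span class="state"></span> <span class="studentState">student_state</span>
</td>
</tr>
<tr class="tableone">..</tr>
<tr class="tableone">..</tr>
<tr class="tableone">..</tr>
<tr class="tableone">..</tr>
</tbody>
"""

root = ET.fromstring(SAMPLE)

# Same test as //*[@class="tableone"], written in ElementTree's XPath subset.
rows = root.findall(".//*[@class='tableone']")
print(len(rows))  # 5

# Attribute values are compared case-sensitively, so "tableOne" matches nothing.
print(len(root.findall(".//*[@class='tableOne']")))  # 0
```

If the expression matches the posted markup like this, the empty result in the browser is more likely caused by something outside the expression, for example the page not having finished loading when the lookup runs.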









  • Hello @Praveen, could you include the HTML in your post as text instead of using an image? Also, one thought: you might need to wait for the page to load before getting elements off it, but I'm not really familiar with Selenium, so I can't say that's definitely the problem.

    – Mike
    Nov 12 '18 at 10:54
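
The waiting idea in this comment can be sketched as a small polling helper. This is a hand-rolled illustration, not part of Selenium: the name wait_for_elements and its timeout and poll parameters are made up for the example. Selenium ships its own explicit-wait class, WebDriverWait in selenium.webdriver.support.ui, which serves the same purpose.

```python
import time


def wait_for_elements(find, timeout=10.0, poll=0.5):
    """Call find() until it returns a non-empty list or `timeout` seconds pass.

    `find` is any zero-argument callable returning a list of elements.
    Returns the first non-empty result, or [] if the deadline is reached.
    """
    deadline = time.monotonic() + timeout
    while True:
        found = find()
        if found:
            return found
        if time.monotonic() >= deadline:
            return []
        time.sleep(poll)


# Usage with the browser object from the question (needs a live session):
# cells = wait_for_elements(
#     lambda: browser.find_elements_by_xpath('//*[@class="tableone"]')
# )
```

If the list is still empty after the timeout, waiting alone is probably not the cause and the page source is worth inspecting directly.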












  • <tbody class="tableUpper"> <tr class="tableone"> <td><a class="studentName" href="//www.abc.com"> student one</a></td> <td><a href="//www.abc.com/overview"> <span class="id_one"></span> <span class="long">Place</span> <span class="short">Place</span></a> </td> <td class="hide-s"><span class="state"></span> <span class="studentState">student_state</span> </td> </tr> <tr class="tableone">..</tr> <tr class="tableone">..</tr> <tr class="tableone">..</tr> <tr class="tableone">..</tr> </tbody>

    – Praveen
    Nov 12 '18 at 11:26











  • There is a problem with editing the question. I'm working on it; since I'm new to this site, I'm having some trouble.

    – Praveen
    Nov 12 '18 at 11:27











  • Ah, that's OK. I've just edited your question to put the code inline, so it will show up once you accept my edit.

    – Mike
    Nov 12 '18 at 11:38











  • Thank you for editing the code; I have approved it. Hope to see an answer soon.

    – Praveen
    Nov 12 '18 at 11:41















python-3.x selenium-webdriver xpath






edited Nov 12 '18 at 11:43
Praveen

asked Nov 12 '18 at 10:44
Praveen












1 Answer
Please try this:



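import re

# Select elements whose tag name contains 'tr' and whose class contains 'tableone'.
cells = browser.find_elements_by_xpath("//*[contains(local-name(), 'tr') and contains(@class, 'tableone')]")

for e in cells:
    # The td cells that are direct children of this row.
    insides = e.find_elements_by_xpath("./td")
    for i in insides:
        # Pull the text that follows '">' out of the cell's outer HTML.
        result = re.search('">(.*)</', i.get_attribute("outerHTML"))
        if result:  # skip cells where the pattern does not match
            print(result.group(1))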


This gets all the tr elements with the class tableone, then iterates over each row and collects its td cells. For each td, it runs a regular expression over the cell's outerHTML to pull out the text value.
It's quite unrefined and will probably return some empty strings, so you might need to put more work into the final product.
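
As a regex-free variant of the same row-then-cell walk, here is a minimal sketch that reads each cell's text property instead of parsing outerHTML. It assumes a Selenium version where find_elements(by, value) is available on both the driver and its elements (the find_elements_by_xpath helpers were removed in Selenium 4.3). The function name row_texts and its row_xpath parameter are hypothetical, and the simpler //tr[...] expression is a substitution for the one above, not a correction confirmed against the real page.

```python
def row_texts(driver, row_xpath="//tr[contains(@class, 'tableone')]"):
    """Return one list of stripped cell texts per row matched by row_xpath.

    `driver` is expected to behave like a Selenium WebDriver, i.e. to offer
    find_elements(by, value) returning elements with the same method.
    """
    rows = []
    # "xpath" is the string value of Selenium's By.XPATH constant.
    for row in driver.find_elements("xpath", row_xpath):
        cells = row.find_elements("xpath", "./td")
        # The text property of an element holds its visible text.
        rows.append([cell.text.strip() for cell in cells])
    return rows


# Usage with the browser object from the question (needs a live session):
# for texts in row_texts(browser):
#     print(texts)
```

Because text returns only what is visible on the page, cells hidden by CSS may come back as empty strings here as well.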






edited Nov 13 '18 at 13:29

answered Nov 13 '18 at 13:16
Alichino


























