find_elements_by_xpath() not producing the desired output python selenium scraping

I'm trying to find each tr by its class, tableone. Here is my code:

browser = webdriver.Chrome(executable_path=path, options=options)
cells = browser.find_elements_by_xpath('//*[@class="tableone"]')

But the value of the cells variable is [], an empty list.

Here is the html of the page:

<tbody class="tableUpper">
 <tr class="tableone">
 <td><a class="studentName" href="//www.abc.com"> student one</a></td>
 <td><a href="//www.abc.com/overview"> <span class="id_one"></span> <span class="long">Place</span> <span class="short">Place</span></a></td>
 <td class="hide-s">
 <span class="state"></span> <span class="studentState">student_state</span>
 </td>
 </tr>
 <tr class="tableone">..</tr>
 <tr class="tableone">..</tr>
 <tr class="tableone">..</tr>
 <tr class="tableone">..</tr>
</tbody>

edited Nov 12 '18 at 11:43

asked Nov 12 '18 at 10:44

Praveen

Hello @Praveen, could you include the HTML in your post as text instead of an image? Also, one thought: it might be because you need to wait for the page to load before getting elements off it, but I'm not really familiar with Selenium, so I can't say that's definitely the problem.

– Mike
Nov 12 '18 at 10:54

<tbody class="tableUpper"> <tr class="tableone"> <td><a class="studentName" href="//www.abc.com"> student one</a></td> <td><a href="//www.abc.com/overview"> Place Place</a> </td> <td class="hide-s"> student_state </td> </tr> <tr class="tableone">..</tr> <tr class="tableone">..</tr> <tr class="tableone">..</tr> <tr class="tableone">..</tr> </tbody>

– Praveen
Nov 12 '18 at 11:26

There is a problem with editing the question. I am working on it; since I am new to this site, I am having some trouble.

– Praveen
Nov 12 '18 at 11:27

Ah, that's OK. I've just edited your question, so if you accept my edit, the code will be inline.

– Mike
Nov 12 '18 at 11:38

Thank you for editing the code; I have approved it. Hope to see an answer soon.

– Praveen
Nov 12 '18 at 11:41
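Mike's comment above raises the possibility that the rows are simply not in the DOM yet when the lookup runs. A minimal sketch of that idea, assuming nothing about the real page: poll a lookup function until it returns something or a timeout expires. The helper name wait_for_elements and its parameters are hypothetical; Selenium also ships its own WebDriverWait for this purpose, which may be the better choice in real code.

```python
import time

def wait_for_elements(find, timeout=10.0, poll=0.5):
    """Call find() repeatedly until it returns a non-empty list
    or the timeout (in seconds) has elapsed; return the last result."""
    deadline = time.monotonic() + timeout
    while True:
        found = find()
        if found or time.monotonic() >= deadline:
            return found
        time.sleep(poll)

# Hypothetical usage with the browser object from the question:
# cells = wait_for_elements(
#     lambda: browser.find_elements_by_xpath('//tr[@class="tableone"]'))
```

If the list is still empty after the timeout, the cause is likely something other than load timing, for example the table living inside an iframe.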

python-3.x selenium-webdriver xpath

1 Answer

Please try this:

import re

# Select every tr element whose class attribute contains "tableone"
cells = browser.find_elements_by_xpath("//*[contains(local-name(), 'tr') and contains(@class, 'tableone')]")

for e in cells:
    # The td children of the current row
    insides = e.find_elements_by_xpath("./td")
    for i in insides:
        # Grab whatever sits between a '">' and the last '</' on the same line of the cell's markup
        result = re.search('">(.*)</', i.get_attribute("outerHTML"))
        if result:  # re.search returns None when the pattern does not match
            print(result.group(1))

This gets all the tr elements that have the class tableone, then iterates over each one and collects its td children. It then runs a regular expression over the outerHTML of each td to pull out the text value.
It's quite unrefined and will return some empty strings, I think. You might need to put some more work into the final product.
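As a rough offline check, the markup from the question can be parsed with Python's standard library to see whether a class-based lookup matches it and what text each cell holds. This is only a sketch: SAMPLE_HTML is a trimmed copy of the posted snippet (which happens to be well-formed XML), row_texts is a hypothetical helper, and ElementTree supports only a small subset of XPath, so it is not a substitute for the browser.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the markup posted in the question (one row only).
SAMPLE_HTML = """
<tbody class="tableUpper">
 <tr class="tableone">
  <td><a class="studentName" href="//www.abc.com"> student one</a></td>
  <td><a href="//www.abc.com/overview"> <span class="id_one"></span> <span class="long">Place</span> <span class="short">Place</span></a></td>
  <td class="hide-s">
   <span class="state"></span> <span class="studentState">student_state</span>
  </td>
 </tr>
</tbody>
"""

def row_texts(markup):
    """Return one list of whitespace-normalized cell texts per matching row."""
    root = ET.fromstring(markup)
    rows = []
    for tr in root.findall(".//tr[@class='tableone']"):
        # itertext() walks all nested text, so no regex over raw markup is needed
        cells = [" ".join("".join(td.itertext()).split())
                 for td in tr.findall("./td")]
        rows.append(cells)
    return rows

print(row_texts(SAMPLE_HTML))
# [['student one', 'Place Place', 'student_state']]
```

Since the selector matches the posted markup here, an empty result in the browser more likely points to timing or page structure (such as a frame) than to the XPath itself. In Selenium, the text property of each td element gives a similar result without a regular expression.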

edited Nov 13 '18 at 13:29

answered Nov 13 '18 at 13:16

Alichino

