Unable to parse the href tag in Python
I get the following output from Beautiful Soup:
[Search over 301,944 datasets\n]
I need to extract only the number 301,944 from this. Please guide me on how this can be done. My code so far:
import requests
import re
from bs4 import BeautifulSoup

source = requests.get('https://www.data.gov/').text
soup = BeautifulSoup(source, 'lxml')
#print soup.prettify()
images = soup.find_all('small')
print images
con = images.find_all('a')  # I am unable to get the anchor tag here. It says the anchor tag is not present
print con
#for con in images.find_all('a', href=True):
#    print con
#content = images.split('metrics')
#print content[1]
#images = soup.find_all('a', {'href': re.compile(r'\d+')})
#print images
beautifulsoup
asked Nov 10 at 19:22
user1107731
1 Answer
There is only one <small> tag on the website, and your images variable references it. However, find_all returns a list-like ResultSet rather than a single tag, so calling find_all('a') on it is the wrong way to retrieve the anchor tag.
If you want to retrieve the text from the a tag, you can get it with:
soup.find('small').a.text
Here the find method returns the first small element it encounters on the page. If you use find_all, you get a list of all small elements (but there is only one small tag here).
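As a minimal sketch of the approach above, the snippet below runs against a small inline HTML string instead of the live site, so it works offline. The markup (including the /metrics href) is an assumption about what the page roughly looks like, not a copy of it, and the regular expression step for isolating the number is an addition that the answer itself does not spell out.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical stand-in for the relevant part of the page (assumed markup).
html = '<div><small><a href="/metrics">Search over 301,944 datasets\n</a></small></div>'

# html.parser ships with Python, so no extra parser is needed for this sketch.
soup = BeautifulSoup(html, 'html.parser')

# find() returns the first matching tag; .a is the first <a> inside it.
small = soup.find('small')
link_text = small.a.text

# Pull out the first run of digits and commas, e.g. '301,944'.
match = re.search(r'\d[\d,]*', link_text)
count_text = match.group(0) if match else None

# Optionally convert it to an integer by dropping the thousands separators.
count = int(count_text.replace(',', '')) if count_text else None

print(count_text)
print(count)
```

If the page had no small tag or no anchor inside it, find() or .a would give None, so real scraping code would want to check for that before reading .text.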
answered Nov 11 at 0:04
Dinko Pehar
It's not working. When I do that, it reports:
Traceback (most recent call last):
  File "C:vishwamyscriptsvalue_site.py", line 7, in <module>
    images = soup.find_all('small').a.text
  File "C:\Python27\lib\site-packages\bs4\element.py", line 1884, in __getattr__
    "ResultSet object has no attribute '%s'. You're probably treating a list of items like a single item. " % key
AttributeError: ResultSet object has no attribute 'a'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?
– user1107731
Nov 12 at 11:01
I ran it on my PC and it printed just the text from the anchor tag. I had used find_all() at first, but since there is only one anchor tag inside the small tag, I used find() to retrieve just that one.
– Dinko Pehar
Nov 12 at 11:25
Thank you. Now I got it.
– user1107731
Nov 12 at 15:24
Thanks. Can you please mark the question as complete? Thank you.
– Dinko Pehar
Nov 12 at 17:37
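The error discussed in the comments above can be reproduced offline. The sketch below again uses an assumed inline HTML snippet rather than the real page. It shows that attribute access on the result of find_all() raises AttributeError, while indexing into the result or looping over it works.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the page (an assumption, not the real HTML).
html = '<small><a href="/metrics">Search over 301,944 datasets</a></small>'
soup = BeautifulSoup(html, 'html.parser')

# find_all() returns a list-like ResultSet, even when there is only one match.
smalls = soup.find_all('small')

# Treating the ResultSet like a single tag fails, as in the traceback.
try:
    smalls.a
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# Option 1: index into the ResultSet to get a single tag.
first_text = smalls[0].a.text

# Option 2: loop over every <small> and collect the text of each anchor inside.
all_texts = [a.get_text(strip=True) for s in smalls for a in s.find_all('a')]

print(error_message)
print(first_text)
print(all_texts)
```

Using find() is the shorter choice when only the first match matters; find_all() with a loop is more appropriate when a page may contain several matching tags.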