Why do I get an empty Scrapy response?
I started the Scrapy shell:
scrapy shell -s USER_AGENT='Mozilla/5.0' https://www.gumtree.com/p/property-to-rent/brand-new-modern-studio-flat-%C2%A31056pcm-all-bills-included-in-willesden-green-area/1303463798
Next, I checked the response:
In [5]: response
Out[5]: <405 https://www.gumtree.com/p/property-to-rent/brand-new-modern-studio-flat-%C2%A31056pcm-all-bills-included-in-willesden-green-area/1303463798>
After inspecting the page element, I copied its XPath:
In [6]: response.xpath('//*[@id="ad-title"]').extract()
Out[6]:
The element's outerHTML, copied from the browser:
<h1 itemprop="name" id="ad-title">Brand New Modern Studio Flat £1056pcm | All Bills Included | In Willesden Green area</h1>
Why?
xpath scrapy
If you look closely, the response says 405. This is an error code.
– Tomalak
Nov 11 at 17:10
How to fix that?
– MikiBelavista
Nov 11 at 17:24
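To illustrate the comment above: 405 is the HTTP status "Method Not Allowed", a client-error code, so the body Scrapy received is most likely not the ad page and the XPath has nothing to match. The sketch below uses only Python's standard library to look up what a status code means; `describe_status` is a hypothetical helper name, not part of Scrapy. In the Scrapy shell the numeric code itself is available as `response.status`.

```python
from http import HTTPStatus


def describe_status(code: int) -> str:
    """Return a short description such as '405 Method Not Allowed (client error)'.

    Hypothetical helper for illustration; it only classifies the numeric code.
    """
    status = HTTPStatus(code)
    if 400 <= code < 500:
        kind = "client error"
    elif 500 <= code < 600:
        kind = "server error"
    else:
        kind = "not an error"
    return f"{status.value} {status.phrase} ({kind})"


# The status from the question, then a successful one for comparison.
print(describe_status(405))
print(describe_status(200))
```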
edited Nov 11 at 18:01
asked Nov 11 at 15:47
MikiBelavista
1 Answer
Try to set the user agent to something more realistic, such as: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:63.0) Gecko/20100101 Firefox/63.0.
Some websites do basic validation on the user agent and send you to a special page (or return an error, such as the 405 here) if they detect something unusual.
scrapy shell -s USER_AGENT='Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:63.0) Gecko/20100101 Firefox/63.0' https://www.gumtree.com/p/property-to-rent/brand-new-modern-studio-flat-%C2%A31056pcm-all-bills-included-in-willesden-green-area/1303463798
>>> response.xpath('//*[@id="ad-title"]').extract()
['<h1 itemprop="name" id="ad-title">Brand New Modern Studio Flat £1056pcm | All Bills Included | In Willesden Green area</h1>']
>>>
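If the same user agent should apply to every request in a project rather than to a single shell session, one option is to set it once in the project's settings module. This is a minimal sketch, assuming a standard Scrapy project with a `settings.py` file; `USER_AGENT` is the same setting that the `-s` flag overrides on the command line, and the value is the one suggested above.

```python
# settings.py (assumed location: the settings module of a Scrapy project)

# Browser-like user agent string, copied from the answer above.
# Whether a given site accepts it is not guaranteed; sites may change their checks.
USER_AGENT = (
    "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:63.0) "
    "Gecko/20100101 Firefox/63.0"
)
```

With this in place, `scrapy shell <url>` run from inside the project directory should pick up the setting without the `-s USER_AGENT=...` argument.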
answered Nov 12 at 3:52
Guillaume