Python Requests-HTML Render() - No Content
I'd like to scrape a page whose content seems to be rendered client-side by an app that mounts on an element in the HTML like this:
<div id="app" class="app-mobile-pusher"></div>
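I'm using the render() method from the Requests-HTML Python library like so:
from requests_html import HTMLSession

with HTMLSession() as session:
    p = session.post(login_url, data=payload)  # log in first
    r = session.get(content_url)  # fetch the page behind the login
    r.html.render()  # run the page's JavaScript in headless Chromium
    print(r.text)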
This code returns the page's HTML without any errors, but also without any content (just the HTML tags). Notes:
- I've tried passing timeout arguments to session.get to give the page more time to render before accessing it, along with other syntax variations of the above.
- I also tried adding User-Agent information to the headers, based on this answer, to keep my automated scrape from being rejected.
- The Chromium browser did download when I first ran render().
The lack of any error messages is stumping me, and it is difficult to replicate the context of this request to test on another site.
Any specific suggestions for how to solve this, or ideas for how to go about troubleshooting, would be appreciated. (Python 3.6, macOS)
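One thing that may be worth checking first, offered as a sketch rather than a confirmed fix: r.text is the raw body of the HTTP response, and render() does not modify it. The rendered markup is exposed on r.html (for example r.html.html), and render() accepts a sleep argument that pauses after the initial render so the app has time to populate the page. The helper names below (render_and_read, mount_point_is_empty) and the sleep/timeout values are illustrative assumptions, not part of Requests-HTML. The second helper uses only the standard library to report whether the id="app" element is still empty, which gives a yes/no signal instead of eyeballing the printed HTML.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


def render_and_read(session, url, sleep=2, timeout=20):
    """Fetch url, run its JavaScript, and return the rendered HTML.

    session is assumed to be a requests_html.HTMLSession that has already
    logged in. The sleep and timeout values are guesses; tune them per site.
    """
    r = session.get(url)
    # sleep: seconds to pause after the initial render; timeout: page-load limit.
    r.html.render(sleep=sleep, timeout=timeout)
    # r.text would still be the pre-JavaScript response body, so read r.html.
    return r.html.html


class _MountPointProbe(HTMLParser):
    """Records whether the element with the given id has children or text."""

    def __init__(self, element_id):
        super().__init__()
        self.element_id = element_id
        self.depth = 0  # > 0 while the parser is inside the target element
        self.has_content = False

    def handle_starttag(self, tag, attrs):
        if self.depth:
            self.has_content = True  # any child element counts as content
            if tag not in VOID_TAGS:
                self.depth += 1
        elif tag not in VOID_TAGS and dict(attrs).get("id") == self.element_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and data.strip():
            self.has_content = True


def mount_point_is_empty(html, element_id="app"):
    """Return True if the element has no children and no visible text.

    Also returns True when no element with that id exists in the markup.
    """
    probe = _MountPointProbe(element_id)
    probe.feed(html)
    probe.close()
    return not probe.has_content
```

With the logged-in session from the snippet above, this would be called as html = render_and_read(session, content_url), followed by print(mount_point_is_empty(html)). If the mount point is still empty even with a generous sleep, the headless browser may not be seeing the logged-in state; that is only a guess about the cause, not something the output shown here confirms.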
javascript python-3.x python-requests-html
asked Nov 13 '18 at 0:41
Dyneken