Image crawling using Flask and Beautiful Soup: process does not exit

























I am creating an app that crawls images from a website, using Flask to get the URL from the user. However, the application process does not end and keeps printing messages like the one below in the terminal. What I actually want is to terminate Flask once the images have been crawled.

127.0.0.1 - - [14/Nov/2018 12:00:44] "GET /static/script.js HTTP/1.1" 304 -




import os
import sys
import urllib.request
import requests
from urllib.parse import urljoin
from bs4 import BeautifulSoup
from flask import Flask, render_template, request, redirect

ic = Flask(__name__)

count = 0


@ic.route("/")
def main():
    if count == 1:
        return render_template("index.html", result=str((str(count) + " Image Downloaded !")))
    else:
        return render_template("index.html", result=str((str(count) + " Images Downloaded !")))


@ic.route("/get_images", methods=['POST'])
def get_images():
    _url = request.form['inputURL']
    try:
        global count
        count = 0
        code = requests.get(_url)
        text = code.text
        soup = BeautifulSoup(text)
        for img in soup.findAll('img'):
            count += 1
            if (img.get('src'))[0:4] == 'http':
                src = img.get('src')
            else:
                src = urljoin(_url, img.get('src'))
            download_image(src, count)
        return redirect("http://localhost:5000")
    except requests.exceptions.HTTPError as error:
        return render_template("index.html", result=str(error))


def download_image(url, num):
    try:
        image_name = str(num) + ".png"
        image_path = os.path.join("images/", image_name)
        urllib.request.urlretrieve(url, image_path)
    except ValueError:
        print("Invalid URL !")
    except:
        print("Unknown Exception" + str(sys.exc_info()[0]))


if __name__ == "__main__":
    ic.run()


index.html

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta http-equiv="refresh" content="100;url=http://localhost:5000/"/>
    <title>Image Crawler</title>
    <link href="../static/style.css" rel="stylesheet">
</head>
<body class="body">
<div class="container">
    <div class="header">
        <h3 class="text-muted">Image Crawler</h3>
    </div>

    <div class="jumbotron">
        <form name="myForm" class="form" onsubmit="return checkURL()" method="post" action="/get_images">
            <h1>Enter URL</h1>
            <input type="name" name="inputURL" class="input-text" id="inputURL" placeholder="URL"
                   required autofocus>
            <br>
            <button class="btn" id="btnSubmit" type="submit">Download Photos!</button>
        </form>
    </div>
    <div class="jumbotron">
        <h3> result </h3>
    </div>
</div>

</body>
</html>

































  • Does it download the images when you give it a URL? And do you get output in the terminal like GET / HTTP/1.1 200 after you submit your form?

    – Dinko Pehar
    Nov 14 '18 at 8:51

  • Yes, it downloads the images. But after all the images are downloaded, the process is not terminated.

    – Samrat Shrestha
    Nov 14 '18 at 8:58

  • Can you provide index.html?

    – Dinko Pehar
    Nov 14 '18 at 9:26

  • I think I am having this problem because I am not able to terminate Flask.

    – Samrat Shrestha
    Nov 14 '18 at 9:31
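
If the web form is not essential, one option is to keep the crawl in plain functions and call them from a script, which exits on its own when the last function returns. The sketch below is only an illustration of that idea: it swaps Beautiful Soup for the standard library's `html.parser` so that it has no third-party dependencies, and `ImgSrcCollector` and `find_image_sources` are hypothetical names, not code from the question.

```python
from html.parser import HTMLParser


class ImgSrcCollector(HTMLParser):
    """Collects the src attribute of every <img> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing tags such as <img ... />
        if tag == "img":
            src = dict(attrs).get("src")
            if src:  # ignore <img> tags with a missing or empty src
                self.sources.append(src)


def find_image_sources(html):
    """Return the raw src values of all <img> tags in an HTML string."""
    collector = ImgSrcCollector()
    collector.feed(html)
    return collector.sources
```

A script built this way would fetch the page, pass its text to `find_image_sources`, download each result, and then simply reach the end of the file. No server loop is involved, so there is nothing to terminate.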















python flask beautifulsoup web-crawler














edited Nov 14 '18 at 9:58
Samrat Shrestha

asked Nov 14 '18 at 7:00
Samrat Shrestha
























1 Answer














I tested and reworked your code, and the version below worked for me. There were several bugs, which I fixed. I did not test it with your form, but if you provide and process a valid URL, it should probably work. Comment if something goes wrong.



import os
import sys
import urllib.request
import requests
from urllib.parse import urljoin
from bs4 import BeautifulSoup
from flask import Flask, render_template, request, redirect

ic = Flask(__name__)

count = 0


@ic.route("/", methods=['GET'])
def main():
    if count == 1:
        return render_template("index.html", result=str((str(count) + " Image Downloaded !")))
    else:
        return render_template("index.html", result=str((str(count) + " Images Downloaded !")))


@ic.route("/get_images", methods=['POST', 'GET'])
def get_images():
    _url = 'https://www.bljesak.info'  # PROVIDE URL HERE MANUALLY OR FROM A FORM
    try:
        global count
        count = 0
        code = requests.get(_url)
        text = code.text
        soup = BeautifulSoup(text, 'html.parser')
        for img in soup.findAll('img'):
            src = img.get('src')
            if not src:
                # Skip <img> tags that have no src attribute
                continue
            count += 1
            print(src)
            # urljoin returns absolute URLs unchanged and resolves
            # relative ones against _url, so one call covers both cases
            download_image(urljoin(_url, src), count)
        return redirect("http://localhost:5000")
    except requests.exceptions.HTTPError as error:
        return render_template("index.html", result=str(error))


def download_image(url, num):
    try:
        image_name = str(num) + '.png'
        image_path = os.path.join("images/", image_name)
        # WAIT FOR ALL TO FINISH; IF THERE ARE A LOT OF IMAGES, YOU NEED TO WAIT
        print(image_name, image_path)
        urllib.request.urlretrieve(url, image_path)
    except ValueError:
        print("Invalid URL !")
    except Exception:
        print("Unknown Exception" + str(sys.exc_info()[0]))


if __name__ == "__main__":
    ic.run()


Also, HTML has an input of type url if you want the user to enter a URL. You don't need to use type name (which is not a valid input type anyway).
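
The `src` handling in the code above can also be illustrated on its own. The following is a minimal sketch using only the standard library, assuming that missing `src` attributes should be skipped and that keeping the URL's file extension is preferable to always writing `.png`. `resolve_image_url` and `image_filename` are hypothetical helper names, not part of the answer's code.

```python
import os
from urllib.parse import urljoin, urlparse


def resolve_image_url(page_url, src):
    """Return an absolute URL for an <img> src, or None if src is missing."""
    if not src:
        return None
    # urljoin keeps absolute URLs as they are and resolves relative,
    # root-relative and protocol-relative ones against page_url.
    return urljoin(page_url, src)


def image_filename(url, num):
    """Name the file after its counter, keeping the URL's extension if any."""
    ext = os.path.splitext(urlparse(url).path)[1]
    return f"{num}{ext or '.png'}"  # fall back to .png, as the code above does
```

Because `urljoin` already covers absolute and relative values, a separate check for an `http` prefix is not needed with this approach.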































  • This didn't work. I got the error "The method is not allowed for the requested URL." I also tried with the URL you provided.

    – Samrat Shrestha
    Nov 14 '18 at 9:54

  • Did you copy my code, or did you just enter the URL?

    – Dinko Pehar
    Nov 14 '18 at 10:02

  • I tried both, but neither worked.

    – Samrat Shrestha
    Nov 14 '18 at 10:03

  • I added methods to the Flask route decorators. Can you try that? By the way, the images folder must exist before you start downloading. It should be at the top level of the project.

    – Dinko Pehar
    Nov 14 '18 at 10:04

  • Yes, I have the images folder, but the process does not exit even with your code. I guess you are getting my problem.

    – Samrat Shrestha
    Nov 14 '18 at 10:10
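
On the behaviour described in these comments: `ic.run()` starts a development server that keeps serving requests until it is interrupted (for example with Ctrl+C), so the process is expected to stay alive after the crawl finishes. One possible way to stop it from code is to run the server in a background thread and shut it down once a "done" flag is set. The sketch below shows that pattern with the standard library's `wsgiref` server and a stand-in WSGI app. `crawl_done`, `app`, `start_server` and `stop_server` are hypothetical names, not part of Flask. Since a Flask instance is itself a WSGI callable, it could be passed in place of `app`.

```python
import threading
from wsgiref.simple_server import make_server

# Set by the request handler once the crawl has finished
crawl_done = threading.Event()


def app(environ, start_response):
    """Stand-in WSGI app; a Flask instance is also a WSGI callable."""
    if environ["PATH_INFO"] == "/get_images":
        # ... crawl and download the images here ...
        crawl_done.set()  # tell the main thread the work is finished
        body = b"crawl finished"
    else:
        body = b"index"
    headers = [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))]
    start_response("200 OK", headers)
    return [body]


def start_server(wsgi_app, host="127.0.0.1", port=5000):
    """Serve wsgi_app in a background thread so the caller stays in control."""
    server = make_server(host, port, wsgi_app)
    worker = threading.Thread(target=server.serve_forever, daemon=True)
    worker.start()
    return server, worker


def stop_server(server, worker):
    """Stop the serve_forever loop, wait for the thread, free the socket."""
    server.shutdown()  # must be called from a thread other than the worker
    worker.join()
    server.server_close()
```

In a script, the main thread would call `start_server(app)`, block on `crawl_done.wait()`, and then call `stop_server(server, worker)`, after which the process can exit normally. Werkzeug, which Flask is built on, offers a comparable `werkzeug.serving.make_server`, but the sketch stays with the standard library to remain self-contained.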













edited Nov 14 '18 at 10:04

answered Nov 14 '18 at 9:46
Dinko Pehar











