How to check if Tesseract has finished processing a file?
I am writing a program in VB.NET that OCRs dozens of *.jpg files.
The basic idea is to manually select one folder containing the jpg files and a second folder where the txt files that Tesseract outputs are stored.
As you know, Tesseract takes a few seconds (a little longer in my case, because my computer is not fast) to OCR a jpg file.
The problem is that I want to OCR the jpg files one by one, so I need to know when Tesseract has finished processing each file. As soon as I execute the CMD command with the arguments, Tesseract creates an empty txt file. But I have no idea how to check when Tesseract has finished processing the file so that the VB program can go on to process the next jpg.
I have thought about checking the length in bytes of the txt file: if it is not zero, the file has been processed by Tesseract.
At the moment I have a Do...Loop that processes each jpg file, with a nested Do...Loop that checks whether the txt file size is > 0 bytes. If it is not greater than zero bytes, it executes Thread.Sleep(5000).
Do Until myFileSize > 0
Thread.Sleep(5000)
Loop
The idea is to keep sleeping for as long as the txt file size is 0 bytes.
It's the only solution I know, but it doesn't seem to do what I am looking for.
Which technique would you use to solve this case?
vb.net winforms file-io visual-studio-2013 tesseract
1
It has finished when you get your result string.
– CruleD
Nov 15 '18 at 11:23
1
I'm not really sure, but I assume that since you launch Tesseract through the shell (command line), it starts another process, right? You might be able to check whether "Tesseract.exe" (or whatever is launched) is still running... But the best option would be to integrate it fully into your project.
– Gonzo345
Nov 15 '18 at 11:36
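One way to act on the comment above is to start Tesseract yourself and wait for that process to exit, instead of polling the txt file. The following is only a minimal sketch under assumptions: `tesseract.exe` is assumed to be on the PATH, the module and procedure names (`OcrRunner`, `RunTesseract`, `ProcessFolder`) are made up for illustration, and the command line follows the usual `tesseract imagename outputbase` form, where Tesseract appends `.txt` to the output base itself.

```vb
Imports System.Diagnostics
Imports System.IO

Module OcrRunner
    ' Runs Tesseract on a single image and blocks until the process exits.
    ' tesseractExe is an assumed executable name/path; adjust it for your install.
    Function RunTesseract(tesseractExe As String, imagePath As String, outputBase As String) As Integer
        Dim psi As New ProcessStartInfo() With {
            .FileName = tesseractExe,
            .Arguments = String.Format("""{0}"" ""{1}""", imagePath, outputBase),
            .UseShellExecute = False,
            .CreateNoWindow = True
        }
        Using proc As Process = Process.Start(psi)
            ' Returns only after Tesseract has finished (or failed).
            proc.WaitForExit()
            Return proc.ExitCode
        End Using
    End Function

    ' Processes every jpg in inputFolder, one at a time.
    Sub ProcessFolder(inputFolder As String, outputFolder As String)
        For Each jpg As String In Directory.GetFiles(inputFolder, "*.jpg")
            ' Output base has no extension; Tesseract adds ".txt".
            Dim outputBase As String = Path.Combine(outputFolder, Path.GetFileNameWithoutExtension(jpg))
            Dim exitCode As Integer = RunTesseract("tesseract.exe", jpg, outputBase)
            If exitCode <> 0 Then
                Console.WriteLine("Tesseract failed for " & jpg & " (exit code " & exitCode & ")")
            End If
        Next
    End Sub
End Module
```

Because `WaitForExit` blocks the calling thread, in a WinForms application it is probably better to call `ProcessFolder` from a background thread (for example a `BackgroundWorker`) so the UI stays responsive.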
asked Nov 15 '18 at 10:57
Richard Steele
387
1 Answer
Tesseract has a batch mode in which you provide a list of the files to be processed, and it processes each of them in turn.
Have a look here.
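As a rough illustration of the batch idea, the sketch below writes the jpg paths to a text file and passes that file to Tesseract as the input, so only one process has to be waited for. This rests on assumptions: that the installed Tesseract version accepts a text file of image paths in place of a single image, that `tesseract.exe` is on the PATH, and that a single combined output file is acceptable (as far as I know, list input produces one output rather than one txt per image). The names `OcrBatch`, `RunBatch` and `ocr_list.txt` are hypothetical.

```vb
Imports System.Diagnostics
Imports System.IO

Module OcrBatch
    ' Writes all jpg paths to a list file and runs Tesseract once over the list.
    Sub RunBatch(inputFolder As String, outputBase As String)
        ' Hypothetical temporary file holding one image path per line.
        Dim listFile As String = Path.Combine(Path.GetTempPath(), "ocr_list.txt")
        File.WriteAllLines(listFile, Directory.GetFiles(inputFolder, "*.jpg"))

        Dim psi As New ProcessStartInfo() With {
            .FileName = "tesseract.exe",
            .Arguments = String.Format("""{0}"" ""{1}""", listFile, outputBase),
            .UseShellExecute = False,
            .CreateNoWindow = True
        }
        Using proc As Process = Process.Start(psi)
            ' A single wait covers the whole batch.
            proc.WaitForExit()
        End Using
    End Sub
End Module
```

If one txt file per jpg is required, launching Tesseract once per image and waiting for each process to exit is likely the simpler route.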
answered Nov 16 '18 at 6:57
Rajat Paliwal
161