How to check if Tesseract has finished processing a file?



I am writing a program in VB.NET that OCRs dozens of *.jpg files.



The basic idea is to manually select a folder containing a bunch of jpg files and a second folder where the txt files that Tesseract outputs are stored.



As you know, Tesseract takes a few seconds (in my case a little more, because my computer is not fast) to process a jpg file and OCR it.



The problem is that I want to OCR the jpg files one by one, so I need to know when Tesseract has finished processing each file. As soon as I execute the CMD command with its arguments, Tesseract creates an empty txt file. But I have no idea how to check when Tesseract has finished processing the file, so that the VB program can go on to process the next jpg.



I have thought about checking the length in bytes of the txt file: if it's not zero, the file has been processed by Tesseract.



At the moment I have a Do...Loop that processes each of the jpg files, and a nested Do...Loop that checks whether the txt file size is > 0 bytes. If it is not greater than zero bytes, it executes Thread.Sleep(5000).



    Do Until myFileSize > 0
        Thread.Sleep(5000)
    Loop


The idea is to keep sleeping for as long as the txt file size is 0 bytes.



It's the only solution I know, but it doesn't seem to do what I'm looking for.



Which technique would you use to solve this case?










vb.net winforms file-io visual-studio-2013 tesseract






asked Nov 15 '18 at 10:57









Richard Steele

    It has finished when you get your result string.

    – CruleD
    Nov 15 '18 at 11:23






    I'm not really sure, but I assume since you launch Tesseract through shell (command line) that might start another process, right? You might be able to check if "Tesseract.exe" (or whatever is launched) is running... But best option would be integrating it as a whole in your project.

    – Gonzo345
    Nov 15 '18 at 11:36
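
Following up on the comment above: if Tesseract is started through System.Diagnostics.Process instead of a fire-and-forget shell command, the program can simply block until that process exits, so there is no need to poll the size of the txt file. Below is a minimal sketch of that idea, assuming tesseract.exe is on the PATH. The folder paths and the names OcrRunner, BuildArguments and RunTesseract are made up for illustration.

```vbnet
Imports System.Diagnostics
Imports System.IO

Module OcrRunner

    ' Builds the quoted argument string: "<image>" "<output base>".
    ' Tesseract appends ".txt" to the output base by itself.
    Function BuildArguments(imagePath As String, outputBase As String) As String
        Return String.Format("""{0}"" ""{1}""", imagePath, outputBase)
    End Function

    ' Runs Tesseract on one image and blocks until the process has exited.
    ' Returns the process exit code (0 conventionally means success).
    Function RunTesseract(tesseractExe As String, imagePath As String, outputBase As String) As Integer
        Dim psi As New ProcessStartInfo(tesseractExe, BuildArguments(imagePath, outputBase))
        psi.UseShellExecute = False   ' start the exe directly, not through the shell
        psi.CreateNoWindow = True     ' no console window per file

        Using proc As Process = Process.Start(psi)
            proc.WaitForExit()        ' returns only once Tesseract has finished
            Return proc.ExitCode
        End Using
    End Function

    Sub Main()
        ' Hypothetical locations; replace with the folders picked by the user.
        Dim tesseractExe As String = "tesseract.exe"   ' assumption: on the PATH
        Dim inputFolder As String = "C:\ocr\input"
        Dim outputFolder As String = "C:\ocr\output"

        For Each jpg As String In Directory.GetFiles(inputFolder, "*.jpg")
            Dim outputBase As String = Path.Combine(outputFolder, Path.GetFileNameWithoutExtension(jpg))
            Dim exitCode As Integer = RunTesseract(tesseractExe, jpg, outputBase)
            Console.WriteLine("{0}: exit code {1}", Path.GetFileName(jpg), exitCode)
        Next
    End Sub

End Module
```

Each jpg is only started after the previous Tesseract process has exited. In a WinForms app, calling WaitForExit on the UI thread freezes the form while it waits, so it may be worth running the loop from a BackgroundWorker or a separate task.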












1 Answer
Tesseract has a batch mode: you can provide a list of the files to be processed and it will process each of them in turn.
Have a look here.
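
As a rough sketch of the batch approach: my understanding is that the tesseract command line accepts a plain text file with one image path per line in place of a single image, and then writes the text of all listed images into a single output file. The paths and the names OcrBatch and WriteImageList are assumptions for illustration, and tesseract.exe is assumed to be on the PATH.

```vbnet
Imports System.Diagnostics
Imports System.IO

Module OcrBatch

    ' Writes one image path per line to listPath and returns how many images were listed.
    Function WriteImageList(inputFolder As String, listPath As String) As Integer
        Dim images As String() = Directory.GetFiles(inputFolder, "*.jpg")
        File.WriteAllLines(listPath, images)
        Return images.Length
    End Function

    Sub Main()
        ' Hypothetical locations; replace with the folders picked by the user.
        Dim inputFolder As String = "C:\ocr\input"
        Dim listPath As String = "C:\ocr\images.txt"
        Dim outputBase As String = "C:\ocr\output\all_pages"   ' Tesseract appends ".txt"

        Dim count As Integer = WriteImageList(inputFolder, listPath)
        Console.WriteLine("Listed {0} image(s)", count)

        ' One Tesseract run for the whole list instead of one run per jpg.
        Dim args As String = String.Format("""{0}"" ""{1}""", listPath, outputBase)
        Dim psi As New ProcessStartInfo("tesseract.exe", args)
        psi.UseShellExecute = False
        psi.CreateNoWindow = True

        Using proc As Process = Process.Start(psi)
            proc.WaitForExit()   ' the whole batch is done once the process exits
            Console.WriteLine("Tesseract exit code: {0}", proc.ExitCode)
        End Using
    End Sub

End Module
```

Because this produces one combined txt rather than one txt per jpg, it may not fit if a separate output file per image is required.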






answered Nov 16 '18 at 6:57

Rajat Paliwal




























