No module named 'resource' installing Apache Spark on Windows

6

I am trying to install Apache Spark to run locally on my Windows machine. I have followed all the instructions here: https://medium.com/@loldja/installing-apache-spark-pyspark-the-missing-quick-start-guide-for-windows-ad81702ba62d.



After this installation I am able to successfully start pyspark, and execute a command such as



textFile = sc.textFile("README.md")


When I then execute a command that operates on textFile, such as



textFile.first()


Spark gives me the error 'worker failed to connect back', and I can see an exception in the console coming from worker.py saying 'ModuleNotFoundError: No module named resource'. Looking at the source file, I can see that this Python file does indeed try to import the resource module; however, this module is not available on Windows systems. I understand that it is possible to install Spark on Windows, so how do I get around this?

python windows apache-spark






asked Nov 13 '18 at 2:42









Hayden







  • 3





    As below, a change introduced in Spark 2.4.0 breaks worker.py on Windows. For now, downgrading to 2.3.2 works. I have raised this as an issue here.

    – Hayden
    Nov 16 '18 at 1:53

2 Answers
12














I struggled the whole morning with the same problem. Your best bet is to downgrade to Spark 2.3.2.
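
If PySpark was installed via pip, pip install pyspark==2.3.2 performs the downgrade; with the prebuilt tarball, download the 2.3.2 package and repoint SPARK_HOME instead. A quick sanity check after switching (a minimal sketch, assuming a local master and the README.md from the question) might look like:

# Illustrative check, not part of the original answer: confirm which
# PySpark version is on the path, then re-run the failing operation.
import pyspark
print(pyspark.__version__)  # expect '2.3.2' after the downgrade

from pyspark import SparkContext

sc = SparkContext("local[*]", "resource-module-check")
textFile = sc.textFile("README.md")
print(textFile.first())  # fails with the 'resource' error on 2.4.0; works on 2.3.2
sc.stop()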

answered Nov 14 '18 at 8:00
Luv

  • 1





    Thank you very much! I didn't specifically need version 2.4.0, so this works for me.

    – Hayden
    Nov 15 '18 at 9:43











  • Happy to help :)

    – Luv
    Nov 17 '18 at 13:21






  • 2





    Thanks, this fixed the same issue for me.

    – Simon Peacock
    Dec 4 '18 at 0:21











  • I wish I could say the same. After I got the same error after rolling back to 2.3.2, I rolled Hadoop back to 2.7.7 as well, but alas, I still get the error.

    – M T
    Dec 10 '18 at 19:06











  • M T, also try commenting out 127.0.0.1 in your System32\drivers\etc\hosts file.

    – gargkshitiz
    Jan 16 at 11:02


















4














The fix can be found at https://github.com/apache/spark/pull/23055.



The resource module is only for Unix/Linux systems and is not applicable in a Windows environment. This fix is not yet included in the latest release, but you can modify worker.py in your installation as shown in the pull request. The changes to that file can be found at https://github.com/apache/spark/pull/23055/files.
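
The essence of the change is to make the import optional and to guard the code that uses it. A minimal sketch of that pattern (the exact edits are in the pull request; the helper below is hypothetical and only illustrates the guard):

has_resource_module = True
try:
    import resource  # Unix-only module; raises ImportError on Windows
except ImportError:
    has_resource_module = False

def set_worker_memory_limit(limit_bytes):
    # Hypothetical helper for illustration; worker.py inlines similar logic.
    if not has_resource_module:
        return  # no-op on platforms without rlimit support
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    if soft == resource.RLIM_INFINITY or limit_bytes < soft:
        resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, hard))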



You will have to re-zip the pyspark directory and move it to the lib folder in your Spark installation directory (where you extracted the pre-compiled package, per the tutorial you mentioned), as sketched below.
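
Something like the following rebuilds the archive (a sketch assuming the standard prebuilt layout, where the package lives under python/pyspark and the archive is python/lib/pyspark.zip; adjust the paths to your installation):

import os
import zipfile

spark_home = os.environ["SPARK_HOME"]
pkg_dir = os.path.join(spark_home, "python", "pyspark")
zip_path = os.path.join(spark_home, "python", "lib", "pyspark.zip")

with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
    for root, _dirs, files in os.walk(pkg_dir):
        for name in files:
            full = os.path.join(root, name)
            # keep entries rooted at 'pyspark/' so imports resolve from the zip
            arcname = os.path.relpath(full, os.path.join(spark_home, "python"))
            zf.write(full, arcname)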

answered Dec 26 '18 at 18:23
p1g1n