Unsupervised sentiment analysis using doc2vec





















Folks,



I have searched Google for papers, blogs, and tutorials but haven't found anything helpful, so I would appreciate any pointers. Please note that I am not asking for step-by-step code, but rather for an idea, blog post, paper, or tutorial.



Here's my problem statement:




Just as sentiment analysis is used to identify the positive or
negative tone of a sentence, I want to determine whether a sentence is
a forward-looking (future outlook) statement or not.




I do not want to use a bag-of-words approach that simply counts forward-looking words and phrases such as "going forward", "in near future", or "In 5 years from now". I am not sure whether word2vec or doc2vec can be used for this. Please enlighten me.



Thanks.































  • Why don't you want to use a bag-of-words technique based on words/phrases that appear in such statements? It might work well! Similarly, some approach using word2vec/doc2vec embeddings might prove helpful – you'd have to try it. What have you tried so far? What kind of training dataset do you have, or expect to be able to create?
    – gojomo
    Nov 10 at 0:54














nlp gensim word2vec sentiment-analysis doc2vec
















asked Nov 9 at 20:32









sgokhales
























1 Answer






























It seems what you are interested in doing is finding temporal statements in texts.



I'm not sure what your final output should be, but let's assume you want to find temporal phrases, or the sentences that contain them.



One methodology could be the following:



  1. Create a list of temporal terms [days, years, months, now, later]

  2. Keep only the sentences that contain these key terms

  3. Train a doc2vec model on those sentences

  4. Infer a vector for each new sentence and apply a distance metric

    • GMM cluster + limit

    • Distance from the average vector
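As a rough sketch of steps 1, 2 and 4 (the "distance from average" variant), something like the following could work. Note the assumptions: `toy_embed` is a hypothetical hashed bag-of-words stand-in so the example runs with the standard library only; in practice you would swap in a vector inferred by your trained doc2vec model. The corpus sentences and the helper names are made up for illustration.

```python
import math
import re
import zlib

# Step 1: seed list of temporal terms (taken from the list above).
TEMPORAL_TERMS = {"days", "years", "months", "now", "later"}


def tokenize(sentence):
    """Lowercase the sentence and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def has_temporal_term(sentence):
    """Step 2: keep a sentence only if it contains a seed term."""
    return any(tok in TEMPORAL_TERMS for tok in tokenize(sentence))


def toy_embed(sentence, dim=64):
    """ASSUMPTION: placeholder for a doc2vec inferred vector.

    Hashes each token into one of `dim` buckets and unit-normalizes.
    """
    vec = [0.0] * dim
    for tok in tokenize(sentence):
        vec[zlib.crc32(tok.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def centroid(vectors):
    """Component-wise average of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


# Made-up example corpus.
corpus = [
    "We expect revenue to grow over the next five years.",
    "Margins should improve in the coming months.",
    "The company was founded in Ohio.",
    "Revenue fell last quarter.",
    "We will revisit pricing later this year.",
]

seed_sentences = [s for s in corpus if has_temporal_term(s)]
average = centroid([toy_embed(s) for s in seed_sentences])


def temporal_score(sentence, embed=toy_embed):
    """Step 4: similarity of a new sentence to the average seed vector."""
    return cosine(embed(sentence), average)
```

You would then pick a cut-off on `temporal_score` by inspecting scores on sentences you have judged by hand; the right threshold depends entirely on your data and on the real embedding you plug in.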


Another methodology could be:



  1. Create a list of temporal terms [days, years, months, now, later]

  2. Extract bigram and trigram collocations

  3. Keep the collocations that contain temporal terms

  4. Use those collocations in a kind of bag-of-collocations approach

    • Build binary feature vectors marking which relevant collocations each sentence matches

    • Train a classifier to recognise higher-level text
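A minimal sketch of steps 1 to 4 up to the feature vectors might look like this. It assumes a plain frequency cut-off (`min_count`, a hypothetical parameter) in place of a proper collocation score such as PMI, and the example sentences are made up. The resulting 0/1 vectors could be fed to whatever classifier you prefer.

```python
import re
from collections import Counter

# Step 1: seed list of temporal terms (taken from the list above).
TEMPORAL_TERMS = {"days", "years", "months", "now", "later"}


def tokenize(sentence):
    """Lowercase the sentence and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def ngrams(tokens, n):
    """All contiguous n-token tuples in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def relevant_collocations(sentences, min_count=2):
    """Steps 2 and 3: frequent bigrams/trigrams containing a temporal term.

    ASSUMPTION: raw frequency stands in for a real collocation measure.
    """
    counts = Counter()
    for sentence in sentences:
        tokens = tokenize(sentence)
        for n in (2, 3):
            counts.update(ngrams(tokens, n))
    kept = [
        gram for gram, count in counts.items()
        if count >= min_count and TEMPORAL_TERMS & set(gram)
    ]
    return sorted(kept)


def binary_features(sentence, collocations):
    """Step 4: one 0/1 entry per collocation, 1 if the sentence contains it."""
    tokens = tokenize(sentence)
    present = set(ngrams(tokens, 2)) | set(ngrams(tokens, 3))
    return [1 if gram in present else 0 for gram in collocations]


# Made-up example corpus.
corpus = [
    "In the coming years we expect growth.",
    "Demand should rise in the coming years.",
    "Sales were flat last quarter.",
]

collocations = relevant_collocations(corpus)
# collocations == [("coming", "years"), ("the", "coming", "years")]

features = binary_features("Profits will double in the coming years", collocations)
# features == [1, 1]
```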


This sounds like a good case for a bootstrapping approach if you have a large amount of text.



Both are really semi-supervised, since the initial temporal terms have to be found somehow, but even that could be automated using a word2vec scheme and bootstrapping.
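To illustrate the bootstrapping idea for growing the term list, here is a small sketch. It assumes you have some `neighbors(term)` function that returns `(candidate, similarity)` pairs; with a trained word2vec model that would be a nearest-neighbour lookup. The `TOY_NEIGHBORS` table below is hand-written, hypothetical data used only so the loop can run on its own, and `threshold` and `max_rounds` are illustrative parameters.

```python
def bootstrap_terms(seeds, neighbors, threshold=0.6, max_rounds=3):
    """Expand a seed set by repeatedly adding sufficiently similar neighbours.

    Each round looks up neighbours only for the terms added in the
    previous round, and stops early when nothing new is found.
    """
    terms = set(seeds)
    frontier = set(seeds)
    for _ in range(max_rounds):
        new_terms = set()
        for term in frontier:
            for candidate, score in neighbors(term):
                if score >= threshold and candidate not in terms:
                    new_terms.add(candidate)
        if not new_terms:
            break
        terms |= new_terms
        frontier = new_terms
    return terms


# ASSUMPTION: hypothetical similarity table standing in for a word2vec model.
TOY_NEIGHBORS = {
    "years": [("months", 0.9), ("decades", 0.8), ("dollars", 0.3)],
    "months": [("weeks", 0.85), ("years", 0.9)],
    "weeks": [("days", 0.8)],
    "decades": [("centuries", 0.7)],
}


def toy_neighbors(term):
    """Look up neighbours in the toy table; unknown terms have none."""
    return TOY_NEIGHBORS.get(term, [])


expanded = bootstrap_terms({"years"}, toy_neighbors)
```

In a real run you would want to review the added terms after each round, since one bad neighbour can pull in a whole family of unrelated words.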
















        answered Nov 10 at 7:39









        Nathan McCoy




























             
