Unsupervised sentiment analysis using doc2vec
Folks,
I have searched Google for different types of papers/blogs/tutorials but haven't found anything helpful. I would appreciate it if anyone could help me. Please note that I am not asking for step-by-step code, but rather an idea, blog, paper, or tutorial.
Here's my problem statement:
Just like sentiment analysis is used to identify the positive or negative tone of a sentence, I want to find out whether a sentence is a forward-looking (future-outlook) statement or not.
I do not want to use a bag-of-words approach that simply counts forward-looking words/phrases such as "going forward", "in near future" or "In 5 years from now". I am not sure whether word2vec or doc2vec can be used here. Please enlighten me.
Thanks.
nlp gensim word2vec sentiment-analysis doc2vec
Why don't you want to use a bag-of-words technique based on words/phrases that appear in such statements? It might work well! Similarly, some approach using word2vec/doc2vec embeddings might prove helpful – you'd have to try it. What have you tried so far? What kind of training dataset do you have, or expect to be able to create?
– gojomo
Nov 10 at 0:54
asked Nov 9 at 20:32
sgokhales
1 Answer
It seems what you are interested in is finding temporal statements in text.
Not sure of your final output, but let's assume you want to find temporal phrases, or the sentences which contain them.
One methodology could be the following:
- Create a list of temporal terms [days, years, months, now, later]
- Pick only the sentences that contain those key terms
- Train a doc2vec model on those sentences
- Infer a vector for each new sentence and score it with a distance metric, either
- GMM clustering plus a cutoff, or
- distance from the average vector
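The first pipeline could be sketched roughly as below. Everything here is a toy illustration: the term list, the corpus, and the count-vector `embed` function (a stand-in for an inferred doc2vec vector) are all made up. In practice you would train gensim's `Doc2Vec` on the filtered sentences and use `infer_vector` instead of `embed`.

```python
import math

# Step 1: seed list of temporal terms (hypothetical examples).
TEMPORAL_TERMS = {"days", "years", "months", "now", "later", "future", "forward"}

def has_temporal_term(sentence):
    """Step 2: keep only sentences containing a seed term."""
    return bool(TEMPORAL_TERMS & set(sentence.lower().split()))

def embed(sentence, vocab):
    """Stand-in for a doc2vec inferred vector: a plain count vector over a fixed vocab."""
    words = sentence.lower().split()
    return [words.count(w) for w in vocab]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

corpus = [
    "Revenue will double in five years",
    "The office cat sleeps all day",
    "We expect strong growth going forward",
    "Quarterly results were flat",
]

# Steps 2-3: filter the corpus, then "train" on the kept sentences.
kept = [s for s in corpus if has_temporal_term(s)]
vocab = sorted({w for s in kept for w in s.lower().split()})
vectors = [embed(s, vocab) for s in kept]

# Steps 4-6: average the kept vectors; score a new sentence by its
# similarity to that centroid (the "distance from average" variant).
centroid = [sum(col) / len(vectors) for col in zip(*vectors)]
new_vec = embed("In the near future we plan to expand", vocab)
score = cosine(new_vec, centroid)
```

A new sentence scoring close to the centroid would be flagged as temporal; the GMM variant would fit a mixture model on the vectors and threshold the component likelihoods instead of using a single centroid.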
Another methodology could be:
- Create a list of temporal terms [days, years, months, now, later]
- Do bigram and trigram collocation extraction
- Keep the collocations that contain temporal terms
- Use those collocations in a kind of bag-of-collocations approach: build binary feature vectors that mark which relevant collocations a sentence contains
- Train a classifier on those vectors to recognise the higher-level text
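The second pipeline might look like the sketch below. The corpus and term list are invented, and raw bigram counts stand in for a proper collocation measure (in practice you would use something like NLTK's collocation finders with a PMI or chi-squared filter, and add trigrams):

```python
from collections import Counter
from itertools import chain

# Hypothetical seed terms and corpus.
TEMPORAL_TERMS = {"forward", "future", "years", "now", "later"}

def bigrams(tokens):
    return list(zip(tokens, tokens[1:]))

corpus = [
    "going forward we will invest in research",
    "in the near future margins should improve",
    "results were strong this quarter",
    "five years from now the market will change",
]

tokenized = [s.lower().split() for s in corpus]

# Collocation extraction: count all bigrams, then keep only those
# that include a temporal term.
counts = Counter(chain.from_iterable(bigrams(t) for t in tokenized))
relevant = sorted(bg for bg in counts if TEMPORAL_TERMS & set(bg))

def features(sentence):
    """Binary bag-of-collocations vector: 1 if the collocation appears."""
    bg = set(bigrams(sentence.lower().split()))
    return [1 if c in bg else 0 for c in relevant]

# These vectors would then feed a standard classifier
# (e.g. logistic regression) trained on labelled sentences.
X = [features(s) for s in corpus]
```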
This sounds like a good case for a bootstrapping approach if you have large amounts of text.
Both are really semi-supervised, since there is some need to find initial temporal terms, but even that could be automated using a word2vec scheme and bootstrapping.
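That bootstrapping step might look like this sketch. The corpus, seed term, and similarity threshold are all hypothetical, and a cosine over raw co-occurrence counts stands in for word2vec similarity; with a real corpus you would query a trained word2vec model's `most_similar` instead:

```python
import math
from collections import Counter, defaultdict

# Toy corpus; any real application would use a large text collection.
corpus = [
    "profits will grow next year",
    "profits will grow next quarter",
    "sales may fall next year",
    "the cat sat on the mat",
    "the dog sat on the mat",
]

# Build word -> context co-occurrence counts within each sentence.
cooc = defaultdict(Counter)
for sent in corpus:
    words = sent.lower().split()
    for i, w in enumerate(words):
        for j, c in enumerate(words):
            if i != j:
                cooc[w][c] += 1

def similarity(a, b):
    """Cosine over co-occurrence vectors -- a stand-in for word2vec similarity."""
    va, vb = cooc[a], cooc[b]
    dot = sum(va[w] * vb[w] for w in set(va) & set(vb))
    na = math.sqrt(sum(v * v for v in va.values()))
    nb = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def bootstrap(seeds, vocab, threshold=0.7):
    """One bootstrapping round: add words distributionally close to any seed."""
    expanded = set(seeds)
    for w in vocab:
        if w not in expanded and any(similarity(w, s) >= threshold for s in seeds):
            expanded.add(w)
    return expanded

# Starting from a single seed, "quarter" is picked up because it occurs
# in the same contexts as "year"; unrelated words are not.
expanded = bootstrap({"year"}, set(cooc))
```

Repeating the round with the expanded set as the new seeds grows the term list further; the threshold controls how aggressively it over-generates.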
answered Nov 10 at 7:39
Nathan McCoy