2 years ago
#62637

Samet Turgut
Object 'pos.words' not found error while doing Sentiment Analysis in R
I am working on a text mining project using tweets collected from Twitter. The first part is done; the second part is sentiment analysis. I don't really know how to do sentiment analysis, so I found some code on the internet. It was written for English positive and negative word lists, but I have my own Turkish positive and negative word lists and want to use those instead. tweet_clean is my cleaned tweet dataset. Apart from those changes I didn't modify the code, but it doesn't work. I'm getting this error:
Error in FUN(X[[i]], ...) : object 'pos.words' not found
after running:
analysis <- score.sentiment(tweet_clean, pos.words, neg.words)
The code that follows this line doesn't work either.
Could you please explain what's wrong with this code in the simplest possible terms?
positive <- scan('C:/Users/samet/Desktop/twitter_steam_text_mining/pozitifkelimeler.txt', what = 'character', comment.char = ';', skipNul = TRUE)
negative <- scan('C:/Users/samet/Desktop/twitter_steam_text_mining/negatifkelimeler.txt', what = 'character', comment.char = ';', skipNul = TRUE)
score.sentiment <- function(sentences, pos.words, neg.words, .progress='none')
{
  require(plyr)
  require(stringr)
  scores <- laply(sentences, function(sentence, pos.words, neg.words)
  {
    # clean up sentences with R's regex-driven global substitute, gsub() function:
    sentence <- gsub('https://','',sentence)
    sentence <- gsub('http://','',sentence)
    sentence <- gsub('[^[:graph:]]', ' ',sentence)
    sentence <- gsub('[[:punct:]]', '', sentence)
    sentence <- gsub('[[:cntrl:]]', '', sentence)
    sentence <- gsub('\\d+', '', sentence)
    sentence <- str_replace_all(sentence,"[^[:graph:]]", " ")
    # and convert to lower case:
    sentence <- tolower(sentence)
    # split into words. str_split is in the stringr package
    word.list <- str_split(sentence, '\\s+')
    # sometimes a list() is one level of hierarchy too much
    words <- unlist(word.list)
    # compare our words to the dictionaries of positive & negative terms
    pos.matches <- match(words, pos.words)
    neg.matches <- match(words, neg.words)
    # match() returns the position of the matched term or NA
    # we just want a TRUE/FALSE:
    pos.matches <- !is.na(pos.matches)
    neg.matches <- !is.na(neg.matches)
    # TRUE/FALSE will be treated as 1/0 by sum():
    score <- sum(pos.matches) - sum(neg.matches)
    return(score)
  }, pos.words, neg.words, .progress=.progress )
  scores.df <- data.frame(score=scores, text=sentences)
  return(scores.df)
}
analysis <- score.sentiment(tweet_clean, pos.words, neg.words)
# sentiment score frequency table
table(analysis$score)
analysis %>%
  ggplot(aes(x=score)) +
  geom_histogram(binwidth = 1, fill = "lightblue") +
  ylab("Frequency") +
  xlab("sentiment score") +
  ggtitle("Distribution of Sentiment scores of the tweets") +
  ggeasy::easy_center_title()
neutral <- length(which(analysis$score == 0))
positive <- length(which(analysis$score > 0))
negative <- length(which(analysis$score < 0))
Sentiment <- c("Positive","Neutral","Negative")
Count <- c(positive,neutral,negative)
output <- data.frame(Sentiment,Count)
output$Sentiment<-factor(output$Sentiment,levels=Sentiment)
ggplot(output, aes(x=Sentiment,y=Count)) +
  geom_bar(stat = "identity", aes(fill = Sentiment)) +
  ggtitle("Barplot of Sentiment type of 4000 tweets")
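For reference, here is a minimal sketch of how the scoring call is wired. The error message says R could not find an object called pos.words in the workspace. Inside the function definition, pos.words and neg.words are only parameter names, while the word lists loaded with scan() above are stored as positive and negative. The sketch assumes those loaded vectors are what the call is meant to receive. The word lists, the tweets, and the simplified score_one helper are hypothetical stand-ins, not the original function or data.

```r
# Hypothetical stand-ins for the real word lists and cleaned tweets.
pos_list <- c("harika", "iyi")
neg_list <- c("berbat", "sorun")
tweets   <- c("harika bir oyun", "berbat ve sorun dolu", "bir oyun")

# Simplified base-R scorer (an assumption for illustration only):
# count dictionary hits per sentence, positive minus negative.
score_one <- function(sentence, pos.words, neg.words) {
  words <- unlist(strsplit(tolower(sentence), "\\s+"))
  sum(!is.na(match(words, pos.words))) - sum(!is.na(match(words, neg.words)))
}

# The arguments passed in must be objects that exist in the workspace;
# the parameter names pos.words / neg.words live only inside the function.
scores <- vapply(tweets, score_one, integer(1),
                 pos.words = pos_list, neg.words = neg_list,
                 USE.NAMES = FALSE)
print(scores)
```

By the same reasoning, the original call would presumably become score.sentiment(tweet_clean, positive, negative). Note also that the later lines reassign positive and negative to counts, so the word lists are overwritten once those lines have run.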
r
text-mining
sentiment-analysis
0 Answers