2 years ago

#62637

Samet Turgut

Object 'pos.words' not found error while doing Sentiment Analysis in R

I am working on a text mining project using tweets collected from Twitter. The first part is done, and the second part is sentiment analysis. I don't really know how to do sentiment analysis, so I found some code on the internet. The original code used English positive and negative word lists, but I have my own Turkish positive and negative word lists and want to use those instead. tweet_clean is my cleaned tweet dataset. Apart from pointing the code at my own word files and dataset, I made no changes, but it doesn't work. I get this error:

Error in FUN(X[[i]], ...) : object 'pos.words' not found

after running:

analysis <- score.sentiment(tweet_clean, pos.words, neg.words)

The code that follows that line doesn't work either.

Could you please tell me what's wrong with this code, explained in the simplest possible terms?

positive <- scan('C:/Users/samet/Desktop/twitter_steam_text_mining/pozitifkelimeler.txt', what = 'character', comment.char = ';', skipNul = TRUE)
negative <- scan('C:/Users/samet/Desktop/twitter_steam_text_mining/negatifkelimeler.txt', what = 'character', comment.char = ';', skipNul = TRUE)


score.sentiment <- function(sentences, pos.words, neg.words, .progress = 'none') {
  require(plyr)
  require(stringr)
  
  scores <- laply(sentences, function(sentence, pos.words, neg.words) {
    
    # clean up sentences with R's regex-driven global substitute, gsub() function:
    sentence <- gsub('https://','',sentence)
    sentence <- gsub('http://','',sentence)
    sentence <- gsub('[^[:graph:]]', ' ',sentence)
    sentence <- gsub('[[:punct:]]', '', sentence)
    sentence <- gsub('[[:cntrl:]]', '', sentence)
    sentence <- gsub('\\d+', '', sentence)
    sentence <- str_replace_all(sentence,"[^[:graph:]]", " ")
    # and convert to lower case:
    sentence <- tolower(sentence)
    
    # split into words. str_split is in the stringr package
    word.list <- str_split(sentence, '\\s+')
    # sometimes a list() is one level of hierarchy too much
    words <- unlist(word.list)
    
    # compare our words to the dictionaries of positive & negative terms
    pos.matches <- match(words, pos.words)
    neg.matches <- match(words, neg.words)
    
    # match() returns the position of the matched term or NA
    # we just want a TRUE/FALSE:
    pos.matches <- !is.na(pos.matches)
    neg.matches <- !is.na(neg.matches)
    
    # TRUE/FALSE will be treated as 1/0 by sum():
    score <- sum(pos.matches) - sum(neg.matches)
    
    return(score)
  }, pos.words, neg.words, .progress=.progress )
  
  scores.df <- data.frame(score=scores, text=sentences)
  return(scores.df)
}

analysis <- score.sentiment(tweet_clean, pos.words, neg.words)
# sentiment score frequency table
table(analysis$score)


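library(magrittr)  # provides the %>% pipe
library(ggplot2)   # provides ggplot(), aes(), geom_histogram(), geom_bar()
analysis %>%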
  ggplot(aes(x=score)) + 
  geom_histogram(binwidth = 1, fill = "lightblue")+ 
  ylab("Frequency") + 
  xlab("sentiment score") +
  ggtitle("Distribution of Sentiment scores of the tweets") +
  ggeasy::easy_center_title()


neutral <- length(which(analysis$score == 0))
positive <- length(which(analysis$score > 0))
negative <- length(which(analysis$score < 0))
Sentiment <- c("Positive","Neutral","Negative")
Count <- c(positive,neutral,negative)
output <- data.frame(Sentiment,Count)
output$Sentiment<-factor(output$Sentiment,levels=Sentiment)
ggplot(output, aes(x=Sentiment,y=Count))+
  geom_bar(stat = "identity", aes(fill = Sentiment))+
  ggtitle("Barplot of Sentiment type of 4000 tweets")
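
For reference, the error can be reproduced in isolation. The script above stores the word lists in objects named positive and negative, while the call passes pos.words and neg.words, and nothing shown here ever creates objects with those names. The sketch below assumes that no other code in the session defines them. The word vectors and tweets are made-up stand-ins for the real files, and the scorer is a simplified base R version (vapply instead of plyr's laply), not the original function.

```r
# Hypothetical stand-ins for the scanned word files and the cleaned tweets.
positive <- c("iyi", "harika")
negative <- c("berbat", "bozuk")
tweet_clean <- c("bu oyun harika", "sunucular yine bozuk berbat", "yeni bir oyun")

# Simplified scorer: base R only, same match()-based counting idea as the original.
score.sentiment <- function(sentences, pos.words, neg.words) {
  scores <- vapply(sentences, function(sentence) {
    words <- unlist(strsplit(tolower(sentence), "\\s+"))
    sum(!is.na(match(words, pos.words))) - sum(!is.na(match(words, neg.words)))
  }, integer(1), USE.NAMES = FALSE)
  data.frame(score = scores, text = sentences, stringsAsFactors = FALSE)
}

# Reproduces the failure: no object called pos.words exists in the workspace.
# R evaluates arguments lazily, so the error only surfaces inside the inner
# function, at the first point where pos.words is actually used.
failed <- tryCatch(
  score.sentiment(tweet_clean, pos.words, neg.words),
  error = function(e) conditionMessage(e)
)
print(failed)

# Passing the objects that were actually created succeeds.
analysis <- score.sentiment(tweet_clean, positive, negative)
print(table(analysis$score))
```

In the original script the equivalent change would be to call score.sentiment(tweet_clean, positive, negative), or to name the scanned vectors pos.words and neg.words. One further thing to watch: later in the script, positive and negative are reassigned to counts, so rerunning the scoring call after that point would pass numbers instead of word lists. Giving the counts different names (for example n_positive and n_negative, names chosen here only for illustration) would avoid that.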

r

text-mining

sentiment-analysis

0 Answers
