2 years ago

#67095

rcr_aagr

Apply a function to every row in a dataset using elements of a second dataset in R

I have two datasets. The first one, called activos_full, has this structure:

"usuario","name"
"BotInventory","InventoryBot (PC parts, Ryzen, RTX, console, toys)"
"BotConsoles","Inventory Bot Consoles (PS5, Xbox, Switch)"
"CopWeGo","CopWeGo"
"bmurphypointman","Brett Murphy"
"ThinGreenLine","The Thin Green Line"
"BlutrauschSSB","Blutroast"
"imas96","imas69"
"nintendo_hall","Nintendo_Hall"
"Orchids_School","Orchids International School"
"SmurfnGear","Smurf Gear"

The second one, called timelines, looks like this (just a small sample):

"screen_name","reply_to_status_id","is_retweet","favorite_count","retweet_count","reply_count"
"BotInventory",NA,FALSE,0,0,NA
"BotInventory",NA,FALSE,0,0,NA
"BotInventory",NA,FALSE,0,0,NA
"BotInventory",NA,FALSE,0,0,NA
"BotInventory",NA,FALSE,0,0,NA
"BotInventory",NA,FALSE,0,0,NA
"BotInventory",NA,FALSE,0,0,NA
"BotConsoles",NA,FALSE,0,0,NA
"BotConsoles",NA,FALSE,0,0,NA
"BotConsoles",NA,FALSE,0,0,NA
"BotConsoles",NA,FALSE,0,0,NA
"BotConsoles",NA,FALSE,0,0,NA
"BotConsoles",NA,FALSE,0,0,NA
"BotConsoles",NA,FALSE,0,0,NA
"BotConsoles",NA,FALSE,0,0,NA
"BotConsoles",NA,FALSE,0,0,NA
"BotConsoles",NA,FALSE,0,0,NA
"BotConsoles",NA,FALSE,1,0,NA
"BotConsoles",NA,FALSE,0,0,NA
"bmurphypointman",NA,FALSE,0,0,NA
"bmurphypointman",NA,FALSE,0,1,NA
"bmurphypointman",NA,FALSE,0,0,NA
"bmurphypointman",NA,FALSE,0,0,NA
"bmurphypointman",NA,FALSE,0,1,NA
"bmurphypointman",NA,FALSE,1,1,NA
"bmurphypointman",NA,FALSE,0,1,NA
"bmurphypointman",NA,FALSE,0,1,NA
"bmurphypointman",NA,FALSE,0,0,NA
"bmurphypointman",NA,FALSE,0,0,NA
"bmurphypointman",NA,FALSE,0,0,NA
"bmurphypointman",NA,FALSE,0,0,NA
"bmurphypointman",NA,FALSE,0,0,NA

The idea is to go row by row through the first dataset, run a function that performs some calculations on that user's rows in the second dataset, and save the results as a list in a new column of the first dataset.
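
A minimal sketch of that row-by-row pattern, assuming base R only and toy data shaped like the samples above. The helper name `count_tweets` is hypothetical; it only counts rows per user so that the list-column mechanics are visible.

```r
# Toy data shaped like the samples above (values are illustrative)
activos_full <- data.frame(
  usuario = c("BotInventory", "BotConsoles", "CopWeGo"),
  name = c("InventoryBot", "Inventory Bot Consoles", "CopWeGo"),
  stringsAsFactors = FALSE
)
timelines <- data.frame(
  screen_name = c("BotInventory", "BotInventory", "BotConsoles"),
  is_retweet = c(FALSE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Hypothetical helper: summarise the timeline rows of one user
count_tweets <- function(user, timelines) {
  rows <- timelines[timelines$screen_name == user, , drop = FALSE]
  data.frame(tweets = nrow(rows), retweets = sum(rows$is_retweet))
}

# lapply() returns one element per user; assigning that list to a
# column creates a list column, so each row holds its own data frame
activos_full$results <- lapply(activos_full$usuario, count_tweets,
                               timelines = timelines)

# Inspect the summary stored for the first user
activos_full$results[[1]]
```

Looping over the `usuario` column with `lapply()` keeps one result per row of `activos_full`, including users such as `CopWeGo` that have no rows in `timelines`.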

The function I use for the calculation is:

tweetstype <- function(timelines)
{

  tweet_df <- timelines[activos_full$screen_name, ]
  # Remove retweets
  tweet_df_organic <- tweet_df[tweet_df$is_retweet==FALSE, ]
  # Remove replies
  tweet_df_organic <- subset(tweet_df_organic, is.na(tweet_df_organic$reply_to_status_id))
  
  tweet_df_organic <- tweet_df_organic %>% arrange(-favorite_count)
  
  tweet_df_organic <- tweet_df_organic %>% arrange(-retweet_count)
  
  
  # Keeping only the retweets
  retweet_df <- tweet_df[tweet_df$is_retweet==TRUE,]
  # Keeping only the replies
  replies <- subset(tweet_df, !is.na(tweet_df$reply_to_status_id))
  
  # Creating a data frame
  data <- data.frame(
    category=c("Organic", "Retweet_df", "Replies"),
    count=c(nrow(tweet_df_organic), nrow(retweet_df), nrow(replies))
  )
  
  
  # Adding columns 
  data$fraction = data$count / sum(data$count)
  data$percentage = data$count / sum(data$count) * 100
  # data$ymax = cumsum(data$fraction)
  # data$ymin = c(0, head(data$ymax, n=-1))
  
  # Rounding the data to two decimal points
  data <- round_df(data, 2)
  return(list(data))
  
}

I pass the second dataset as the function's argument, and I tried to call it like this:

activos_full$results <- apply(activos_full,1,tweetstype(timelines))

But I get this error even though the function has already been defined:

Error in match.fun(FUN) : 
  'tweetstype(timelines)' is not a function, character or symbol
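
The error most likely comes from how the third argument is written: `apply()` expects a function there, but `tweetstype(timelines)` is a call, so what reaches `apply()` is the list that the call returns, not the function itself. A hedged sketch of one way around it, under these assumptions: the summary is computed per user by matching `usuario` against `screen_name`, base `round()` stands in for `round_df()` (which is not part of base R), and `tweetstype_user` is a hypothetical per-user variant of the function above.

```r
# Toy data shaped like the samples in the question (illustrative values)
activos_full <- data.frame(
  usuario = c("BotInventory", "bmurphypointman"),
  stringsAsFactors = FALSE
)
timelines <- data.frame(
  screen_name = c("BotInventory", "BotInventory", "BotInventory",
                  "bmurphypointman"),
  reply_to_status_id = c(NA, "123", NA, NA),
  is_retweet = c(FALSE, FALSE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Hypothetical per-user variant: the user is the first argument,
# so lapply() can supply it one row at a time
tweetstype_user <- function(user, timelines) {
  tweet_df <- timelines[timelines$screen_name == user, , drop = FALSE]
  is_reply <- !is.na(tweet_df$reply_to_status_id)
  counts <- c(
    Organic = sum(!tweet_df$is_retweet & !is_reply),
    Retweets = sum(tweet_df$is_retweet),
    Replies = sum(is_reply)
  )
  data.frame(
    category = names(counts),
    count = unname(counts),
    percentage = unname(round(counts / sum(counts) * 100, 2)),
    row.names = NULL
  )
}

# Pass the function itself (no parentheses); extra arguments follow it
activos_full$results <- lapply(activos_full$usuario, tweetstype_user,
                               timelines = timelines)
```

If `apply()` is preferred, the same idea can be written with an anonymous function, for example `apply(activos_full, 1, function(row) tweetstype_user(row[["usuario"]], timelines))`. Note that a user with no rows in `timelines` would get `NaN` percentages in this sketch, because the counts sum to zero.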

r

function

dataset

combinations

apply

0 Answers
