2 years ago
#67095
rcr_aagr
Apply a function to every row in a dataset using elements of a second dataset in R
I have two datasets. The first one, called activos_full, has this structure:
"usuario","name"
"BotInventory","InventoryBot (PC parts, Ryzen, RTX, console, toys)"
"BotConsoles","Inventory Bot Consoles (PS5, Xbox, Switch)"
"CopWeGo","CopWeGo"
"bmurphypointman","Brett Murphy"
"ThinGreenLine","The Thin Green Line"
"BlutrauschSSB","Blutroast"
"imas96","imas69"
"nintendo_hall","Nintendo_Hall"
"Orchids_School","Orchids International School"
"SmurfnGear","Smurf Gear"
The second one, called timelines, looks like this (this is just a small sample):
"screen_name","reply_to_status_id","is_retweet","favorite_count","retweet_count","reply_count"
"BotInventory",NA,FALSE,0,0,NA
"BotInventory",NA,FALSE,0,0,NA
"BotInventory",NA,FALSE,0,0,NA
"BotInventory",NA,FALSE,0,0,NA
"BotInventory",NA,FALSE,0,0,NA
"BotInventory",NA,FALSE,0,0,NA
"BotInventory",NA,FALSE,0,0,NA
"BotConsoles",NA,FALSE,0,0,NA
"BotConsoles",NA,FALSE,0,0,NA
"BotConsoles",NA,FALSE,0,0,NA
"BotConsoles",NA,FALSE,0,0,NA
"BotConsoles",NA,FALSE,0,0,NA
"BotConsoles",NA,FALSE,0,0,NA
"BotConsoles",NA,FALSE,0,0,NA
"BotConsoles",NA,FALSE,0,0,NA
"BotConsoles",NA,FALSE,0,0,NA
"BotConsoles",NA,FALSE,0,0,NA
"BotConsoles",NA,FALSE,1,0,NA
"BotConsoles",NA,FALSE,0,0,NA
"bmurphypointman",NA,FALSE,0,0,NA
"bmurphypointman",NA,FALSE,0,1,NA
"bmurphypointman",NA,FALSE,0,0,NA
"bmurphypointman",NA,FALSE,0,0,NA
"bmurphypointman",NA,FALSE,0,1,NA
"bmurphypointman",NA,FALSE,1,1,NA
"bmurphypointman",NA,FALSE,0,1,NA
"bmurphypointman",NA,FALSE,0,1,NA
"bmurphypointman",NA,FALSE,0,0,NA
"bmurphypointman",NA,FALSE,0,0,NA
"bmurphypointman",NA,FALSE,0,0,NA
"bmurphypointman",NA,FALSE,0,0,NA
"bmurphypointman",NA,FALSE,0,0,NA
The idea is to go through the first dataset row by row, run a function that makes some calculations for each row, and save the results as a list in a new column of the first dataset.
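As a rough sketch of that idea in base R: `lapply()` over the user column returns one result per row, and assigning that list to a new column gives a list-column. The tiny data frames and the `count_rows()` helper below are made up for illustration and are not the real datasets.

```r
# Hypothetical miniature versions of the two datasets
users <- data.frame(usuario = c("a", "b"), stringsAsFactors = FALSE)
tweets <- data.frame(
  screen_name = c("a", "a", "b"),
  is_retweet  = c(FALSE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Hypothetical helper: summarise the rows of `tl` that belong to one user
count_rows <- function(user, tl) {
  sub <- tl[tl$screen_name == user, , drop = FALSE]
  data.frame(n = nrow(sub), retweets = sum(sub$is_retweet))
}

# One list element per row of `users`, stored as a list-column
users$results <- lapply(users$usuario, count_rows, tl = tweets)
```

Each element of `users$results` is then a small data frame that can be pulled out with `users$results[[i]]`.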
The function I use for the calculation is:
library(dplyr)  # provides %>% and arrange()

tweetstype <- function(timelines)
{
  tweet_df <- timelines[activos_full$screen_name, ]
  # Remove retweets
  tweet_df_organic <- tweet_df[tweet_df$is_retweet == FALSE, ]
  # Remove replies
  tweet_df_organic <- subset(tweet_df_organic, is.na(tweet_df_organic$reply_to_status_id))
  tweet_df_organic <- tweet_df_organic %>% arrange(-favorite_count)
  tweet_df_organic <- tweet_df_organic %>% arrange(-retweet_count)
  # Keeping only the retweets
  retweet_df <- tweet_df[tweet_df$is_retweet == TRUE, ]
  # Keeping only the replies
  replies <- subset(tweet_df, !is.na(tweet_df$reply_to_status_id))
  # Creating a data frame
  data <- data.frame(
    category = c("Organic", "Retweet_df", "Replies"),
    count = c(nrow(tweet_df_organic), nrow(retweet_df), nrow(replies))
  )
  # Adding columns
  data$fraction = data$count / sum(data$count)
  data$percentage = data$count / sum(data$count) * 100
  # data$ymax = cumsum(data$fraction)
  # data$ymin = c(0, head(data$ymax, n=-1))
  # Rounding the data to two decimal points
  data <- round_df(data, 2)
  return(list(data))
}
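Note that `round_df()` is not a base R function, so it is presumably a helper defined elsewhere in the script. A minimal stand-in, assuming it is only meant to round the numeric columns and leave everything else alone, could look like this:

```r
# Assumed behaviour: round every numeric column of `df` to `digits` decimals
# and leave non-numeric columns untouched (a stand-in, not the original helper)
round_df <- function(df, digits) {
  is_num <- vapply(df, is.numeric, logical(1))
  df[is_num] <- lapply(df[is_num], round, digits = digits)
  df
}
```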
I pass the second dataset as a parameter of the function, and I tried to call it like this:
activos_full$results <- apply(activos_full,1,tweetstype(timelines))
But I got this error, even though the function had already been defined:
Error in match.fun(FUN) :
'tweetstype(timelines)' is not a function, character or symbol
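As far as I can tell, the message comes from `match.fun()`: the `FUN` argument has to be a function (or its name), but `tweetstype(timelines)` is a call, so R evaluates it first and hands `apply()` the returned list instead of the function. Two other things seem to be off as well: `activos_full` has a column `usuario`, not `screen_name`, and `timelines[x, ]` with a character vector looks rows up by row name rather than by the `screen_name` column. A minimal sketch of one way around this follows. It assumes a hypothetical rewrite, `tweet_types()`, that takes a single user name, and it uses small made-up data frames in place of the real ones.

```r
# Miniature stand-ins for the two datasets (values are illustrative only)
activos_full <- data.frame(
  usuario = c("BotInventory", "bmurphypointman", "CopWeGo"),
  stringsAsFactors = FALSE
)
timelines <- data.frame(
  screen_name        = c("BotInventory", "BotInventory", "bmurphypointman"),
  reply_to_status_id = c(NA, "123", NA),
  is_retweet         = c(FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Hypothetical rewrite: takes ONE user name plus the timelines data frame
tweet_types <- function(user, timelines) {
  # Filter by the screen_name column, not by row names
  tweet_df <- timelines[timelines$screen_name == user, , drop = FALSE]
  is_reply <- !is.na(tweet_df$reply_to_status_id)
  counts <- c(
    Organic = sum(!tweet_df$is_retweet & !is_reply),
    Retweet = sum(tweet_df$is_retweet),
    Replies = sum(is_reply)
  )
  total <- sum(counts)
  data.frame(
    category   = names(counts),
    count      = unname(counts),
    # Avoid dividing by zero for users with no tweets in `timelines`
    percentage = if (total > 0) unname(round(counts / total * 100, 2)) else NA_real_
  )
}

# Pass the function itself; extra arguments are supplied after it
activos_full$results <- lapply(activos_full$usuario, tweet_types,
                               timelines = timelines)
```

The same principle should hold for `apply()`: its third argument needs to be a function, for example an anonymous function that calls `tweet_types()` on the row's `usuario` value, rather than the result of calling one.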
r
function
dataset
combinations
apply