2 years ago

#56838

psysky

How to compute descriptive statistics by group for all variables in R

Suppose I have this data:

glucose=
structure(list(GR = c(1L, 2L, 1L, 2L, 1L, 2L, 2L, 2L, 1L, 1L, 
2L, 1L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 1L), glucose.1 = c(5.5, 4.77, 
5.52, 4.97, 4.4, 5.54, 4.85, 5.5, 5.5, 5.5, 5.09, 5.51, 5.5, 
5.5, 5.58, 5.58, 4.65, 5.5, 4.46, 5.43), glucose.2 = c(5.56, 
5.58, 5.58, 5.51, 5.5, 5.58, 5.5, 5.5, 5.52, 5.5, 5.49, 5.51, 
5.51, 5.56, 5.56, 5.5, 5.5, 5.58, 5.51, 5.53), glucose.3 = c(5.56, 
5.58, 5.58, 5.54, 5.57, 5.54, 5.53, 5.56, 5.51, 5.57, 5.54, 5.54, 
5.2, 5.26, 5.54, 5.55, 5.57, 5.25, 5.56, 5.54), glucose.4 = c(5.51, 
5.51, 5.53, 5.54, 5.52, 5.5, 5.51, 5.54, 4.99, 5.53, 5.51, 5.52, 
5.57, 5.54, 5.51, 5.58, 5.28, 5.51, 5.54, 5.54), glucose.5 = c(5.3, 
5.2, 5.51, 5.51, 5.51, 5.51, 5.51, 5.51, 5.4, 5.3, 5.51, 5.5, 
5.51, 5.55, 5.51, 5.51, 5.52, 5.51, 5.1, 5.42), glucose.6 = c(5.1, 
5.5, 5.45, 5.52, 5.32, 5.51, 5.45, 5.32, 5.57, 5.41, 5.54, 4.86, 
5.12, 5.54, 5.58, 5.32, 5.52, 5.04, 5.1, 5.5)), class = "data.frame", row.names = c(NA, 
-20L))

To calculate descriptive statistics, I can do it this way:

library(psych)
describeBy(glucose.1 ~ GR, data = glucose)

Result:
 Descriptive statistics by group 
GR: 1
   vars  n mean   sd median trimmed  mad min  max range  skew
X1    1 10 5.32 0.42    5.5     5.4 0.07 4.4 5.58  1.18 -1.31
   kurtosis   se
X1    -0.14 0.13
------------------------------------------------ 
GR: 2
   vars  n mean   sd median trimmed  mad  min  max range  skew
X1    1 10 5.17 0.39    5.3    5.21 0.33 4.46 5.54  1.08 -0.43
   kurtosis   se
X1     -1.5 0.12

But this means I have to run the command separately for each variable, whereas I need the statistics for all variables at once, because there can be a large number of variables. Second, describeBy produces many statistics I don't need, such as the trimmed mean, but does not give some that I do need, for example the coefficient of variation (the standard deviation divided by the mean, expressed as a percentage).

So this is the question I really need help with: how can I calculate the following statistics separately for each group, for all variables?
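
To make the definition concrete, here is a minimal sketch of the coefficient of variation computed per group with base R's `tapply`. The helper name `cv_percent` and the toy vectors `x` and `gr` are made up for illustration; they are not part of the glucose data.

```r
# Coefficient of variation in percent: sd / mean * 100
# (cv_percent is a hypothetical helper name, not a function from any package)
cv_percent <- function(v) sd(v, na.rm = TRUE) / mean(v, na.rm = TRUE) * 100

# Toy values and group labels, for illustration only
x  <- c(2, 4, 6, 10, 10, 10)
gr <- c(1, 1, 1, 2, 2, 2)

# Apply the helper to the values of x within each level of gr
cv_by_group <- tapply(x, gr, cv_percent)
print(cv_by_group)
```

In this toy example group 1 has mean 4 and standard deviation 2, so its coefficient of variation is 50%, while group 2 has no spread at all and gets 0%.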

Count of observations
Mean
Median
Minimum
Maximum
25th percentile
75th percentile
Standard deviation
Coefficient of variation (%)

The output I need would look something like this (this example was made manually, and not from the dput() data above, but that is not important; I just need this output structure):

GR        Statistic           glucose 1  glucose 2  glucose 3  glucose 4  glucose 5
gr=1      count of obs        33         33         31         31
(N = 33)  Mean                26,36      30,27      26,55      28,48
          Median              24         24         22         22
          Minimum             10         10         11         11
          Maximum             48         173        73         94
          25 percentile       32         35         33         30,5
          75 percentile       20         20         19         18,5
          Stdev               9,71       27,56      13,1       18,29
          coef variation (%)  36,82      91,03      49,36      64,2
gr=2      count of obs        33         33         32         32
(N = 33)  Mean                23,85      29,21      23,34      25,34
          Median              24         22         22         20,5
          Minimum             11         11         11         10
          Maximum             41         152        49         76
          25 percentile       31         32         28         31,25
          75 percentile       17         20         16,75      14,75
          Stdev               8,81       24,05      9,31       15,07
          coef variation (%)  36,95      82,34      39,88      59,45

How can I get output with this structure? Thank you for your help.
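
For reference, a minimal sketch of one possible approach in base R, not necessarily the best one. It assumes a data frame with a grouping column named `GR` and otherwise numeric columns, like `glucose`; the small data frame `df` and the helper `desc_stats` are made up for illustration, and the percentiles use R's default `quantile()` method, which may differ from other software.

```r
# Small made-up data frame with the same layout as `glucose`
# (a grouping column GR plus numeric columns); values are illustrative only.
df <- data.frame(
  GR        = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
  glucose.1 = c(1, 2, 3, 4, 10, 20, 30, 40),
  glucose.2 = c(5, 5, 5, 5, 2, 4, 6, 8)
)

# The nine requested statistics for one numeric vector
# (desc_stats is a hypothetical helper name)
desc_stats <- function(v) {
  v <- v[!is.na(v)]
  c(n      = length(v),
    mean   = mean(v),
    median = median(v),
    min    = min(v),
    max    = max(v),
    q25    = unname(quantile(v, 0.25)),
    q75    = unname(quantile(v, 0.75)),
    sd     = sd(v),
    cv_pct = sd(v) / mean(v) * 100)
}

# Every column except the grouping column
vars <- setdiff(names(df), "GR")

# For each group: a matrix with statistics in rows and variables in columns
by_group <- lapply(split(df[vars], df$GR), function(d) sapply(d, desc_stats))

# Stack the groups into one table, with group and statistic as leading columns
result <- do.call(rbind, lapply(names(by_group), function(g) {
  m <- by_group[[g]]
  data.frame(GR = g, statistic = rownames(m), round(m, 2),
             row.names = NULL, check.names = FALSE)
}))
print(result)
```

With the real data, replacing `df` by `glucose` should give one block of nine rows per group and one column per glucose variable. The design choice here is to compute a statistics-by-variable matrix per group first and only then stack the groups, which keeps the layout close to the table sketched above.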

r

data.table

lapply

tidyr

0 Answers
