2 years ago

#75006


geedigit

Multithreading/parallelization for the mgcv summary function

I am fitting a GAM with the bam function from the mgcv package in R (~60,000 samples), using 3 fixed effects and one random effect. The random-effect factor has a very large number of levels (several thousand). The model fit takes ~4 hours. Until now, I have been running the summary function with the re.test argument set to FALSE to examine the fixed effects only. However, I now need to summarise the random effect as well, and that takes a very long time (>24 hours).

I have a multicore CPU (64 cores). Is there any way to exploit parallelisation/multithreading in the summary.gam function, much as the bam and predict.gam functions allow?

Model fit code is:

library(mgcv)

# ctrl is a control list defined elsewhere (not shown)
gam_model <- bam(y ~ s(x1, k = -1) + s(x2, k = -1) + s(x3, k = -1) +
                   s(ID, bs = "re"),          # random effect for ID
                 family = "gaussian",
                 data = dat,
                 method = "fREML",
                 select = FALSE,
                 nthreads = 64,
                 discrete = TRUE,
                 control = ctrl)
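
For reference, the two summary calls being compared can be sketched on a small simulated data set. This is only an illustration under stated assumptions: the data, the sizes (n_id, n) and the object names (dat_sim, fit_sim) are made up and are not the real data or model, and the timings on a toy problem say nothing about the full-size case.

```r
library(mgcv)

set.seed(1)
n_id <- 200    # hypothetical number of random-effect levels
n    <- 4000   # hypothetical sample size

# Simulated stand-in for the real data (assumption, for illustration only)
id_int  <- sample(n_id, n, replace = TRUE)
dat_sim <- data.frame(
  x1 = runif(n), x2 = runif(n), x3 = runif(n),
  ID = factor(id_int, levels = seq_len(n_id))
)
b <- rnorm(n_id, sd = 0.5)   # simulated random intercepts
dat_sim$y <- sin(2 * pi * dat_sim$x1) + dat_sim$x2^2 +
  b[id_int] + rnorm(n, sd = 0.3)

# Same model structure as in the question, single-threaded for simplicity
fit_sim <- bam(y ~ s(x1) + s(x2) + s(x3) + s(ID, bs = "re"),
               data = dat_sim, method = "fREML", discrete = TRUE)

# Summary without the random-effect test (the fast call used so far)
t_fixed <- system.time(s_fixed <- summary(fit_sim, re.test = FALSE))

# Summary including the random-effect test (the slow call)
t_full <- system.time(s_full <- summary(fit_sim, re.test = TRUE))

# Elapsed seconds for each call
print(c(re.test_FALSE = t_fixed[["elapsed"]],
        re.test_TRUE  = t_full[["elapsed"]]))
```

The only difference between the two calls is the re.test argument, so the extra time is spent in the test for the s(ID, bs = "re") term.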

Setup specs are:

platform       x86_64-pc-linux-gnu         
arch           x86_64                      
os             linux-gnu                   
system         x86_64, linux-gnu           
status                                     
major          3                           
minor          6.3                         
year           2020                        
month          02                          
day            29                          
svn rev        77875                       
language       R                           
version.string R version 3.6.3 (2020-02-29)
nickname       Holding the Windsock 

I have read through the summary.gam documentation and found nothing that mentions using more than one core.

r

parallel-processing

mgcv

0 Answers
