python (12.9k questions)
javascript (9.2k questions)
reactjs (4.7k questions)
java (4.2k questions)
c# (3.5k questions)
html (3.3k questions)
Determining CUDA compute capability as constexpr for __launch_bounds__
In order to launch a CUDA kernel efficiently I'd like to use __launch_bounds__ with arguments that depend on the maximal threads per SM allowed in the current GPU, which in turn depends on that GPU's ...
Michael
Votes: 0
Answers: 0
Julia CUDA - Saving intermediate kernel results without CPU
Consider the following CUDA kernel, which computes the mean of each row of a 2-D matrix.
using CUDA
function mean!(x, n, out)
"""out = sum(x, dims=2)"""
row_idx ...
A is for Ambition
Votes: 0
Answers: 1
Strange errors of compiling pcl1.8 with cuda 11.3
System: cuda 11.3, gcc 7.5, boost 1.65.1, pcl 1.8.0
When I compile code that uses the PCL library, it shows the following error:
/usr/include/pcl-1.8/pcl/io/file_io.h(264): error: namespace "boost"...
picklesmithy129
Votes: 0
Answers: 1
Memory padding vs coalesced access
I have a little confusion about bank conflicts, avoiding them using memory padding and coalesced memory access. What I've read so far: Coalesced memory access from global memory is optimal. If it isn'...
SimonH
Votes: 0
Answers: 1