python (12.9k questions)
javascript (9.2k questions)
reactjs (4.7k questions)
java (4.2k questions)
c# (3.5k questions)
html (3.3k questions)
CUDA: Better performance with lower occupancy
I'm a student learning CUDA, and I'm trying to write a CUDA algorithm for counting sort:
__global__ void kernelCountingSort(int *array, int dim_array, int *counts) {
// define index
int i = blo...

Roberto Falcone
Votes: 0
Answers: 1
CUDA: Why does a kernel's execution time decrease if I allocate more threads in a block than the maximum number?
I'm a student learning CUDA, and I'm trying to write a CUDA algorithm for counting sort. I tried to execute my kernel:
__global__ void kernelCountingSort(int *array, int dim_array, int *counts) {
...

Roberto Falcone
Votes: 0
Answers: 1
Using a loop in a CUDA graph
I have kernels A, B, and C, which need to be executed sequentially.
A->B->C
They are executed in a while loop until some condition is met.
while(predicate) {
A->B->C
}
The while lo...

Jakub Mitura
Votes: 0
Answers: 1
Can I reduce the size of an array in CUDA?
I have allocated memory for a 3D array using cudaMalloc3D. After executing the first kernel, I established that I do not need part of it.
For example, in pseudocode:
A = [100,100,100]
kernel()// data of...
Jakub Mitura
Votes: 0
Answers: 1