Hi, I am using the following system configuration:
- Windows 10
- Visual Studio 2019 Community
- Cuda 10.2
- Nvidia Nsight Compute 2019.5.0
- Nvidia RTX 2060 GPU (Turing Architecture)
I am following your tutorials on YouTube and used the file alignment_matrix_mul.cu in three configurations:
- No transpose (just as we were doing it before)
- Transpose a matrix (temp_sum += a[k * n + row] * b[col + n * k];)
- Transpose b matrix (temp_sum += a[k + n * row] * b[col * n + k];)
I would expect the GPU to perform best when matrix a is transposed, since each thread's memory accesses are coalesced that way, but profiling shows it performs better when I transpose matrix b.
The only thing I am doing differently here is that I am using Nsight Compute as a standalone application to profile the binary built from Visual Studio, rather than the Visual Studio extension. I am also attaching the performance images I got:
- a matrix: https://drive.google.com/file/d/1rPwMpalSwfVpZ8-jBpO3ROL1R7POAzRt/view?usp=sharing
- b matrix: https://drive.google.com/file/d/1WHIQBRRk1KjJk5MXVUc4AopGzqWPDwFh/view?usp=sharing
I have double-checked the transpositions and this is what I get. Could there be another bottleneck causing these results, e.g. the cost of fetching multiple elements in the loop (index k) outweighing the benefit of coalesced access?