c++ - In-Place CUDA Kernel for Rectangular Matrix Transpose
I have looked around for a while, but have been unable to find a suitable answer to this:
Is there any implementation of an in-place transpose for rectangular (non-square) matrices in CUDA?
I know about cublas geam, but that requires allocating another matrix to hold the result. I tried a simple in-place implementation based on this:
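For reference, here is a sketch of what I mean by the geam approach; the wrapper name and buffer names are my own, and the point is that a separate output buffer d_At is needed because geam has no transposed in-place mode:

    #include <cublas_v2.h>

    // Out-of-place transpose with cublasSgeam: C = alpha*op(A) + beta*op(B).
    // A is n x m in column-major order; d_At must be a separate m x n buffer.
    void transpose_with_geam(cublasHandle_t handle,
                             const float *d_A,  // n x m, column-major
                             float *d_At,       // m x n, column-major
                             int n, int m)
    {
        const float alpha = 1.0f;
        const float beta  = 0.0f;
        // B is never read because beta == 0; d_At is passed only as a valid pointer.
        cublasSgeam(handle, CUBLAS_OP_T, CUBLAS_OP_N,
                    m, n,                 // dimensions of the result A^T
                    &alpha, d_A, n,
                    &beta,  d_At, m,
                    d_At, m);
    }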
However, this only works for square matrices. Can someone explain to me why this approach does not work at all for rectangular (non-square) matrices? The naive approach does work for the transpose, although it is not in place.
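For comparison, a minimal sketch of the naive out-of-place kernel I am referring to (the kernel name and launch configuration are illustrative choices of mine, not from any library):

    // Naive out-of-place transpose: each thread copies one element.
    __global__ void naive_transpose(const float *in, float *out, int rows, int cols)
    {
        int x = blockIdx.x * blockDim.x + threadIdx.x;  // column in the input
        int y = blockIdx.y * blockDim.y + threadIdx.y;  // row in the input
        if (x < cols && y < rows)
            out[x * rows + y] = in[y * cols + x];       // out is cols x rows
    }

    // Launch example:
    // dim3 block(16, 16);
    // dim3 grid((cols + 15) / 16, (rows + 15) / 16);
    // naive_transpose<<<grid, block>>>(d_in, d_out, rows, cols);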
The following paper:
The sequential algorithm for in-place matrix transposition is as follows (> O(nm) run time):
    #include <utility>  // std::swap

    // in: n rows, m cols; out: m rows, n cols
    void matrix_transpose(int *a, int n, int m)
    {
        for (int k = 0; k < n * m; k++) {
            int idx = k;
            do {  // calculate the index in the original array
                idx = (idx % n) * m + (idx / n);
            } while (idx < k);  // make sure that we do not swap elements twice
            std::swap(a[k], a[idx]);
        }
    }
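A quick host-side check of that routine (the 2 x 3 test values are my own):

    #include <cstdio>

    int main()
    {
        // 2 x 3 row-major matrix:
        // 1 2 3
        // 4 5 6
        int a[] = {1, 2, 3, 4, 5, 6};
        matrix_transpose(a, 2, 3);
        // Expected 3 x 2 result, flattened: 1 4 2 5 3 6
        for (int i = 0; i < 6; i++) printf("%d ", a[i]);
        printf("\n");
        return 0;
    }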