Class DMatrix represents a distributed matrix.
#include <dmatrix.h>
Public Member Functions | |
DMatrix () | |
Default constructor. | |
DMatrix (int n0, int m0, int rows, int cols, Distribution d=DIST) | |
Creates an empty distributed matrix with n0 * m0 elements, split into rows * cols blocks. | |
DMatrix (int n0, int m0, int rows, int cols, const T &initial_value, Distribution d=DIST) | |
Creates a distributed matrix with n0 * m0 elements equal to initial_value. | |
DMatrix (int n0, int m0, int rows, int cols, T *const initial_matrix, Distribution d=DIST) | |
Creates a distributed matrix with n0 * m0 elements. Elements are copied from initial_matrix, whose length must equal n0 * m0. | |
DMatrix (int n0, int m0, int rows, int cols, T(*f)(int, int), Distribution d=DIST) | |
Creates a distributed matrix with n0 * m0 elements. Initializes all elements via the given function f. Note that global indices are passed to this function as arguments. | |
template<typename F2 > | |
DMatrix (int n0, int m0, int rows, int cols, const F2 &f, Distribution d=DIST) | |
Creates a distributed matrix with n0 * m0 elements. Initializes all elements via the given functor f. Note that global indices are passed to this functor as arguments. | |
DMatrix (int n0, int m0) | |
Creates an empty copy-distributed matrix with n0 * m0 elements. | |
DMatrix (int n0, int m0, const T &initial_value) | |
Creates a copy-distributed matrix with n0 * m0 elements equal to initial_value. | |
DMatrix (int n0, int m0, T *const initial_matrix) | |
Creates a copy-distributed matrix with n0 * m0 elements. Elements are copied from initial_matrix, whose length must equal n0 * m0. | |
DMatrix (int n0, int m0, T(*f)(int, int)) | |
Creates a copy-distributed matrix with n0 * m0 elements. Initializes all elements via the given function f. Note that global indices are passed to this function as arguments. | |
template<typename F2 > | |
DMatrix (int n0, int m0, const F2 &f) | |
Creates a copy-distributed matrix with n0 * m0 elements. Initializes all elements via the given functor f. Note that global indices are passed to this functor as arguments. | |
DMatrix (const DMatrix< T > &cs) | |
Copy constructor. | |
~DMatrix () | |
Destructor. | |
DMatrix< T > & | operator= (const DMatrix< T > &rhs) |
Assignment operator. | |
void | fill (const T &value) |
Initializes the elements of the distributed matrix with the value value. More... | |
void | fill (T *const values) |
Initializes the elements of the distributed matrix with the elements of the given array of values. Note that the length of values must match the size of the distributed matrix (not checked). More... | |
void | fill (T(*f)(int, int)) |
Initializes the elements of the distributed matrix via the given function f. Note that global indices are passed to this function as arguments. | |
template<typename F2 > | |
void | fill (const F2 &f) |
Initializes the elements of the distributed matrix via the given functor f. Note that global indices are passed to this functor as arguments. | |
template<typename MapFunctor > | |
void | mapInPlace (MapFunctor &f) |
Replaces each element m[i][j] of the distributed matrix with f(m[i][j]). More... | |
template<typename MapIndexFunctor > | |
void | mapIndexInPlace (MapIndexFunctor &f) |
Replaces each element m[i][j] of the distributed matrix with f(i, j, m[i][j]). Note that besides the element itself also its indices are passed to the functor. More... | |
template<typename R , typename MapFunctor > | |
msl::DMatrix< R > | map (MapFunctor &f) |
Returns a new distributed matrix with m_new[i][j] = f(m[i][j]). More... | |
template<typename R , typename MapIndexFunctor > | |
DMatrix< R > | mapIndex (MapIndexFunctor &f) |
Returns a new distributed matrix with m_new[i][j] = f(i, j, m[i][j]). Note that besides the element itself also its indices are passed to the functor. | |
template<typename MapStencilFunctor > | |
void | mapStencilInPlace (MapStencilFunctor &f, T neutral_value) |
Replaces each element m[i][j] of the distributed matrix with f(i, j, m). Note that the indices i and j and the local partition are passed to the functor. | |
template<typename R , typename MapStencilFunctor > | |
DMatrix< R > | mapStencil (MapStencilFunctor &f, T neutral_value) |
Non-inplace variant of the mapStencil skeleton. More... | |
template<typename F > | |
void | mapInPlace (const msl::Fct1< T, T, F > &f) |
Replaces each element m[i][j] of the distributed matrix with f(m[i][j]). Note that this is a CPU only skeleton. More... | |
void | mapInPlace (T(*f)(T)) |
Replaces each element m[i][j] of the distributed matrix with f(m[i][j]). Note that this is a CPU only skeleton. More... | |
template<typename F > | |
void | mapIndexInPlace (const msl::Fct3< int, int, T, T, F > &f) |
Replaces each element m[i][j] of the distributed matrix with f(i, j, m[i][j]). Note that besides the element itself also its indices are passed to the functor. Also note that this is a CPU only skeleton. More... | |
void | mapIndexInPlace (T(*f)(int, int, T)) |
Replaces each element m[i][j] of the distributed matrix with f(i, j, m[i][j]). Note that besides the element itself also its indices are passed to the function. Also note that this is a CPU only skeleton. | |
template<typename R , typename F > | |
msl::DMatrix< R > | map (const msl::Fct1< T, R, F > &f) |
Non-inplace variant of the map skeleton. Note that this is a CPU only skeleton. More... | |
template<typename R > | |
msl::DMatrix< R > | map (R(*f)(T)) |
Non-inplace variant of the map skeleton. Note that this is a CPU only skeleton. More... | |
template<typename R , typename F > | |
DMatrix< R > | mapIndex (const msl::Fct3< int, int, T, R, F > &f) |
Non-inplace variant of the mapIndex skeleton. Note that this is a CPU only skeleton. More... | |
template<typename R > | |
DMatrix< R > | mapIndex (R(*f)(int, int, T)) |
Non-inplace variant of the mapIndex skeleton. Note that this is a CPU only skeleton. More... | |
template<typename T2 , typename ZipFunctor > | |
void | zipInPlace (DMatrix< T2 > &b, ZipFunctor &f) |
Replaces each element m[i][j] of the distributed matrix with f(m[i][j], b[i][j]) with b being another distributed matrix of the same size. | |
template<typename T2 , typename ZipIndexFunctor > | |
void | zipIndexInPlace (DMatrix< T2 > &b, ZipIndexFunctor &f) |
Replaces each element m[i][j] of the distributed matrix with f(i, j, m[i][j], b[i][j]). Note that besides the elements themselves also the indices are passed to the functor. More... | |
template<typename R , typename T2 , typename ZipFunctor > | |
DMatrix< R > | zip (DMatrix< T2 > &b, ZipFunctor &f) |
Non-inplace variant of the zip skeleton. More... | |
template<typename R , typename T2 , typename ZipIndexFunctor > | |
DMatrix< R > | zipIndex (DMatrix< T2 > &b, ZipIndexFunctor &f) |
Non-inplace variant of the zipIndex skeleton. More... | |
template<typename T2 , typename F > | |
void | zipInPlace (DMatrix< T2 > &b, const Fct2< T, T2, T, F > &f) |
Replaces each element m[i][j] of the distributed matrix with f(m[i][j], b[i][j]) with b being another distributed matrix of the same size. Note that this is a CPU only skeleton. More... | |
template<typename T2 > | |
void | zipInPlace (DMatrix< T2 > &b, T(*f)(T, T2)) |
Replaces each element m[i][j] of the distributed matrix with f(m[i][j], b[i][j]) with b being another distributed matrix of the same size. Note that this is a CPU only skeleton. More... | |
template<typename T2 , typename F > | |
void | zipIndexInPlace (DMatrix< T2 > &b, const Fct4< int, int, T, T2, T, F > &f) |
Replaces each element m[i][j] of the distributed matrix with f(i, j, m[i][j], b[i][j]). Note that besides the elements themselves also the indices are passed to the functor. Note that this is a CPU only skeleton. | |
template<typename T2 > | |
void | zipIndexInPlace (DMatrix< T2 > &b, T(*f)(int, int, T, T2)) |
Replaces each element m[i][j] of the distributed matrix with f(i, j, m[i][j], b[i][j]). Note that besides the elements themselves also the indices are passed to the function. Note that this is a CPU only skeleton. | |
template<typename R , typename T2 , typename F > | |
DMatrix< R > | zip (DMatrix< T2 > &b, const Fct2< T, T2, R, F > &f) |
Non-inplace variant of the zip skeleton. Note that this is a CPU only skeleton. More... | |
template<typename R , typename T2 > | |
DMatrix< R > | zip (DMatrix< T2 > &b, R(*f)(T, T2)) |
Non-inplace variant of the zip skeleton. Note that this is a CPU only skeleton. More... | |
template<typename R , typename T2 , typename F > | |
DMatrix< R > | zipIndex (DMatrix< T2 > &b, const Fct4< int, int, T, T2, R, F > &f) |
Non-inplace variant of the zipIndex skeleton. Note that this is a CPU only skeleton. More... | |
template<typename R , typename T2 > | |
DMatrix< R > | zipIndex (DMatrix< T2 > &b, R(*f)(int, int, T, T2)) |
Non-inplace variant of the zipIndex skeleton. Note that this is a CPU only skeleton. More... | |
template<typename FoldFunctor > | |
T | fold (FoldFunctor &f, bool final_fold_on_cpu=1) |
Reduces all elements of the distributed matrix to a single element by successively applying the given functor f. Note that f needs to be a commutative function. Note that this is a CPU only skeleton. More... | |
template<typename F > | |
T | fold (const Fct2< T, T, T, F > &f) |
Reduces all elements of the distributed matrix to a single element by successively applying the given functor f. Note that f needs to be a commutative function. Note that this is a CPU only skeleton. More... | |
T | fold (T(*f)(T, T)) |
Reduces all elements of the distributed matrix to a single element by successively applying the given function f. Note that f needs to be a commutative function. Note that this is a CPU only skeleton. More... | |
void | broadcastPartition (int blockRow, int blockCol) |
Broadcasts the partition with index (blockRow, blockCol) to all processes. Afterwards, each partition of the distributed matrix stores the same values. Note that 0 <= blockRow < n and 0 <= blockCol < m. | |
void | gather (T **b) |
Transforms a distributed matrix to an ordinary (two-dimensional) array by copying each element to the given (two-dimensional) array b. b must match the size of the distributed matrix. | |
void | gather (DMatrix< T > &dm) |
Transforms a distributed matrix to a copy-distributed matrix by copying each element to the given distributed matrix dm. dm must be copy distributed. | |
template<class F1 , class F2 > | |
void | permutePartition (const Fct2< int, int, int, F1 > &newRow, const Fct2< int, int, int, F2 > &newCol) |
Permutes the partitions of the distributed matrix according to the given functions newRow and newCol. Both functions must be bijective and return the new row/column index. Note that 0 <= newRow < blocksInCol and 0 <= newCol < blocksInRow. | |
void | permutePartition (int(*f)(int, int), int(*g)(int, int)) |
Permutes the partitions of the distributed matrix according to the given functions f and g. Both functions must be bijective and return the new row/column index. Note that 0 <= f < blocksInCol and 0 <= g < blocksInRow. | |
template<class F > | |
void | permutePartition (int(*f)(int, int), const Fct2< int, int, int, F > &g) |
Permutes the partitions of the distributed matrix according to the given functions f and g. Both functions must be bijective and return the new row/column index. Note that 0 <= f < blocksInCol and 0 <= g < blocksInRow. | |
template<class F > | |
void | permutePartition (const Fct2< int, int, int, F > &f, int(*g)(int, int)) |
Permutes the partitions of the distributed matrix according to the given functions f and g. Both functions must be bijective and return the new row/column index. Note that 0 <= f < blocksInCol and 0 <= g < blocksInRow. | |
template<class F > | |
void | rotateCols (const Fct1< int, int, F > &f) |
Rotates the partitions of the distributed matrix cyclically in vertical direction. More... | |
void | rotateCols (int(*f)(int)) |
Rotates the partitions of the distributed matrix cyclically in vertical direction. More... | |
void | rotateCols (int rows) |
Rotates the partitions of the distributed matrix cyclically in vertical direction. More... | |
template<class F > | |
void | rotateRows (const Fct1< int, int, F > &f) |
Rotates the partitions of the distributed matrix cyclically in horizontal direction. More... | |
void | rotateRows (int(*f)(int)) |
Rotates the partitions of the distributed matrix cyclically in horizontal direction. More... | |
void | rotateRows (int cols) |
Rotates the partitions of the distributed matrix cyclically in horizontal direction. More... | |
void | transposeLocalPartition () |
Transposes the local partition. Currently only implemented for nLocal == mLocal. | |
T * | getLocalPartition () const |
Returns the local partition. More... | |
T | get (size_t row, size_t col) const |
Returns the element at the given global indices (row, col). More... | |
void | set (int row, int col, const T &v) |
Sets the element at the given global indices (row, col) to the given value v. More... | |
int | getFirstRow () const |
Returns the index of the first row of the local partition. More... | |
int | getFirstCol () const |
Returns the index of the first column of the local partition. More... | |
int | getLocalCols () const |
Returns the number of columns of the local partition. More... | |
int | getLocalRows () const |
Returns the number of rows of the local partition. More... | |
int | getLocalSize () const |
Returns the size of the local partition. More... | |
int | getRows () const |
Returns the number of rows of the distributed matrix. More... | |
int | getCols () const |
Returns the number of columns of the distributed matrix. More... | |
int | getBlocksInCol () const |
Returns the number of blocks (local partitions) in a column. More... | |
int | getBlocksInRow () const |
Returns the number of blocks (local partitions) in a row. More... | |
bool | isLocal (int row, int col) const |
Checks whether the element at the given global indices (row, col) is locally stored. More... | |
T | getLocal (int row, int col) const |
Returns the element at the given local indices (row, col). Note that 0 <= row < nLocal and 0 <= col < mLocal (not checked, for performance reasons). | |
void | setLocal (int row, int col, const T &v) |
Sets the element at the given local indices (row, col) to the given value v. More... | |
std::vector< GPUExecutionPlan < T > > | getExecPlans () |
Returns the GPU execution plans that store information about size, etc. for the GPU partitions. For internal purposes. More... | |
void | setCopyDistribution () |
Switch the distribution scheme from distributed to copy distributed. | |
void | setDistribution (int rows, int cols) |
Switch the distribution scheme from copy distributed to distributed. Note that rows * cols = numProcesses must hold. More... | |
std::vector< T * > | upload (bool allocOnly=0) |
Manually upload the local partition to GPU memory. More... | |
void | download () |
Manually download the local partition from GPU memory. | |
void | freeDevice () |
Manually free device memory. | |
void | setGpuDistribution (Distribution dist) |
Set how the local partition is distributed among the GPUs. Current distribution schemes are: distributed, copy distributed. More... | |
Distribution | getGpuDistribution () |
Returns the current GPU distribution scheme. More... | |
void | show (const std::string &descr=std::string()) |
Prints the distributed matrix to standard output. Optionally, the user may pass a description that will be printed with the output. | |
void | printLocal () |
Each process prints its local partition of the distributed matrix. | |
Class DMatrix represents a distributed matrix.
A distributed matrix represents a parallel two-dimensional container and is distributed among all MPI processes the application was started with. It includes data parallel skeletons such as map, mapStencil, zip, and fold as well as variants of these skeletons.
T | Element type. Restricted to classes without pointer data members. |
msl::DMatrix< T >::DMatrix | ( | int | n0, |
int | m0, | ||
int | rows, | ||
int | cols, | ||
Distribution | d = DIST |
||
) |
Creates an empty distributed matrix with n0 * m0 elements, split into rows * cols blocks.
n0 | Number of rows. |
m0 | Number of columns. |
rows | Number of blocks per column. |
cols | Number of blocks per row. |
d | Distribution of the distributed matrix. |
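The relationship between n0, m0, rows and cols can be illustrated with a small, library-independent sketch. It assumes that process ids are assigned to blocks in row-major order and that n0 and m0 divide evenly by rows and cols; both are assumptions made for illustration, not statements about the DMatrix implementation. `BlockLayout` and `blockLayout` are hypothetical names.

```cpp
#include <cassert>

// Hypothetical description of one process's block; not part of the DMatrix API.
struct BlockLayout {
    int firstRow;   // global index of the first local row
    int firstCol;   // global index of the first local column
    int localRows;  // number of rows in the local partition
    int localCols;  // number of columns in the local partition
};

// Computes the block owned by process `id`, assuming row-major assignment of
// process ids to blocks and an even division of the matrix (both assumptions).
BlockLayout blockLayout(int n0, int m0, int rows, int cols, int id) {
    assert(rows > 0 && cols > 0);
    assert(n0 % rows == 0 && m0 % cols == 0);
    assert(id >= 0 && id < rows * cols);
    const int localRows = n0 / rows;  // rows = number of blocks per column
    const int localCols = m0 / cols;  // cols = number of blocks per row
    const int blockRow = id / cols;   // position of the block in the block grid
    const int blockCol = id % cols;
    return {blockRow * localRows, blockCol * localCols, localRows, localCols};
}
```

Under these assumptions, an 8 x 6 matrix split into 2 x 3 blocks gives every process a 4 x 2 partition.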
msl::DMatrix< T >::DMatrix | ( | int | n0, |
int | m0, | ||
int | rows, | ||
int | cols, | ||
const T & | initial_value, | ||
Distribution | d = DIST |
||
) |
Creates a distributed matrix with n0 * m0 elements equal to initial_value.
n0 | Number of rows. |
m0 | Number of columns. |
rows | Number of blocks per column. |
cols | Number of blocks per row. |
initial_value | Initial value for all elements. |
d | Distribution of the distributed matrix. |
msl::DMatrix< T >::DMatrix | ( | int | n0, |
int | m0, | ||
int | rows, | ||
int | cols, | ||
T *const | initial_matrix, | ||
Distribution | d = DIST |
||
) |
Creates a distributed matrix with n0 * m0 elements. Elements are copied from initial_matrix, whose length must equal n0 * m0.
n0 | Number of rows. |
m0 | Number of columns. |
rows | Number of blocks per column. |
cols | Number of blocks per row. |
initial_matrix | Initial matrix to copy elements from. |
d | Distribution of the distributed matrix. |
msl::DMatrix< T >::DMatrix | ( | int | n0, |
int | m0, | ||
int | rows, | ||
int | cols, | ||
T(*)(int, int) | f, | ||
Distribution | d = DIST |
||
) |
Creates a distributed matrix with n0 * m0 elements. Initializes all elements via the given function f. Note that global indices are passed to this function as arguments.
n0 | Number of rows. |
m0 | Number of columns. |
rows | Number of blocks per column. |
cols | Number of blocks per row. |
f | Function to initialize the elements of the distributed matrix. |
d | Distribution of the distributed matrix. |
msl::DMatrix< T >::DMatrix | ( | int | n0, |
int | m0, | ||
int | rows, | ||
int | cols, | ||
const F2 & | f, | ||
Distribution | d = DIST |
||
) |
Creates a distributed matrix with n0 * m0 elements. Initializes all elements via the given functor f. Note that global indices are passed to this functor as arguments.
n0 | Number of rows. |
m0 | Number of columns. |
rows | Number of blocks per column. |
cols | Number of blocks per row. |
f | Functor to initialize the elements of the distributed matrix. |
d | Distribution of the distributed matrix. |
msl::DMatrix< T >::DMatrix | ( | int | n0, |
int | m0 | ||
) |
Creates an empty copy-distributed matrix with n0 * m0 elements.
n0 | Number of rows. |
m0 | Number of columns. |
msl::DMatrix< T >::DMatrix | ( | int | n0, |
int | m0, | ||
const T & | initial_value | ||
) |
Creates a copy-distributed matrix with n0 * m0 elements equal to initial_value.
n0 | Number of rows. |
m0 | Number of columns. |
initial_value | Initial value for all elements. |
msl::DMatrix< T >::DMatrix | ( | int | n0, |
int | m0, | ||
T *const | initial_matrix | ||
) |
Creates a copy-distributed matrix with n0 * m0 elements. Elements are copied from initial_matrix, whose length must equal n0 * m0.
n0 | Number of rows. |
m0 | Number of columns. |
initial_matrix | Initial matrix to copy elements from. |
msl::DMatrix< T >::DMatrix | ( | int | n0, |
int | m0, | ||
T(*)(int, int) | f | ||
) |
Creates a copy-distributed matrix with n0 * m0 elements. Initializes all elements via the given function f. Note that global indices are passed to this function as arguments.
n0 | Number of rows. |
m0 | Number of columns. |
f | Function to initialize the elements of the distributed matrix. |
msl::DMatrix< T >::DMatrix | ( | int | n0, |
int | m0, | ||
const F2 & | f | ||
) |
Creates a copy-distributed matrix with n0 * m0 elements. Initializes all elements via the given functor f. Note that global indices are passed to this functor as arguments.
n0 | Number of rows. |
m0 | Number of columns. |
f | Functor to initialize the elements of the distributed matrix. |
void msl::DMatrix< T >::broadcastPartition | ( | int | blockRow, |
int | blockCol | ||
) |
Broadcasts the partition with index (blockRow, blockCol) to all processes. Afterwards, each partition of the distributed matrix stores the same values. Note that 0 <= blockRow < n and 0 <= blockCol < m.
blockRow | The row index of the partition to broadcast. |
blockCol | The column index of the partition to broadcast. |
void msl::DMatrix< T >::fill | ( | const T & | value | ) |
Initializes the elements of the distributed matrix with the value value.
value | The value. |
void msl::DMatrix< T >::fill | ( | T *const | values | ) |
Initializes the elements of the distributed matrix with the elements of the given array of values. Note that the length of values must match the size of the distributed matrix (not checked).
values | The array of values. |
void msl::DMatrix< T >::fill | ( | T(*)(int, int) | f | ) |
Initializes the elements of the distributed matrix via the given function f. Note that global indices are passed to this function as arguments.
f | The initializer function. |
void msl::DMatrix< T >::fill | ( | const F2 & | f | ) |
Initializes the elements of the distributed matrix via the given functor f. Note that global indices are passed to this functor as arguments.
f | The initializer functor. |
T msl::DMatrix< T >::fold | ( | FoldFunctor & | f, |
bool | final_fold_on_cpu = 1 |
||
) |
Reduces all elements of the distributed matrix to a single element by successively applying the given functor f. Note that f needs to be a commutative function. Note that this is a CPU only skeleton.
f | The fold functor, must be of type MFoldFunctor. |
final_fold_on_cpu | Specifies whether the final fold steps are done by the CPU. Default is true and since this is the CPU version of this skeleton, passing false will have no effect. |
T msl::DMatrix< T >::fold | ( | const Fct2< T, T, T, F > & | f | ) |
Reduces all elements of the distributed matrix to a single element by successively applying the given functor f. Note that f needs to be a commutative function. Note that this is a CPU only skeleton.
f | The fold functor, must be a 'curried' function pointer. |
T msl::DMatrix< T >::fold | ( | T(*)(T, T) | f | ) |
Reduces all elements of the distributed matrix to a single element by successively applying the given function f. Note that f needs to be a commutative function. Note that this is a CPU only skeleton.
f | The fold function. |
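The reduction described for fold can be sketched without MPI or Muesli. The sketch below models each local partition as a `std::vector` and folds every partition locally before folding the partial results; this two-stage scheme is an assumption for illustration, and `foldPartition` and `foldAll` are hypothetical names, not part of the DMatrix API. Because the sketch regroups operands, its result matches a sequential fold only when f is associative as well as commutative.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Folds one non-empty partition from left to right.
template <typename T, typename F>
T foldPartition(const std::vector<T>& part, F f) {
    assert(!part.empty());
    T acc = part[0];
    for (std::size_t i = 1; i < part.size(); ++i) acc = f(acc, part[i]);
    return acc;
}

// Folds every partition locally, then folds the partial results.
// Stand-in for a distributed reduction; no communication is modelled.
template <typename T, typename F>
T foldAll(const std::vector<std::vector<T>>& parts, F f) {
    assert(!parts.empty());
    T acc = foldPartition(parts[0], f);
    for (std::size_t p = 1; p < parts.size(); ++p)
        acc = f(acc, foldPartition(parts[p], f));
    return acc;
}
```

Addition and maximum are typical functions that satisfy both properties.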
void msl::DMatrix< T >::gather | ( | T ** | b | ) |
Transforms a distributed matrix to an ordinary (two-dimensional) array by copying each element to the given (two-dimensional) array b. b must match the size of the distributed matrix.
b | The (two-dimensional) array to store the elements of the distributed matrix. |
void msl::DMatrix< T >::gather | ( | DMatrix< T > & | dm | ) |
Transforms a distributed matrix to a copy-distributed matrix by copying each element to the given distributed matrix dm. dm must be copy distributed.
dm | The copy-distributed matrix that receives the elements of the distributed matrix. |
T msl::DMatrix< T >::get | ( | size_t | row, |
size_t | col | ||
) | const |
Returns the element at the given global indices (row, col).
row | The global row index. |
col | The global column index. |
int msl::DMatrix< T >::getBlocksInCol | ( | ) | const |
Returns the number of blocks (local partitions) in a column.
int msl::DMatrix< T >::getBlocksInRow | ( | ) | const |
Returns the number of blocks (local partitions) in a row.
int msl::DMatrix< T >::getCols | ( | ) | const |
Returns the number of columns of the distributed matrix.
std::vector<GPUExecutionPlan<T> > msl::DMatrix< T >::getExecPlans | ( | ) |
Returns the GPU execution plans that store information about size, etc. for the GPU partitions. For internal purposes.
int msl::DMatrix< T >::getFirstCol | ( | ) | const |
Returns the index of the first column of the local partition.
int msl::DMatrix< T >::getFirstRow | ( | ) | const |
Returns the index of the first row of the local partition.
Distribution msl::DMatrix< T >::getGpuDistribution | ( | ) |
Returns the current GPU distribution scheme.
T msl::DMatrix< T >::getLocal | ( | int | row, |
int | col | ||
) | const |
Returns the element at the given local indices (row, col). Note that 0 <= row < nLocal and 0 <= col < mLocal (not checked, for performance reasons).
row | The local row index. |
col | The local column index. |
int msl::DMatrix< T >::getLocalCols | ( | ) | const |
Returns the number of columns of the local partition.
T* msl::DMatrix< T >::getLocalPartition | ( | ) | const |
Returns the local partition.
int msl::DMatrix< T >::getLocalRows | ( | ) | const |
Returns the number of rows of the local partition.
int msl::DMatrix< T >::getLocalSize | ( | ) | const |
Returns the size of the local partition.
int msl::DMatrix< T >::getRows | ( | ) | const |
Returns the number of rows of the distributed matrix.
bool msl::DMatrix< T >::isLocal | ( | int | row, |
int | col | ||
) | const |
Checks whether the element at the given global indices (row, col) is locally stored.
row | The global row index. |
col | The global column index. |
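The distinction between global and local indices used by get, getLocal and isLocal can be sketched as follows. The sketch assumes a row-major local buffer and a rectangular partition described by its first global row and column and its local extent; `Partition` and `localOffset` are hypothetical names, and the free function `isLocal` here only mirrors the documented behaviour of the member function.

```cpp
#include <cassert>

// Hypothetical view of a local partition: its origin and its extent.
struct Partition {
    int firstRow, firstCol;  // global indices of the top-left element
    int nLocal, mLocal;      // number of local rows and columns
};

// True if the global position (row, col) falls inside the partition.
bool isLocal(const Partition& p, int row, int col) {
    return row >= p.firstRow && row < p.firstRow + p.nLocal &&
           col >= p.firstCol && col < p.firstCol + p.mLocal;
}

// Converts global indices into an offset into a row-major local buffer
// (the row-major layout is an assumption of this sketch).
int localOffset(const Partition& p, int row, int col) {
    assert(isLocal(p, row, col));
    return (row - p.firstRow) * p.mLocal + (col - p.firstCol);
}
```

A caller would typically check isLocal before touching local storage, since only one process owns a given global position in a distributed (non-copy) layout.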
msl::DMatrix<R> msl::DMatrix< T >::map | ( | MapFunctor & | f | ) |
Returns a new distributed matrix with m_new[i][j] = f(m[i][j]).
f | The map functor, must be of type MMapFunctor. |
msl::DMatrix<R> msl::DMatrix< T >::map | ( | const msl::Fct1< T, R, F > & | f | ) |
Non-inplace variant of the map skeleton. Note that this is a CPU only skeleton.
f | The map functor, must be a 'curried' function pointer. |
msl::DMatrix<R> msl::DMatrix< T >::map | ( | R(*)(T) | f | ) |
Non-inplace variant of the map skeleton. Note that this is a CPU only skeleton.
f | The map function. |
DMatrix<R> msl::DMatrix< T >::mapIndex | ( | MapIndexFunctor & | f | ) |
Returns a new distributed matrix with m_new[i][j] = f(i, j, m[i][j]). Note that besides the element itself also its indices are passed to the functor.
f | The mapIndex functor, must be of type MMapIndexFunctor. |
DMatrix<R> msl::DMatrix< T >::mapIndex | ( | const msl::Fct3< int, int, T, R, F > & | f | ) |
Non-inplace variant of the mapIndex skeleton. Note that this is a CPU only skeleton.
f | The mapIndex functor, must be a 'curried' function pointer. |
DMatrix<R> msl::DMatrix< T >::mapIndex | ( | R(*)(int, int, T) | f | ) |
Non-inplace variant of the mapIndex skeleton. Note that this is a CPU only skeleton.
f | The mapIndex function. |
void msl::DMatrix< T >::mapIndexInPlace | ( | MapIndexFunctor & | f | ) |
Replaces each element m[i][j] of the distributed matrix with f(i, j, m[i][j]). Note that besides the element itself also its indices are passed to the functor.
f | The mapIndex functor, must be of type MMapIndexFunctor. |
void msl::DMatrix< T >::mapIndexInPlace | ( | const msl::Fct3< int, int, T, T, F > & | f | ) |
Replaces each element m[i][j] of the distributed matrix with f(i, j, m[i][j]). Note that besides the element itself also its indices are passed to the functor. Also note that this is a CPU only skeleton.
f | The mapIndex functor, must be a 'curried' function pointer. |
void msl::DMatrix< T >::mapIndexInPlace | ( | T(*)(int, int, T) | f | ) |
Replaces each element m[i][j] of the distributed matrix with f(i, j, m[i][j]). Note that besides the element itself also its indices are passed to the function. Also note that this is a CPU only skeleton.
f | The mapIndex function. |
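What mapIndexInPlace does on a single local partition can be sketched in plain C++. The sketch assumes a row-major local buffer and assumes that the indices handed to f are global indices, derived from the partition's first row and column; `mapIndexInPlaceLocal` is a hypothetical helper, not part of the DMatrix API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Applies f(i, j, value) to every element of a row-major local partition.
// (i, j) are global indices computed from the partition's origin (assumption).
template <typename T, typename F>
void mapIndexInPlaceLocal(std::vector<T>& part, int firstRow, int firstCol,
                          int nLocal, int mLocal, F f) {
    assert(part.size() == static_cast<std::size_t>(nLocal) * mLocal);
    for (int r = 0; r < nLocal; ++r) {
        for (int c = 0; c < mLocal; ++c) {
            T& x = part[static_cast<std::size_t>(r) * mLocal + c];
            x = f(firstRow + r, firstCol + c, x);
        }
    }
}
```

Dropping the two index arguments from f gives the plain mapInPlace behaviour.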
void msl::DMatrix< T >::mapInPlace | ( | MapFunctor & | f | ) |
Replaces each element m[i][j] of the distributed matrix with f(m[i][j]).
f | The map functor, must be of type MMapFunctor. |
void msl::DMatrix< T >::mapInPlace | ( | const msl::Fct1< T, T, F > & | f | ) |
Replaces each element m[i][j] of the distributed matrix with f(m[i][j]). Note that this is a CPU only skeleton.
f | The map functor, must be a 'curried' function pointer. |
void msl::DMatrix< T >::mapInPlace | ( | T(*)(T) | f | ) |
Replaces each element m[i][j] of the distributed matrix with f(m[i][j]). Note that this is a CPU only skeleton.
f | The map function. |
DMatrix<R> msl::DMatrix< T >::mapStencil | ( | MapStencilFunctor & | f, |
T | neutral_value | ||
) |
Non-inplace variant of the mapStencil skeleton.
f | The mapStencil functor, must be of type MMapStencilFunctor. |
void msl::DMatrix< T >::mapStencilInPlace | ( | MapStencilFunctor & | f, |
T | neutral_value | ||
) |
Replaces each element m[i][j] of the distributed matrix with f(i, j, m). Note that the indices i and j and the local partition are passed to the functor.
f | The mapStencil functor, must be of type MMapStencilFunctor. |
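A stencil update of this shape can be sketched on a single, undistributed matrix. The sketch assumes that neutral_value is what a read outside the matrix bounds yields, which is an interpretation of the parameter name rather than something stated above; it does not model the exchange of border rows between processes. `PaddedView` and `mapStencilInPlaceLocal` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Read-only view of a row-major matrix that yields `neutral` outside its bounds.
template <typename T>
struct PaddedView {
    const std::vector<T>& data;
    int rows, cols;
    T neutral;
    T get(int i, int j) const {
        if (i < 0 || i >= rows || j < 0 || j >= cols) return neutral;
        return data[static_cast<std::size_t>(i) * cols + j];
    }
};

// Replaces m[i][j] with f(i, j, view), where view reads the pre-update values.
template <typename T, typename F>
void mapStencilInPlaceLocal(std::vector<T>& m, int rows, int cols, T neutral, F f) {
    assert(m.size() == static_cast<std::size_t>(rows) * cols);
    const std::vector<T> old = m;  // snapshot so updates do not affect neighbours
    const PaddedView<T> view{old, rows, cols, neutral};
    for (int i = 0; i < rows; ++i)
        for (int j = 0; j < cols; ++j)
            m[static_cast<std::size_t>(i) * cols + j] = f(i, j, view);
}
```

Taking a snapshot before the update is the key design choice: without it, later elements would read neighbours that have already been overwritten.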
void msl::DMatrix< T >::permutePartition | ( | const Fct2< int, int, int, F1 > & | newRow, |
const Fct2< int, int, int, F2 > & | newCol | ||
) |
Permutes the partitions of the distributed matrix according to the given functions newRow and newCol. Both functions must be bijective and return the new row/column index. Note that 0 <= newRow < blocksInCol and 0 <= newCol < blocksInRow.
newRow | The bijective function to calculate the new row index, must be a curried function pointer. |
newCol | The bijective function to calculate the new column index, must be a curried function pointer. |
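The bijectivity and range requirements can be made concrete with a sketch that permutes a grid of blocks held in one vector. It assumes that the block at (r, c) moves to (newRow(r, c), newCol(r, c)) and that blocks are stored in row-major order; `permuteBlocks` is a hypothetical helper and performs no communication.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Moves block (r, c) to (newRow(r, c), newCol(r, c)) on a grid with
// blocksInCol block rows and blocksInRow block columns.
template <typename Block, typename FR, typename FC>
std::vector<Block> permuteBlocks(const std::vector<Block>& grid, int blocksInCol,
                                 int blocksInRow, FR newRow, FC newCol) {
    assert(grid.size() == static_cast<std::size_t>(blocksInCol) * blocksInRow);
    std::vector<Block> out(grid.size());
    std::vector<bool> taken(grid.size(), false);
    for (int r = 0; r < blocksInCol; ++r) {
        for (int c = 0; c < blocksInRow; ++c) {
            const int nr = newRow(r, c);
            const int nc = newCol(r, c);
            assert(nr >= 0 && nr < blocksInCol && nc >= 0 && nc < blocksInRow);
            const std::size_t dst = static_cast<std::size_t>(nr) * blocksInRow + nc;
            assert(!taken[dst]);  // two sources on one target: mapping is not bijective
            taken[dst] = true;
            out[dst] = grid[static_cast<std::size_t>(r) * blocksInRow + c];
        }
    }
    return out;
}
```

The `taken` check shows why the mapping has to be bijective: otherwise one target block would receive two partitions and another none.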
void msl::DMatrix< T >::permutePartition | ( | int(*)(int, int) | f, |
int(*)(int, int) | g | ||
) |
Permutes the partitions of the distributed matrix according to the given functions f and g. Both functions must be bijective and return the new row/column index. Note that 0 <= f < blocksInCol and 0 <= g < blocksInRow.
f | The bijective function to calculate the new row index. |
g | The bijective function to calculate the new column index. |
void msl::DMatrix< T >::permutePartition | ( | int(*)(int, int) | f, |
const Fct2< int, int, int, F > & | g | ||
) |
Permutes the partitions of the distributed matrix according to the given functions f and g. Both functions must be bijective and return the new row/column index. Note that 0 <= f < blocksInCol and 0 <= g < blocksInRow.
f | The bijective function to calculate the new row index. |
g | The bijective function to calculate the new column index, must be a curried function pointer. |
void msl::DMatrix< T >::permutePartition | ( | const Fct2< int, int, int, F > & | f, |
int(*)(int, int) | g | ||
) |
Permutes the partitions of the distributed matrix according to the given functions f and g. Both functions must be bijective and return the new row/column index. Note that 0 <= f < blocksInCol and 0 <= g < blocksInRow.
f | The bijective function to calculate the new row index, must be a curried function pointer. |
g | The bijective function to calculate the new column index. |
void msl::DMatrix< T >::rotateCols | ( | const Fct1< int, int, F > & | f | ) |
Rotates the partitions of the distributed matrix cyclically in vertical direction. The number of steps depends on the given function f that calculates this number for each column. Negative numbers correspond to cyclic rotations upwards, positive numbers correspond to cyclic rotations downward.
f | The function to calculate the number of steps, must be a curried function pointer. |
void msl::DMatrix< T >::rotateCols | ( | int(*)(int) | f | ) |
Rotates the partitions of the distributed matrix cyclically in vertical direction. The number of steps depends on the given function f that calculates this number for each column. Negative numbers correspond to cyclic rotations upwards, positive numbers correspond to cyclic rotations downward.
f | The function to calculate the number of steps. |
void msl::DMatrix< T >::rotateCols | ( | int | rows | ) |
Rotates the partitions of the distributed matrix cyclically in vertical direction. The number of steps is determined by rows. Negative numbers correspond to cyclic rotations upwards, positive numbers correspond to cyclic rotations downward.
rows | The number of steps to rotate. |
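The sign convention and the cyclic wrap-around can be illustrated without MPI. The sketch below models the partitions as a grid of block ids in plain C++; `BlockGrid`, `wrap`, and `rotateColsSketch` are hypothetical names, and the sketch only mirrors the documented behaviour (positive steps move blocks downward, negative steps upward), not the library's communication code.

```cpp
#include <cassert>
#include <vector>

// One entry per partition; grid[r][c] is the id of the block at block row r, block column c.
using BlockGrid = std::vector<std::vector<int>>;

// Wraps an index into [0, n), including for negative values.
int wrap(int index, int n) {
  assert(n > 0);
  return ((index % n) + n) % n;
}

// Hypothetical stand-in for rotateCols(int): every block moves `steps`
// block rows down (positive) or up (negative), wrapping around cyclically.
BlockGrid rotateColsSketch(const BlockGrid& grid, int steps) {
  const int blocksInCol = static_cast<int>(grid.size());
  BlockGrid result(grid);
  for (int row = 0; row < blocksInCol; ++row) {
    result[wrap(row + steps, blocksInCol)] = grid[row];
  }
  return result;
}
```

For a grid with three block rows, one step downward moves the last block row to the top, and three steps return the original arrangement.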
void msl::DMatrix< T >::rotateRows | ( | const Fct1< int, int, F > & | f | ) |
Rotates the partitions of the distributed matrix cyclically in horizontal direction. The number of steps depends on the given function f that calculates this number for each row. Negative numbers correspond to cyclic rotations to the left, positive numbers correspond to cyclic rotations to the right.
f | The function to calculate the number of steps, must be a curried function pointer. |
void msl::DMatrix< T >::rotateRows | ( | int(*)(int) | f | ) |
Rotates the partitions of the distributed matrix cyclically in horizontal direction. The number of steps depends on the given function f that calculates this number for each row. Negative numbers correspond to cyclic rotations to the left, positive numbers correspond to cyclic rotations to the right.
f | The function to calculate the number of steps. |
void msl::DMatrix< T >::rotateRows | ( | int | cols | ) |
Rotates the partitions of the distributed matrix cyclically in horizontal direction. The number of steps is determined by cols. Negative numbers correspond to cyclic rotations to the left, positive numbers correspond to cyclic rotations to the right.
cols | The number of steps to rotate. |
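The function-based overloads of rotateRows let each block row rotate by a different number of steps. As an illustration, the sketch below applies the row skew known from the alignment step of Cannon's matrix multiplication, where block row i is rotated i steps to the left. It is plain C++ on a grid of block ids; `BlockGrid`, `rotateRowsSketch`, and `skewLeft` are hypothetical names and no Muesli calls are made.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One entry per partition; grid[r][c] is the id of the block at block row r, block column c.
using BlockGrid = std::vector<std::vector<int>>;

// Hypothetical stand-in for rotateRows(int (*)(int)): block row `row` is
// rotated by f(row) steps, to the right if positive, to the left if negative.
BlockGrid rotateRowsSketch(const BlockGrid& grid, int (*f)(int)) {
  assert(f != nullptr);
  BlockGrid result(grid);
  for (std::size_t row = 0; row < grid.size(); ++row) {
    const int blocksInRow = static_cast<int>(grid[row].size());
    if (blocksInRow == 0) continue;
    const int steps = f(static_cast<int>(row));
    for (int col = 0; col < blocksInRow; ++col) {
      // Cyclic target column, also correct for negative step counts.
      const int target = (((col + steps) % blocksInRow) + blocksInRow) % blocksInRow;
      result[row][target] = grid[row][col];
    }
  }
  return result;
}

// Step function for the skew: block row i moves i steps to the left.
int skewLeft(int row) { return -row; }
```

Applied to a 3 x 3 grid, block row 0 stays in place, block row 1 moves one step to the left, and block row 2 moves two steps to the left.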
void msl::DMatrix< T >::set | ( | int | row, |
int | col, | ||
const T & | v | ||
) |
Sets the element at the given global indices (row, col) to the given value v.
row | The global row index. |
col | The global column index. |
v | The new value. |
void msl::DMatrix< T >::setDistribution | ( | int | rows, |
int | cols | ||
) |
Switches the distribution scheme from copy distributed to distributed. Note that rows * cols = numProcesses must hold.
rows | The number of blocks per row. |
cols | The number of blocks per column. |
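The constraint rows * cols = numProcesses can be validated before switching the distribution. The sketch below is a hypothetical pre-check in plain C++; `BlockLayout` and `planDistribution` are not Muesli names. It additionally assumes that `rows` counts block rows, that `cols` counts block columns, and that the n x m matrix is split into equally sized blocks. None of these assumptions is stated on this page.

```cpp
#include <cassert>

// Size of the partition held by each process (hypothetical type).
struct BlockLayout {
  int localRows;  // matrix rows per partition
  int localCols;  // matrix columns per partition
};

// Hypothetical pre-check: returns true and fills `layout` if an n x m matrix
// can be split into rows x cols equally sized blocks, one per process.
bool planDistribution(int n, int m, int rows, int cols, int numProcesses,
                      BlockLayout& layout) {
  assert(n > 0 && m > 0 && numProcesses > 0);
  if (rows <= 0 || cols <= 0) return false;
  if (rows * cols != numProcesses) return false;     // documented constraint
  if (n % rows != 0 || m % cols != 0) return false;  // assumption: even split
  layout.localRows = n / rows;
  layout.localCols = m / cols;
  return true;
}
```

For example, an 8 x 6 matrix on six processes can be planned as 2 x 3 blocks of size 4 x 2, whereas a 2 x 2 block grid is rejected because it does not match the process count.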
void msl::DMatrix< T >::setGpuDistribution | ( | Distribution | dist | ) |
Set how the local partition is distributed among the GPUs. Current distribution schemes are: distributed, copy distributed.
dist | The GPU distribution scheme. |
void msl::DMatrix< T >::setLocal | ( | int | row, |
int | col, | ||
const T & | v | ||
) |
Sets the element at the given local indices (row, col) to the given value v.
row | The local row index. |
col | The local column index. |
v | The new value. |
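set takes global indices, whereas setLocal takes indices relative to the local partition. How the two relate depends on the distribution. The sketch below shows one possible mapping under the assumption of an even block distribution in which every partition has localRows x localCols elements; `LocalPosition` and `toLocal` are hypothetical names, not part of the DMatrix interface.

```cpp
#include <cassert>

// Where a global element lives (hypothetical type).
struct LocalPosition {
  int blockRow;  // block row of the partition that holds the element
  int blockCol;  // block column of the partition that holds the element
  int row;       // row index inside that partition
  int col;       // column index inside that partition
};

// Hypothetical mapping from global to local indices, assuming equally
// sized partitions of localRows x localCols elements.
LocalPosition toLocal(int globalRow, int globalCol, int localRows, int localCols) {
  assert(globalRow >= 0 && globalCol >= 0);
  assert(localRows > 0 && localCols > 0);
  return LocalPosition{globalRow / localRows, globalCol / localCols,
                       globalRow % localRows, globalCol % localCols};
}
```

With 4 x 2 partitions, the global element (5, 3) lies in the partition at block position (1, 1) and has the local indices (1, 1).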
void msl::DMatrix< T >::show | ( | const std::string & | descr = std::string() | ) |
Prints the distributed matrix to standard output. Optionally, the user may pass a description that will be printed with the output.
descr | The description string. |
std::vector<T*> msl::DMatrix< T >::upload | ( | bool | allocOnly = 0 | ) |
Manually upload the local partition to GPU memory.
allocOnly | Specifies whether GPU memory is only allocated, without actually uploading the data. |
DMatrix<R> msl::DMatrix< T >::zip | ( | DMatrix< T2 > & | b, |
ZipFunctor & | f | ||
) |
Non-inplace variant of the zip skeleton.
f | The zip functor, must be of type MZipFunctor. |
DMatrix<R> msl::DMatrix< T >::zip | ( | DMatrix< T2 > & | b, |
const Fct2< T, T2, R, F > & | f | ||
) |
Non-inplace variant of the zip skeleton. Note that this is a CPU only skeleton.
f | The zip functor, must be a 'curried' function pointer. |
DMatrix<R> msl::DMatrix< T >::zip | ( | DMatrix< T2 > & | b, |
R(*)(T, T2) | f | ||
) |
Non-inplace variant of the zip skeleton. Note that this is a CPU only skeleton.
f | The zip function. |
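The element-wise semantics of the non-inplace zip can be sketched on ordinary nested vectors. The code below is a single-process illustration in plain C++, not the Muesli implementation; `Matrix`, `zipSketch`, and `scale` are hypothetical names. It shows that the result type R may differ from both element types T and T2, and that neither input is modified.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

template <typename T>
using Matrix = std::vector<std::vector<T>>;

// Hypothetical stand-in for zip with a plain function pointer:
// result[i][j] = f(a[i][j], b[i][j]); a and b are left unchanged.
template <typename R, typename T, typename T2>
Matrix<R> zipSketch(const Matrix<T>& a, const Matrix<T2>& b, R (*f)(T, T2)) {
  assert(a.size() == b.size());
  Matrix<R> result(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    assert(a[i].size() == b[i].size());
    result[i].reserve(a[i].size());
    for (std::size_t j = 0; j < a[i].size(); ++j) {
      result[i].push_back(f(a[i][j], b[i][j]));
    }
  }
  return result;
}

// Example zip function: int and float inputs, double result.
double scale(int x, float y) { return x * static_cast<double>(y); }
```

Calling `zipSketch(a, b, scale)` with a `Matrix<int>` and a `Matrix<float>` yields a new `Matrix<double>`.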
DMatrix<R> msl::DMatrix< T >::zipIndex | ( | DMatrix< T2 > & | b, |
ZipIndexFunctor & | f | ||
) |
Non-inplace variant of the zipIndex skeleton.
f | The zipIndex functor, must be of type MZipIndexFunctor. |
DMatrix<R> msl::DMatrix< T >::zipIndex | ( | DMatrix< T2 > & | b, |
const Fct4< int, int, T, T2, R, F > & | f | ||
) |
Non-inplace variant of the zipIndex skeleton. Note that this is a CPU only skeleton.
f | The zipIndex functor, must be a 'curried' function pointer. |
DMatrix<R> msl::DMatrix< T >::zipIndex | ( | DMatrix< T2 > & | b, |
R(*)(int, int, T, T2) | f | ||
) |
Non-inplace variant of the zipIndex skeleton. Note that this is a CPU only skeleton.
f | The zipIndex function. |
void msl::DMatrix< T >::zipIndexInPlace | ( | DMatrix< T2 > & | b, |
ZipIndexFunctor & | f | ||
) |
Replaces each element m[i][j] of the distributed matrix with f(i, j, m[i][j], b[i][j]). Note that the indices are passed to the functor in addition to the elements themselves.
f | The zipIndex functor, must be of type MZipIndexFunctor. |
void msl::DMatrix< T >::zipIndexInPlace | ( | DMatrix< T2 > & | b, |
const Fct4< int, int, T, T2, T, F > & | f | ||
) |
Replaces each element m[i][j] of the distributed matrix with f(i, j, m[i][j], b[i][j]). Note that the indices are passed to the functor in addition to the elements themselves, and that this is a CPU only skeleton.
f | The zipIndex functor, must be a 'curried' function pointer. |
void msl::DMatrix< T >::zipIndexInPlace | ( | DMatrix< T2 > & | b, |
T(*)(int, int, T, T2) | f | ||
) |
Replaces each element m[i][j] of the distributed matrix with f(i, j, m[i][j], b[i][j]). Note that the indices are passed to the function in addition to the elements themselves, and that this is a CPU only skeleton.
f | The zipIndex function. |
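The replacement rule m[i][j] = f(i, j, m[i][j], b[i][j]) can be sketched on ordinary nested vectors. The code below is a single-process illustration in plain C++ with hypothetical names (`Matrix`, `zipIndexInPlaceSketch`, `addOnDiagonal`). Because there is only one partition in the sketch, the indices passed to f coincide with the positions in the vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

template <typename T>
using Matrix = std::vector<std::vector<T>>;

// Hypothetical stand-in for zipIndexInPlace with a plain function pointer:
// m[i][j] is overwritten with f(i, j, m[i][j], b[i][j]).
template <typename T, typename T2>
void zipIndexInPlaceSketch(Matrix<T>& m, const Matrix<T2>& b,
                           T (*f)(int, int, T, T2)) {
  assert(m.size() == b.size());
  for (std::size_t i = 0; i < m.size(); ++i) {
    assert(m[i].size() == b[i].size());
    for (std::size_t j = 0; j < m[i].size(); ++j) {
      m[i][j] = f(static_cast<int>(i), static_cast<int>(j), m[i][j], b[i][j]);
    }
  }
}

// Example zipIndex function: adds the element of b only on the main diagonal.
int addOnDiagonal(int i, int j, int x, int y) { return i == j ? x + y : x; }
```

The indices let the function treat elements differently by position; here only the diagonal entries of m are changed.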
void msl::DMatrix< T >::zipInPlace | ( | DMatrix< T2 > & | b, |
ZipFunctor & | f | ||
) |
Replaces each element m[i][j] of the distributed matrix with f(m[i][j], b[i][j]), where b is another distributed matrix of the same size.
f | The zip functor, must be of type MZipFunctor. |
void msl::DMatrix< T >::zipInPlace | ( | DMatrix< T2 > & | b, |
const Fct2< T, T2, T, F > & | f | ||
) |
Replaces each element m[i][j] of the distributed matrix with f(m[i][j], b[i][j]) with b being another distributed matrix of the same size. Note that this is a CPU only skeleton.
f | The zip functor, must be a 'curried' function pointer. |
void msl::DMatrix< T >::zipInPlace | ( | DMatrix< T2 > & | b, |
T(*)(T, T2) | f | ||
) |
Replaces each element m[i][j] of the distributed matrix with f(m[i][j], b[i][j]) with b being another distributed matrix of the same size. Note that this is a CPU only skeleton.
f | The zip function. |
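Unlike a plain function pointer, a functor can carry state. The sketch below illustrates this for the in-place rule m[i][j] = f(m[i][j], b[i][j]) on ordinary nested vectors. The interface of MZipFunctor is not shown on this page, so `ScaledAdd` is an ordinary function object used as a hypothetical stand-in; `Matrix` and `zipInPlaceSketch` are likewise illustrative names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

template <typename T>
using Matrix = std::vector<std::vector<T>>;

// Hypothetical stand-in for zipInPlace: m[i][j] is overwritten with
// f(m[i][j], b[i][j]); b must have the same size as m.
template <typename T, typename T2, typename F>
void zipInPlaceSketch(Matrix<T>& m, const Matrix<T2>& b, F f) {
  assert(m.size() == b.size());
  for (std::size_t i = 0; i < m.size(); ++i) {
    assert(m[i].size() == b[i].size());
    for (std::size_t j = 0; j < m[i].size(); ++j) {
      m[i][j] = f(m[i][j], b[i][j]);
    }
  }
}

// Plain function object with state (not an MZipFunctor): computes x + factor * y.
struct ScaledAdd {
  int factor;
  int operator()(int x, int y) const { return x + factor * y; }
};
```

Passing `ScaledAdd{10}` adds ten times each element of b to the corresponding element of m.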