Namespace msl is the main namespace of Muesli. More...
Namespaces | |
detail | |
Namespace detail contains internally used classes. Users of Muesli should not need to access the contents of this namespace directly. | |
Classes | |
class | ArgumentType |
Base class for argument types of functors. More... | |
class | DArray |
Class DArray represents a distributed array. More... | |
class | Distribution |
class | CopyDistribution |
class | BlockDistribution |
class | RowDistribution |
class | ColDistribution |
class | DMatrix |
Class DMatrix represents a distributed matrix. More... | |
class | Farm |
class | Final |
class | MMapFunctor |
Class MMapFunctor represents a functor for the map skeleton of the distributed matrix. More... | |
class | MMapIndexFunctor |
Class MMapIndexFunctor represents a functor for the mapIndex skeleton of the distributed matrix. More... | |
class | PLMatrix |
Class PLMatrix represents a padded local matrix (partition). It serves as input for the mapStencil skeleton and actually is a shallow copy that only stores the pointers to the data. The data itself is managed by the mapStencil skeleton. For the user, the only important part is the get function. More... | |
class | MMapStencilFunctor |
Class MMapStencilFunctor represents a functor for the mapStencil skeleton of the distributed matrix. More... | |
class | MZipFunctor |
Class MZipFunctor represents a functor for the zip skeleton of the distributed matrix. More... | |
class | MZipIndexFunctor |
Class MZipIndexFunctor represents a functor for the zipIndex skeleton of the distributed matrix. More... | |
class | MFoldFunctor |
Class MFoldFunctor represents a functor for the fold skeleton of the distributed matrix. More... | |
class | AMapFunctor |
Class AMapFunctor represents a functor for the map skeleton of the distributed array. More... | |
class | AMapIndexFunctor |
Class AMapIndexFunctor represents a functor for the mapIndex skeleton of the distributed array. More... | |
class | PLArray |
Class PLArray represents a padded local array (partition). It serves as input for the mapStencil skeleton and actually is a shallow copy that only stores the pointers to the data. The data itself is managed by the mapStencil skeleton. For the user, the only important part is the get function. More... | |
class | AMapStencilFunctor |
Class AMapStencilFunctor represents a functor for the mapStencil skeleton of the distributed array. More... | |
class | AZipFunctor |
Class AZipFunctor represents a functor for the zip skeleton of the distributed array. More... | |
class | AZipIndexFunctor |
Class AZipIndexFunctor represents a functor for the zipIndex skeleton of the distributed array. More... | |
class | AFoldFunctor |
Class AFoldFunctor represents a functor for the fold skeleton of the distributed array. More... | |
class | FarmFunctor |
Class FarmFunctor represents a functor for the farm skeleton. More... | |
class | Initial |
class | LArray |
Class LArray represents a shallow copy of class DArray. More... | |
class | LMatrix |
Class LMatrix represents a shallow copy of class DMatrix. More... | |
class | Muesli |
Class Muesli contains globally available variables that determine the properties (number of running processes, threads, etc.) of the Muesli application. More... | |
class | Pipe |
class | Rng |
Class Rng represents a pseudo random number generator that can be called by both the CPU and the GPU. Uses std::default_random_engine and std::uniform_real_distribution for the CPU side, and thrust::default_random_engine and thrust::uniform_real_distribution on the GPU side. More... | |
class | Timer |
Class Timer for timing purposes. More... | |
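The CPU-side combination named in the Rng description above (std::default_random_engine with std::uniform_real_distribution) can be sketched as follows. This is a minimal illustration only: the class name CpuRngSketch, its constructor parameters, and the call operator are hypothetical and are not taken from the Muesli sources, and the GPU side is omitted.

```cpp
#include <random>

// Hypothetical CPU-only stand-in; NOT the Muesli Rng class.
// It only shows how the two standard components fit together.
class CpuRngSketch {
public:
    // min/max bound the generated values; seed makes the sequence reproducible.
    CpuRngSketch(double min, double max, unsigned seed)
        : engine_(seed), dist_(min, max) {}

    // Draws the next pseudo random number from the configured range.
    double operator()() { return dist_(engine_); }

private:
    std::default_random_engine engine_;
    std::uniform_real_distribution<double> dist_;
};
```

Two instances constructed with the same seed and range produce the same sequence, which can be convenient for reproducible benchmark runs.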
Typedefs | |
typedef int | ProcessorNo |
Typedef for process numbers. | |
Enumerations | |
enum | Distribution { DIST, COPY } |
Enum Distribution to represent the distribution mode of distributed data structures. More... | |
Functions | |
void | initSkeletons (int argc, char **argv, bool debug_communication=0) |
Initializes Muesli. Needs to be called before any skeleton is used. | |
void | terminateSkeletons () |
Terminates Muesli. Needs to be called at the end of a Muesli application. | |
void | printv (const char *format,...) |
Wrapper for printf. Only the process with id 0 prints the given format string. | |
void | setNumThreads (int num_threads) |
Sets the number of CPU threads. More... | |
void | setNumRuns (int num_runs) |
Sets the number of runs for a benchmark application. More... | |
void | setNumGpus (int num_gpus) |
Sets the number of GPUs to be used by each process. More... | |
void | setThreadsPerBlock (int threads_per_block) |
Sets the number of threads per (one dimensional) block. Note that threads_per_block <= 1024. More... | |
void | setThreadsPerBlock (int tpbX, int tpbY) |
Sets the number of threads per (two dimensional) block. Note that tpbX * tpbY <= 1024. More... | |
void | setNumConcurrentKernels (int num_kernels) |
Sets the number of concurrent kernels per GPU. Only for the farm skeleton. More... | |
void | setTaskGroupSize (int size) |
Sets the task group size (i.e. size of sets to be processed) for the heterogeneous farm skeleton. More... | |
void | syncStreams () |
Synchronizes the CUDA streams. | |
void | startTiming () |
Starts timing. | |
void | splitTime (int run) |
Prints the time elapsed since last split time. | |
double | stopTiming () |
Ends timing. More... | |
bool | isRootProcess () |
Checks whether this is the process with id 0. More... | |
void | setFarmStatistics (bool val) |
Switches on or off (depending on the value of val) collecting farm statistics. | |
MSL_USERFUNC size_t | getUniqueID () |
Returns a unique thread id. More... | |
template<typename T > | |
T | getNegativeInfinity () |
Returns the value which represents the negative infinity for the given type T. In case the given type has no representation for infinity, the minimum value is returned. More... | |
template<typename T > | |
T | getPositiveInfinity () |
Returns the value which represents the positive infinity for the given type T. In case the given type has no representation for infinity, the maximum value is returned. More... | |
void | MSL_SendTag (int destination, int tag) |
Sends a message without content. Mainly used for control messages such as stop messages. More... | |
void | MSL_ReceiveTag (int source, int tag) |
Receives a message without content. Mainly used for control messages such as stop messages. More... | |
template<typename T > | |
void | MSL_Send (int destination, T *send_buffer, size_t size, int tag=MYTAG) |
Sends a buffer of type T to process destination. More... | |
template<typename T > | |
void | MSL_ISend (int destination, T *send_buffer, MPI_Request &req, size_t size, int tag=MYTAG) |
Sends (non-blocking) a buffer of type T to process destination. More... | |
template<typename T > | |
void | MSL_Recv (int source, T *recv_buffer, size_t size, int tag=MYTAG) |
Receives a buffer of type T from process source. More... | |
template<typename T > | |
void | MSL_Recv (int source, T *recv_buffer, MPI_Status &stat, size_t size, int tag=MYTAG) |
Receives a buffer of type T from process source. More... | |
template<typename T > | |
void | MSL_IRecv (int source, T *recv_buffer, MPI_Request &req, size_t size, int tag=MYTAG) |
Receives (non-blocking) a buffer of type T from process source. More... | |
template<typename T > | |
void | MSL_SendReceive (int destination, T *send_buffer, T *recv_buffer, size_t size=1) |
template<typename T > | |
void | broadcast (T *buffer, int *const ids, int np, int idRoot, size_t count) |
Implementation of the MPI_Bcast routine. Only the processes in ids participate. More... | |
template<typename T > | |
void | allgather (T *send_buffer, T *recv_buffer, int *const ids, int np, size_t count) |
Implementation of the MPI_Allgather routine. Only the processes in ids participate. More... | |
template<typename T > | |
void | allgather (T *send_buffer, T *recv_buffer, size_t count) |
Wrapper for the MPI_Allgather routine. Every process in MPI_COMM_WORLD participates. More... | |
template<typename T > | |
void | MSL_Broadcast (int source, T *buffer, int size) |
Wrapper for the MPI_Bcast routine. Every process in MPI_COMM_WORLD participates. More... | |
template<typename T > | |
void | MSL_Send (int destination, std::vector< T > &send_buffer, int tag=MYTAG) |
Sends a std::vector of type T to process destination. More... | |
template<typename T > | |
void | MSL_Recv (int source, std::vector< T > &recv_buffer, int tag=MYTAG) |
Receives a std::vector of type T from process source. More... | |
void | fail_exit () |
Quits the program on failure. Must be used after initSkeletons(). | |
void | throws (const detail::Exception &e) |
Throws an Exception. More... | |
template<typename C1 , typename C2 > | |
C1 | proj1_2 (C1 a, C2 b) |
template<typename C1 , typename C2 > | |
C2 | proj2_2 (C1 a, C2 b) |
template<typename F > | |
int | auxRotateRows (const Fct1< int, int, F > &f, int blocks, int row, int col) |
template<typename F > | |
int | auxRotateCols (const Fct1< int, int, F > &f, int blocks, int row, int col) |
template<typename T > | |
void | show (T *a, int size) |
enum msl::Distribution |
Enum Distribution represents the distribution mode of a distributed data structure. There are currently two modes: distributed (DIST) and copy distributed (COPY). In distributed mode, each process/GPU stores only a partition of the data structure. In copy distributed mode, each process/GPU stores the entire data structure.
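The practical difference between the two modes is how many elements each process holds. The sketch below illustrates this under a simplifying assumption that the element count is evenly divisible by the number of processes. DistMode and localSize are hypothetical names introduced for illustration; they are not part of Muesli.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for msl::Distribution; not the Muesli definition.
enum class DistMode { DIST, COPY };

// Number of elements a single process stores for a data structure of n
// elements shared among np processes.
// Assumption: n is divisible by np (even partitioning).
std::size_t localSize(std::size_t n, std::size_t np, DistMode mode) {
    assert(np > 0 && n % np == 0);
    // DIST: each process holds one partition. COPY: each holds everything.
    return mode == DistMode::DIST ? n / np : n;
}
```

For example, with 1000 elements and 4 processes, a process would hold 250 elements in distributed mode and all 1000 in copy distributed mode.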
void msl::allgather | ( | T * | send_buffer, |
T * | recv_buffer, | ||
int *const | ids, | ||
int | np, | ||
size_t | count | ||
) |
Implementation of the MPI_Allgather routine. Only the processes in ids participate.
send_buffer | Send buffer. |
recv_buffer | Receive buffer. |
ids | The process ids that participate in the allgather. |
np | Number of processes that participate. |
count | Number of elements in send_buffer. |
T | Type of the message. |
void msl::allgather | ( | T * | send_buffer, |
T * | recv_buffer, | ||
size_t | count | ||
) |
Wrapper for the MPI_Allgather routine. Every process in MPI_COMM_WORLD participates.
send_buffer | Send buffer. |
recv_buffer | Receive buffer. |
count | Number of elements in send_buffer. |
T | Type of the message. |
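Because count refers to the send buffer, it helps to see what the receive buffer ends up containing. The following in-memory model shows the data movement of an allgather under standard MPI_Allgather semantics: every process contributes count elements, and every process receives all contributions concatenated in rank order, so the receive buffer must hold np * count elements. The function simulateAllgather is a hypothetical illustration that uses no MPI calls; it is not how Muesli implements the routine.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// In-memory model of allgather (no MPI involved).
// sendBuffers[p] is the send buffer of process p and holds `count` elements.
// Returns the receive buffer that every participating process ends up with:
// all contributions concatenated in rank order.
template <typename T>
std::vector<T> simulateAllgather(const std::vector<std::vector<T>>& sendBuffers,
                                 std::size_t count) {
    std::vector<T> recv;
    recv.reserve(sendBuffers.size() * count);
    for (const auto& send : sendBuffers) {
        assert(send.size() == count);  // every process sends the same amount
        recv.insert(recv.end(), send.begin(), send.end());
    }
    return recv;
}
```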
void msl::broadcast | ( | T * | buffer, |
int *const | ids, | ||
int | np, | ||
int | idRoot, | ||
size_t | count | ||
) |
Implementation of the MPI_Bcast routine. Only the processes in ids participate.
buffer | Message buffer. |
ids | The process ids that participate in broadcasting. |
np | Number of processes that participate. |
idRoot | Root process id of the broadcast. |
count | Number of elements in buffer. |
T | Type of the message. |
T msl::getNegativeInfinity | ( | ) |
Returns the value which represents the negative infinity for the given type T. In case the given type has no representation for infinity, the minimum value is returned.
T | The type for which negative infinity shall be determined. |
T msl::getPositiveInfinity | ( | ) |
Returns the value which represents the positive infinity for the given type T. In case the given type has no representation for infinity, the maximum value is returned.
T | The type for which positive infinity shall be determined. |
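The fallback behaviour described for getNegativeInfinity and getPositiveInfinity can be approximated with std::numeric_limits. The sketch below is one possible way to obtain that behaviour, not the Muesli implementation; the function names negativeInfinitySketch and positiveInfinitySketch are hypothetical.

```cpp
#include <limits>

// Returns -infinity if T can represent it, otherwise the lowest finite value.
// lowest() is used instead of min() because min() is the smallest positive
// value for floating point types; for integer types both are identical.
template <typename T>
T negativeInfinitySketch() {
    return std::numeric_limits<T>::has_infinity
               ? -std::numeric_limits<T>::infinity()
               : std::numeric_limits<T>::lowest();
}

// Returns +infinity if T can represent it, otherwise the maximum value.
template <typename T>
T positiveInfinitySketch() {
    return std::numeric_limits<T>::has_infinity
               ? std::numeric_limits<T>::infinity()
               : std::numeric_limits<T>::max();
}
```

Values like these are typically useful as neutral start values for min or max reductions, for instance in a fold.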
MSL_USERFUNC size_t msl::getUniqueID | ( | ) |
inline
Returns a unique thread id.
bool msl::isRootProcess | ( | ) |
Checks whether this is the process with id 0.
void msl::MSL_Broadcast | ( | int | source, |
T * | buffer, | ||
int | size | ||
) |
inline
Wrapper for the MPI_Bcast routine. Every process in MPI_COMM_WORLD participates.
source | Root process id of the broadcast. |
buffer | The message buffer. |
size | Number of elements to broadcast. |
T | Type of the message. |
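void msl::MSL_IRecv | ( | int | source, |
T * | recv_buffer, | ||
MPI_Request & | req, | ||
size_t | size, | ||
int | tag = MYTAG | ||
) |
inline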
Receives (non-blocking) a buffer of type T from process source.
source | The source process id. |
recv_buffer | The receive buffer. |
req | MPI request to check for completion. |
size | Size (number of elements) of the message. |
tag | Message tag. |
T | Type of the message. |
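void msl::MSL_ISend | ( | int | destination, |
T * | send_buffer, | ||
MPI_Request & | req, | ||
size_t | size, | ||
int | tag = MYTAG | ||
) |
inline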
Sends (non-blocking) a buffer of type T to process destination.
destination | The destination process id. |
send_buffer | The send buffer. |
req | MPI request to check for completion. |
size | Size (number of elements) of the message. |
tag | Message tag. |
T | Type of the message. |
void msl::MSL_ReceiveTag | ( | int | source, |
int | tag | ||
) |
inline
Receives a message without content. Mainly used for control messages such as stop messages.
source | The source process id of the message. |
tag | Message tag. |
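void msl::MSL_Recv | ( | int | source, |
T * | recv_buffer, | ||
size_t | size, | ||
int | tag = MYTAG | ||
) |
inline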
Receives a buffer of type T from process source.
source | The source process id. |
recv_buffer | The receive buffer. |
size | Size (number of elements) of the message. |
tag | Message tag. |
T | Type of the message. |
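void msl::MSL_Recv | ( | int | source, |
T * | recv_buffer, | ||
MPI_Status & | stat, | ||
size_t | size, | ||
int | tag = MYTAG | ||
) |
inline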
Receives a buffer of type T from process source.
source | The source process id. |
recv_buffer | The receive buffer. |
stat | MPI status to check for completion. |
size | Size (number of elements) of the message. |
tag | Message tag. |
T | Type of the message. |
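void msl::MSL_Recv | ( | int | source, |
std::vector< T > & | recv_buffer, | ||
int | tag = MYTAG | ||
) |
inline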
Receives a std::vector of type T from process source.
source | The source process id. |
recv_buffer | The receive buffer. |
tag | Message tag. |
T | Type of the message. |
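void msl::MSL_Send | ( | int | destination, |
T * | send_buffer, | ||
size_t | size, | ||
int | tag = MYTAG | ||
) |
inline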
Sends a buffer of type T to process destination.
destination | The destination process id. |
send_buffer | The send buffer. |
size | Size (number of elements) of the message. |
tag | Message tag. |
T | Type of the message. |
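void msl::MSL_Send | ( | int | destination, |
std::vector< T > & | send_buffer, | ||
int | tag = MYTAG | ||
) |
inline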
Sends a std::vector of type T to process destination.
destination | The destination process id. |
send_buffer | The send buffer. |
tag | Message tag. |
T | Type of the message. |
void msl::MSL_SendTag | ( | int | destination, |
int | tag | ||
) |
inline
Sends a message without content. Mainly used for control messages such as stop messages.
destination | The destination process id of the message. |
tag | Message tag. |
void msl::setNumConcurrentKernels | ( | int | num_kernels | ) |
Sets the number of concurrent kernels per GPU. Only for the farm skeleton.
num_kernels | The number of concurrent kernels per GPU. |
void msl::setNumGpus | ( | int | num_gpus | ) |
Sets the number of GPUs to be used by each process.
num_gpus | The number of GPUs to be used by each process. |
void msl::setNumRuns | ( | int | num_runs | ) |
Sets the number of runs for a benchmark application.
num_runs | The number of runs for a benchmark application. |
void msl::setNumThreads | ( | int | num_threads | ) |
Sets the number of CPU threads.
num_threads | The number of CPU threads. |
void msl::setTaskGroupSize | ( | int | size | ) |
Sets the task group size (i.e. size of sets to be processed) for the heterogeneous farm skeleton.
size | The task group size. |
void msl::setThreadsPerBlock | ( | int | threads_per_block | ) |
Sets the number of threads per (one dimensional) block. Note that threads_per_block <= 1024.
threads_per_block | The number of threads per block. |
void msl::setThreadsPerBlock | ( | int | tpbX, |
int | tpbY | ||
) |
Sets the number of threads per (two dimensional) block. Note that tpbX * tpbY <= 1024.
tpbX | The number of threads in x dimension. |
tpbY | The number of threads in y dimension. |
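The constraint tpbX * tpbY <= 1024 can be checked before calling setThreadsPerBlock. The helper below is a hypothetical sketch and not part of Muesli; the limit of 1024 is the one stated in the description above, and whether Muesli itself validates the arguments is not specified here.

```cpp
// Upper bound on threads per block as stated in the documentation above.
constexpr int kMaxThreadsPerBlock = 1024;

// Returns true if a two-dimensional block of tpbX x tpbY threads respects
// the limit. The division avoids integer overflow for large inputs.
constexpr bool isValidBlockShape(int tpbX, int tpbY) {
    return tpbX > 0 && tpbY > 0 && tpbX <= kMaxThreadsPerBlock / tpbY;
}
```

Under this check, a 32 x 32 block (1024 threads) is accepted, while a 64 x 32 block (2048 threads) is rejected.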
double msl::stopTiming | ( | ) |
Ends timing.
void msl::throws | ( | const detail::Exception & | e | ) |
Throws an Exception.
e | The exception to throw. |