This page contains a list of software developed as a part of the ARGO project.

Argobots

Argobots is a lightweight, low-level threading and tasking framework to support massive on-node parallelism. It provides high-level runtimes and domain-specific libraries with threading and tasking mechanisms so that they can build their solutions efficiently. Argobots supports two kinds of work units called user-level threads and tasklets. It also exposes hardware resources (e.g., cores or hardware threads) as execution streams (ESs) and provides mapping mechanisms between work units and ESs.

Software can be found at : http://www.argobots.org

For more information, contact Sangmin Seo (sseo@anl.gov)

Argobots-aware MPI

As the number of cores in many-core processors keeps increasing, MPI+X is becoming a promising programming model for large-scale SMP clusters. It has the potential to exploit both intra-node and inter-node parallelism with appropriate execution units and granularity. The Argobots-aware MPI runtime takes advantage of Argobots to provide asynchrony and overlap to MPI. The idea is to issue multiple blocking MPI calls at the same time in multiple user-level threads (ULTs). If an MPI call blocks in ULT A, the MPI runtime detects it and context switches to another ULT to make progress on other blocking calls. Once the other ULTs have finished their execution, they switch back to ULT A so that it can continue. In this way, the CPU stays busy doing useful work instead of waiting on the blocking call. However, the two-level parallelism of MPI+X introduces new problems, such as lock contention between threads inside MPI. To avoid unnecessary locks between execution units, the Argobots-aware MPI runtime explicitly controls context switches between ULTs and execution streams (ESs). When switching between ULTs in the same ES, no lock is needed.
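
The switching behavior described above can be illustrated with a toy model. The sketch below is not MPICH or Argobots code: it uses Python generators to stand in for ULTs, and the names `blocking_call` and `run_execution_stream` are hypothetical. Each "blocking call" yields whenever it is still pending, and a single-threaded loop (one execution stream) switches to the next ULT, so no lock is involved.

```python
from collections import deque

def blocking_call(name, polls_needed, log):
    """Model a ULT that issues one blocking call (hypothetical helper).

    The call completes after `polls_needed` progress checks. While it is
    still pending, the ULT yields so that another ULT can run.
    """
    for _ in range(polls_needed):
        log.append(f"{name}: blocked, yielding")
        yield  # context switch back to the scheduler
    log.append(f"{name}: completed")

def run_execution_stream(ults):
    """Round-robin over ULTs on one execution stream (a single OS thread)."""
    ready = deque(ults)
    while ready:
        ult = ready.popleft()
        try:
            next(ult)          # run until the ULT yields or finishes
            ready.append(ult)  # still pending: requeue it
        except StopIteration:
            pass               # the ULT finished its call

log = []
run_execution_stream([
    blocking_call("ULT A", 2, log),
    blocking_call("ULT B", 1, log),
])
print("\n".join(log))
```

In this run ULT B completes while ULT A is still blocked, and control then returns to ULT A, which mirrors the overlap the paragraph describes.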

Software can be found at : https://wiki.mpich.org/mpich/index.php/MPI%2BArgobots

For more information, contact Abdelhalim Amer (aamer@anl.gov), Sangmin Seo (sseo@anl.gov)

BEACON

The Backplane for Event And Control Notification (BEACON) provides interfaces for gathering event data, based on which components can take appropriate action. BEACON is a lightweight framework that provides interfaces for sharing event information, as well as other supplementary services, in the ARGO system.

Software can be found at : http://git.mcs.anl.gov/argo/beacon-backplane.git

For more information, contact Rinku Gupta (rgupta@mcs.anl.gov)

BOLT

BOLT is a recursive acronym that stands for “BOLT is OpenMP over Lightweight Threads”. BOLT targets a high-performing OpenMP implementation, especially specialized for fine-grain parallelism. Unlike other OpenMP implementations, BOLT utilizes a lightweight threading model for its underlying threading mechanism. It currently adopts Argobots, a new holistic, low-level threading and tasking runtime, in order to overcome shortcomings of conventional OS-level threads. Its runtime is based on the OpenMP runtime in LLVM.

Software can be found at :  http://www.bolt-omp.org

For more information, contact Sangmin Seo (sseo@anl.gov)

CilkBots

CilkBots (Cilk over Argobots) extends the MIT Cilk runtime to support interoperable execution and dynamic resource management using Argobots' lightweight infrastructure. CilkBots also exploits efficient scheduling in Argobots to optimize cache locality at runtime, achieving performance comparable to that of state-of-the-art compile-time polyhedral optimizations.

Software: The release version is under development. For access to the code, please contact Sriram K (sriram@pnnl.gov)

For more information, contact Sriram K (sriram@pnnl.gov)

DI-MMAP

DI-MMAP is an OS/R subsystem optimized to efficiently integrate node-local, block-accessible NVRAM into the memory hierarchy. DI-MMAP provides a transparent DRAM cache for memory-mapped NVRAM regions that supports high concurrency and large out-of-core data structures. Key features include a user-visible introspection interface that allows applications to make runtime decisions based on data residency, and support for non-native "superpage" sizes that optimize data transport.

Software can be found at :  https://bitbucket.org/vanessen/di-mmap

For more information, contact Brian C. Van Essen (vanessen1@llnl.gov)

EXPOSE

EXPOSE (Exascale Performance and Observability Environment) is a framework for developing in situ data introspection and analysis services. EXPOSE interfaces with BEACON to retrieve data on subscribed topics of interest and to publish results, control, and other data back to consumers. Different services can be developed for specific purposes within the exascale environment, depending on the requirements of the problem.
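
The subscribe-and-publish pattern described above can be sketched as follows. This is a generic, in-process illustration and not the BEACON or EXPOSE API: the `Backplane` and `AveragingService` classes and the topic names are hypothetical.

```python
from collections import defaultdict

class Backplane:
    """Minimal topic-based event bus (hypothetical stand-in for a backplane)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, payload):
        # Deliver the event to every callback registered for this topic.
        for callback in list(self._subscribers.get(topic, ())):
            callback(topic, payload)

class AveragingService:
    """Analysis service: consumes one topic and publishes a running mean."""

    def __init__(self, bus, source_topic, result_topic):
        self.bus = bus
        self.result_topic = result_topic
        self.samples = []
        bus.subscribe(source_topic, self.on_event)

    def on_event(self, topic, value):
        self.samples.append(value)
        mean = sum(self.samples) / len(self.samples)
        self.bus.publish(self.result_topic, mean)

bus = Backplane()
results = []
# A consumer of the analysis results.
bus.subscribe("analysis.power_mean", lambda topic, value: results.append(value))
service = AveragingService(bus, "node.power", "analysis.power_mean")

for watts in (100.0, 110.0, 120.0):  # invented sample readings
    bus.publish("node.power", watts)
print(results)
```

The service never talks to the producer or the consumer directly; both sides only share topic names, which is what lets different services be added for different purposes.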

Software can be found at :  http://tau.uoregon.edu/

For more information, contact Sameer Shende (sameer@cs.uoregon.edu)

Libmsr

Libmsr is an API designed to simplify the use of model-specific registers (MSRs) on Intel platforms. It provides a simple interface through function calls and data structures that enable easy access without the need to understand the complex internal structure of MSRs and CSRs. This gives users easy access to hardware performance counters and to more advanced functionality, such as Intel's RAPL and thermal management features. Libmsr should be used in conjunction with the MSR-SAFE kernel module to prevent system instability.
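
To give a sense of the register-level detail that such a library hides, the sketch below decodes the unit fields of Intel's RAPL power-unit register by hand. It is not Libmsr code and does not read any hardware. It assumes the field layout that Intel documents for this register (power units in bits 3:0, energy units in bits 12:8, time units in bits 19:16, each an exponent of 1/2), and the raw values are made up for illustration.

```python
def bits(value, hi, lo):
    """Extract the inclusive bit field value[hi:lo] from an integer."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)

def decode_rapl_units(raw):
    """Decode a raw power-unit register value into physical units.

    Each field is an exponent n, and the unit is 1 / 2**n.
    """
    return {
        "power_w": 1.0 / (1 << bits(raw, 3, 0)),
        "energy_j": 1.0 / (1 << bits(raw, 12, 8)),
        "time_s": 1.0 / (1 << bits(raw, 19, 16)),
    }

# Illustrative raw register value (an assumption, not read from hardware).
units = decode_rapl_units(0x000A0E03)

# Convert an invented raw energy-counter reading into joules.
raw_energy_counter = 1_000_000
energy_joules = raw_energy_counter * units["energy_j"]
print(units, energy_joules)
```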

Software can be found at : https://github.com/LLNL/libmsr

For more information, contact Barry Rountree (rountree4@llnl.gov)

LEO

The LEO software deals with Pareto-optimal power/performance tradeoffs at the node level.
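
As a reminder of what Pareto-optimal means in this setting, the sketch below filters a list of node configurations down to those that no other configuration beats on both power and performance. It is not LEO's algorithm: the function name is hypothetical and the configurations and numbers are invented.

```python
def pareto_frontier(configs):
    """Return the configurations that are not dominated.

    A configuration is dominated if another one uses no more power and
    delivers at least as much performance, and is strictly better in one.
    Each entry is a tuple (name, power_watts, performance).
    """
    frontier = []
    for name, power, perf in configs:
        dominated = any(
            p2 <= power and f2 >= perf and (p2 < power or f2 > perf)
            for _, p2, f2 in configs
        )
        if not dominated:
            frontier.append((name, power, perf))
    return sorted(frontier, key=lambda c: c[1])  # order by power

# Invented measurements: (configuration, watts, relative performance).
configs = [
    ("4 cores @ 1.2 GHz", 40.0, 1.0),
    ("4 cores @ 2.4 GHz", 70.0, 1.8),
    ("8 cores @ 1.2 GHz", 75.0, 1.7),
    ("8 cores @ 2.4 GHz", 120.0, 3.0),
]
frontier = pareto_frontier(configs)
for entry in frontier:
    print(entry)
```

Here "8 cores @ 1.2 GHz" drops out because "4 cores @ 2.4 GHz" is both cheaper in power and faster.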

Software can be found at :  https://leo.cs.uchicago.edu

For more information, contact Hank Hoffmann (hankhoffmann@cs.uchicago.edu)

MSR-SAFE

MSR-SAFE is a kernel module that can replace the original MSR Linux kernel module. It provides the same access to MSRs, but with additional access-control features: only explicitly whitelisted MSRs can be accessed, which prevents accidental or targeted system corruption.
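
The whitelisting idea can be sketched as a simple lookup performed before every access. This is a user-space illustration, not the kernel module's code or file format. The per-register write mask, the function names, and the mask values are assumptions of this sketch, and the addresses are used only as examples.

```python
# Hypothetical whitelist: MSR address -> mask of bits that may be changed.
# A mask of 0 makes the register read-only. Mask values are invented.
WHITELIST = {
    0x611: 0x0,         # example read-only entry
    0x610: 0x00FFFFFF,  # example entry, writable in the low 24 bits only
}

def can_read(addr):
    """Reads are allowed only for registers that appear in the whitelist."""
    return addr in WHITELIST

def checked_write(addr, old_value, new_value):
    """Return the value to store, or raise if the write is not permitted.

    A write is rejected when the register is not whitelisted or when it
    would change any bit outside the register's write mask.
    """
    if addr not in WHITELIST:
        raise PermissionError(f"MSR {addr:#x} is not whitelisted")
    changed_bits = old_value ^ new_value
    if changed_bits & ~WHITELIST[addr]:
        raise PermissionError(f"write to MSR {addr:#x} touches protected bits")
    return new_value

print(can_read(0x611), can_read(0x1A0))
print(hex(checked_write(0x610, 0x0, 0x1234)))
```

Rejecting a disallowed write outright is one possible policy. Silently masking off the protected bits would be another, and this sketch does not claim to match what the module does.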

Software can be found at : https://github.com/LLNL/msr-safe

For more information, contact Barry Rountree (rountree4@llnl.gov)

NodeOS

Argo NodeOS is a Linux-based operating system running on each node of an Argo machine. It features improvements in terms of resource partitioning, memory management, and scheduling.

Software can be found at : http://git.mcs.anl.gov/argo/nodeos-sources.git

The master branch of the repository is limited to auxiliary utilities; kernel sources are available from other branches (consult README.md for details).

For more information, contact Kamil Iskra (iskra@mcs.anl.gov)

PaRSEC

PaRSEC is a generic framework for architecture-aware scheduling and management of micro-tasks on distributed many-core heterogeneous architectures. Applications are expressed as a Directed Acyclic Graph (DAG) of tasks with labeled edges designating data dependencies. The graph description is provided by specialized Domain Specific Languages (DSLs), allowing highly optimized and memory-compact, possibly problem-size-independent, formats that can be queried on demand to discover data dependencies in a totally distributed fashion. PaRSEC handles data transfers between nodes, coherency and consistency between multiple data copies, and support for architectural heterogeneity. PaRSEC provides the developer with a portable view of the execution platform while allowing straightforward use of local resources: it assigns computation threads to cores, overlaps communication and computation, and uses a dynamic, fully distributed scheduler based on architectural features such as NUMA nodes and algorithmic features such as data reuse.

The PaRSEC framework includes libraries, a runtime system, and development tools to help application developers tackle the difficult task of porting their applications to highly heterogeneous and diverse environments.

In ARGO, PaRSEC implements the Pluggable Task Graph Engine, using its various DSLs to dynamically submit tasks and schedule them over the physical resources shared with the other programming models implemented over Argobots.
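
The task-graph model described above can be illustrated with a small dependency-driven executor. This is not PaRSEC code and does not use its DSLs. It relies on `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later), and the task names and the `run_dag` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks whose output data it consumes.
deps = {
    "factor_A": set(),
    "solve_B": {"factor_A"},
    "solve_C": {"factor_A"},
    "update_D": {"solve_B", "solve_C"},
}

def run_dag(deps, execute):
    """Run tasks as their dependencies complete.

    Returns the list of "waves": tasks in the same wave have no
    dependencies on each other and could run concurrently.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks whose inputs are available
        for task in ready:
            execute(task)
            sorter.done(task)  # releases the tasks that depend on this one
        waves.append(ready)
    return waves

executed = []
waves = run_dag(deps, executed.append)
print(waves)
```

This sketch runs every ready task in one place and in name order. A real runtime would instead hand ready tasks to a scheduler that takes data location and core assignment into account.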

Software can be found at :  http://icl.cs.utk.edu/parsec/

For more information, contact George Bosilca (bosilca@icl.utk.edu)

POET/Bard

Given a model of power/performance tradeoffs, the POET/Bard software meets performance or power constraints optimally.
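
The two constraint types can be pictured with a simple selection over a table of configurations. The sketch below is a simplification and not the POET/Bard algorithm: it only picks a single static configuration, the function names are hypothetical, and the model values are invented.

```python
# Invented model: (configuration, watts, relative performance).
MODEL = [
    ("4 cores @ 1.2 GHz", 40.0, 1.0),
    ("4 cores @ 2.4 GHz", 70.0, 1.8),
    ("8 cores @ 1.2 GHz", 75.0, 1.7),
    ("8 cores @ 2.4 GHz", 120.0, 3.0),
]

def min_power_for_performance(model, target_perf):
    """Lowest-power configuration that still meets a performance target."""
    feasible = [c for c in model if c[2] >= target_perf]
    return min(feasible, key=lambda c: c[1]) if feasible else None

def max_performance_under_power(model, power_cap):
    """Fastest configuration that stays within a power cap."""
    feasible = [c for c in model if c[1] <= power_cap]
    return max(feasible, key=lambda c: c[2]) if feasible else None

perf_choice = min_power_for_performance(MODEL, target_perf=2.0)
power_choice = max_performance_under_power(MODEL, power_cap=80.0)
print(perf_choice)
print(power_choice)
```

Both functions return `None` when no configuration in the model satisfies the constraint.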

Software can be found at :   https://poet.cs.uchicago.edu

For more information, contact Hank Hoffmann (hankhoffmann@cs.uchicago.edu)

PuPIL

PuPIL maximizes performance under a node-level power cap with no prior knowledge of an application.

Software can be found at : https://github.com/PUPiL2015/PUPIL

For more information, contact Hank Hoffmann (hankhoffmann@cs.uchicago.edu)

TASCEL

TASCEL (pronounced "tassel") is a framework to study the design of algorithms associated with finer-grained concurrency abstractions. It uses an active message framework built on MPI. The active message framework enables the design of supporting algorithms that are concurrent with ongoing execution (e.g., load balancing or fault recovery concurrent with application execution). TASCEL supports various threading modes and progress semantics, together with SPMD and non-SPMD execution. TASCEL has been used to design algorithms to trace work stealing schedulers, perform incremental load balancing through retentive work stealing, and support localized fault recovery. TASCEL is also being used in the design of application components in bioinformatics and computational chemistry.

TASCEL over Argobots demonstrates interoperability and flexible resource allocation while ensuring load-balanced execution to completion for the task-parallel subcomputation. This forms the basis for the effective design of communication overlap strategies in applications using TASCEL.
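
For readers unfamiliar with work stealing, the load-balancing technique mentioned above, the sketch below shows the basic idea in a sequential simulation. It is not TASCEL code and uses no MPI or threads. The `Worker` class and `run` function are hypothetical, and the retentive variant is not modeled.

```python
from collections import deque

class Worker:
    """A worker with its own task queue (hypothetical illustration)."""

    def __init__(self, name, tasks=()):
        self.name = name
        self.tasks = deque(tasks)
        self.executed = []

    def pop_local(self):
        """The owner takes the most recently added task from its own queue."""
        return self.tasks.pop() if self.tasks else None

    def steal_from(self, victim):
        """An idle worker takes the oldest task from another worker's queue."""
        return victim.tasks.popleft() if victim.tasks else None

def run(workers):
    """Step the workers in turn until no tasks remain anywhere."""
    while any(w.tasks for w in workers):
        for w in workers:
            task = w.pop_local()
            if task is None:
                # Local queue is empty: try to steal from the other workers.
                for victim in workers:
                    if victim is not w:
                        task = w.steal_from(victim)
                        if task is not None:
                            break
            if task is not None:
                w.executed.append(task)

a = Worker("A", ["t1", "t2", "t3", "t4"])
b = Worker("B")  # starts with no work
run([a, b])
print(a.executed, b.executed)
```

Even though worker B starts empty, it ends up executing half of the tasks, which is the load-balancing effect that work stealing is meant to provide.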

Webpage: http://hpc.pnl.gov/tascel/ (code to be released in the coming months)

For more information, contact Sriram K (sriram@pnnl.gov)
