Node-Aware Stencil Communication for Heterogeneous Supercomputers |
C3SR Bi-weekly Technical Seminar |
|
Coordinated Science Lab 216 |
street |
city |
region |
postcode |
country |
1308 W Main St |
Urbana |
IL |
61801 |
United States |
|
Optimizing Multi-GPU Stencil Communication |
High-performance distributed computing systems increasingly feature nodes with multiple CPU sockets and multiple GPUs. The communication bandwidth between these components is non-uniform, and systems can expose different communication capabilities between them. For communication-heavy applications, using these capabilities optimally is both challenging and essential for performance.
This work presents approaches for automatic data placement and communication implementation for 3D stencil codes on multi-GPU nodes with non-homogeneous communication performance and capabilities. Benchmarking results on the Summit system show that placement choices can yield a 20% improvement in single-node exchange, and that communication specialization can yield a further 6x improvement in single-node exchange time and a 16% improvement at 1536 GPUs. |
2020-02-28T14:00:00Z |
2020-02-28T15:00:00Z |
false |
2020-03-15T00:00:00Z |
|
|
true |
|
|
|
|
pdf/20200228_c3sr_stencil.pdf |
|
example |
|
true |