cwpearson/astaroth
cc9d3f1b9cdeec06959cbd43ff9793430e2a3b36
astaroth/samples
jpekkila
cc9d3f1b9c
Found a workaround that gives good inter- and intra-node performance. The HPC-X MPI implementation does not know how to do p2p comm with pinned arrays (should be 80 GiB/s, measured 10 GiB/s), and inter-node comm is very slow without pinned arrays (should be 40 GiB/s, measured < 1 GiB/s). Made a proof-of-concept communicator that pins arrays that are sent to or received from another node.
2020-04-05 20:15:32 +03:00
benchmark
It might be better to benchmark MPI codes without synchronization because of the overhead of timing individual steps
2020-03-31 12:37:54 +02:00
bwtest
Found a workaround that gives good inter- and intra-node performance. The HPC-X MPI implementation does not know how to do p2p comm with pinned arrays (should be 80 GiB/s, measured 10 GiB/s), and inter-node comm is very slow without pinned arrays (should be 40 GiB/s, measured < 1 GiB/s). Made a proof-of-concept communicator that pins arrays that are sent to or received from another node.
2020-04-05 20:15:32 +03:00
cpptest
Setting inv_dsx etc. explicitly is no longer required, as they are set to default values in acc/stdlib/stdderiv.h
2020-01-28 18:22:27 +02:00
ctest
Major improvement: uniforms can now be set to default values. The syntax is the same as for setting any other values, e.g. 'uniform Scalar a = 1; uniform Scalar b = 0.5 * a;'. Undefined uniforms are still allowed, but in that case the user should load a proper value into them at runtime. Default uniform values can be overwritten by calling any of the uniform loader functions (like acDeviceLoadScalarUniform). Also improved error checking: there are now explicit warnings if the user tries to load an invalid value into a device constant.
2020-01-28 18:17:31 +02:00
mpitest
Found a workaround that gives good inter- and intra-node performance. The HPC-X MPI implementation does not know how to do p2p comm with pinned arrays (should be 80 GiB/s, measured 10 GiB/s), and inter-node comm is very slow without pinned arrays (should be 40 GiB/s, measured < 1 GiB/s). Made a proof-of-concept communicator that pins arrays that are sent to or received from another node.
2020-04-05 20:15:32 +03:00
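
The buffer choice described in the latest commit message (plain device arrays for peers on the same node, pinned arrays for peers on another node) can be sketched roughly as below. This is only an illustration of the selection logic, not the code in `bwtest` or `mpitest`: `BufferKind`, `node_of`, and `select_buffer` are hypothetical names, and the rank-to-node mapping assumes ranks are packed onto nodes in contiguous blocks, which may not match the real launcher.

```cpp
#include <cassert>

// Hypothetical names throughout; not taken from the astaroth sources.
enum class BufferKind { Device, Pinned };

// Assumption: ranks are placed on nodes in contiguous blocks of
// `ranks_per_node`, so integer division recovers the node index.
inline int node_of(int rank, int ranks_per_node)
{
    assert(rank >= 0 && ranks_per_node > 0);
    return rank / ranks_per_node;
}

// Peers on the same node exchange the plain device array (p2p path);
// peers on another node exchange a pinned array instead.
inline BufferKind select_buffer(int my_rank, int peer_rank, int ranks_per_node)
{
    const bool same_node =
        node_of(my_rank, ranks_per_node) == node_of(peer_rank, ranks_per_node);
    return same_node ? BufferKind::Device : BufferKind::Pinned;
}
```

With four ranks per node, `select_buffer(0, 3, 4)` selects the device array because both ranks sit on node 0, while `select_buffer(3, 4, 4)` selects the pinned array because rank 4 is on node 1. A real communicator would more likely query node locality from MPI itself than compute it from rank numbers.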