Commit Graph

450 Commits

Author SHA1 Message Date
jpekkila 5e1500fe97 Happy new year! :) 2020-01-13 21:38:07 +02:00
jpekkila 92a6a1bdec Added more professional run flags to ./ac_run 2020-01-13 15:35:01 +02:00
jpekkila 794e4393c3 Added a new function for the legacy Astaroth layer: acGetNode(). This function returns a Node, which can be used to access acNode layer functions 2020-01-13 11:33:15 +02:00
jpekkila 1d315732e0 Giving up on 3D decomposition with CUDA-aware MPI. The MPI implementation on Puhti seems to be painfully bugged: device pointers are not tracked properly in some cases (e.g. if there's an array of structures that contain CUDA pointers). Going to implement 3D decomposition the traditional way for now (communicating via the CPU). It's easy to switch to CUDA-aware MPI once Mellanox/NVIDIA/CSC have fixed their software. 2020-01-07 21:06:27 +02:00
jpekkila 299ff5cb67 All fields are now packed to simplify communication 2020-01-07 21:01:22 +02:00
jpekkila 5d60791f13 Current 3D decomp method still too complicated. Starting again from scratch. 2020-01-07 14:40:51 +02:00
jpekkila eaee81bf06 Merge branch 'master' into 3d-decomposition-2020-01 2020-01-07 14:25:06 +02:00
jpekkila f0208c66a6 Now also compiles for P100 by default (this was accidentally removed in earlier commits) 2020-01-07 10:29:44 +00:00
jpekkila 1dbcc469fc Allocations for packed data (MPI) 2020-01-05 18:57:14 +02:00
jpekkila bee930b151 Merge branch 'master' into 3d-decomposition-2020-01 2020-01-05 16:48:26 +02:00
jpekkila be7946c2af Added the multiplication operator for int3 structures 2020-01-05 16:47:28 +02:00
jpekkila 51b48a5a36 Some intermediate MPI changes 2020-01-05 16:46:40 +02:00
jpekkila d6c81c89fb This 3D blocking approach is getting too complicated, removed code and trying again 2019-12-28 16:38:10 +02:00
jpekkila e86b082c98 MPI transfer for the first corner with 3D blocking now complete. Disabled/enabled some error checking for development 2019-12-27 13:43:22 +02:00
jpekkila bd0cc3ee20 There was some kind of mismatch between the CUDA and MPI (UCX) libraries when linking with cudart. Switching to the one provided by CMake fixed the issue. 2019-12-27 13:41:18 +02:00
jpekkila 6b5910f7df Added allocations for the packed buffers 2019-12-21 19:00:35 +02:00
jpekkila 57a1f3e30c Added a generic pack/unpack function 2019-12-21 16:20:40 +02:00
jpekkila e4f7214b3a benchmark.cc edited online with Bitbucket 2019-12-21 11:26:54 +00:00
jpekkila 3ecd47fe8b Merge branch 'master' into 3d-decomposition-2020-01 2019-12-21 13:22:45 +02:00
jpekkila 35b56029cf Build failed with single-precision, added the correct casts to modelsolver.c 2019-12-21 13:21:56 +02:00
jpekkila 4d873caf38 Changed utils CMakeList.txt to modern cmake style 2019-12-21 13:16:08 +02:00
jpekkila bad64f5307 Started the 3D decomposition branch. Four tasks: 1) Determine how to distribute the work given n processes 2) Distribute and gather the mesh to/from these processes 3) Create packing/unpacking functions and 4) Transfer packed data blocks between neighbors. Tasks 1 and 2 done with this commit. 2019-12-21 12:37:01 +02:00
jpekkila ecff5c3041 Added some final changes to benchmarking 2019-12-15 21:47:41 +02:00
jpekkila 8bd81db63c Added CPU parallelization to make CPU integration and boundconds faster 2019-12-14 15:45:42 +02:00
jpekkila ff35d78509 Rewrote the MPI benchmark-verification function 2019-12-14 15:26:19 +02:00
jpekkila f0e77181df Benchmark finetuning 2019-12-14 14:52:06 +02:00
jpekkila b8a997b0ab Added code for doing a proper verification run with MPI. Passes nicely with full MHD + upwinding when using the new utility stuff introduced in the previous commits. Note: forcing is not enabled in the utility library by default. 2019-12-14 07:37:59 +02:00
jpekkila 277905aafb Added a model integrator to the utility library (written in pure C). Requires support for AVX vector instructions. 2019-12-14 07:34:33 +02:00
jpekkila 22a3105068 Finished the latest version of autotesting (utility library). Uses ULPs (units in the last place) to determine the acceptable error instead of the relative error used previously 2019-12-14 07:27:11 +02:00
jpekkila 5ec2f6ad75 Better wording in config_loader.c 2019-12-14 07:23:25 +02:00
jpekkila 164d11bfca Removed flush-to-zero flags from kernel compilation. No significant effect on performance but may affect accuracy in some cases 2019-12-14 07:22:14 +02:00
jpekkila 6b38ef461a Puhti GPUDirect fails for some reason if linking against the cuda library instead of cudart 2019-12-11 17:26:21 +02:00
jpekkila a1a2d838ea Merge branch 'master' of https://bitbucket.org/jpekkila/astaroth 2019-12-08 23:22:51 +02:00
jpekkila 752f44b0a7 Second attempt at getting bitbucket to compile 2019-12-08 23:22:33 +02:00
jpekkila 420f8b9e06 MPI benchmark now writes out the 95th percentile instead of average running time 2019-12-08 23:12:23 +02:00
jpekkila 90f85069c6 Bitbucket pipelines building fails because the CUDA include dir does not seem to be included for some reason. This is an attempted fix 2019-12-08 23:08:45 +02:00
jpekkila 2ab605e125 Added the default testcase for MPI benchmarks 2019-12-05 18:14:36 +02:00
jpekkila d136834219 Re-enabled and updated MPI integration with the proper synchronization from earlier commits, removed old stuff. Should now work and be ready for benchmarks 2019-12-05 16:48:45 +02:00
jpekkila f16826f2cd Removed old code 2019-12-05 16:40:48 +02:00
jpekkila 9f4742bafe Fixed the UCX warning from the last commit. The indexing of MPI_Waitall was wrong, and UCX also required that MPI_Isend requests be "waited" on even though they should implicitly complete at the same time as the matching MPI_Irecv 2019-12-05 16:40:30 +02:00
jpekkila e47cfad6b5 MPI now compiles and runs on Puhti, basic verification test with boundary transfers OK. Gives a "UCX WARN object 0x2fa7780 was not returned to mpool ucp_requests" warning, though, which seems to indicate that not all asynchronous MPI calls finished before MPI_Finalize 2019-12-05 16:17:17 +02:00
jpekkila e99a428dec OpenMP is now properly linked with the standalone without propagating it to nvcc (which would cause an error) 2019-12-05 15:30:48 +02:00
jpekkila 9adb9dc38a Disabled MPI integration temporarily and enabled verification for MPI tests 2019-12-04 15:11:40 +02:00
jpekkila 6a250f0572 Rewrote core CMakeLists.txt for cmake versions with proper CUDA & MPI support (3.9+) 2019-12-04 15:09:38 +02:00
jpekkila 0ea2fa9337 Cleaner MPI linking with the core library. Requires cmake 3.9+ though, might have to modify later to work with older versions. 2019-12-04 13:49:38 +02:00
jpekkila 6e63411170 Moved the definition of AC_DEFAULT_CONFIG to the root-level CMakeLists.txt. Now should be visible throughout the project. 2019-12-03 18:42:49 +02:00
jpekkila f97e5cb77c Fixed parts which caused a shadowing warning (same variable name used for different variables in the same scope) 2019-12-03 18:41:08 +02:00
jpekkila 04e27e85b2 Removed MPI from the core library dependencies: instead one should use the appropriate mpi compiler for compiling host code by passing something like -DCMAKE_C_COMPILER=/appl/opt/openmpi/3.1.3-cuda/gcc/7.3.0/bin/mpicc -DCMAKE_CXX_COMPILER=/appl/opt/openmpi/3.1.3-cuda/gcc/7.3.0/bin/mpicxx to cmake 2019-12-03 18:40:15 +02:00
jpekkila c273fcf110 More rigorous error checking 2019-12-03 18:38:15 +02:00
jpekkila 825aa0efaa More warning flags for host code in the core library + small misc changes 2019-12-03 16:58:20 +02:00