Commit Graph

850 Commits

Author SHA1 Message Date
jpekkila
ce81df00e3 Merge branch 'master' of https://bitbucket.org/jpekkila/astaroth 2020-03-25 13:51:07 +02:00
jpekkila
e36ee7e2d6 AC_multigpu_offset tested to work on at least 2 nodes and 8 GPUs. Forcing should now work with MPI 2020-03-25 13:51:00 +02:00
jpekkila
0254628016 Updated API specification. The DSL syntax allows only C++-style casting. 2020-03-25 11:28:30 +00:00
jpekkila
672137f7f1 WIP further MPI optimizations 2020-03-24 19:02:58 +02:00
jpekkila
ef63813679 Explicit check that critical parameters like inv_dsx are properly initialized before calling integration 2020-03-24 17:01:24 +02:00
jpekkila
8c362b44f0 Added more warnings in case some of the model solver parameters are not initialized 2020-03-24 16:56:30 +02:00
jpekkila
d520835c42 Added integration to MPI comm, now completes a full integration step. Works at least on 2 nodes 2020-03-24 16:55:38 +02:00
jpekkila
37d6ad18d3 Fixed formatting in the API specification file 2020-03-04 15:09:23 +02:00
jpekkila
13b9b39c0d Renamed sink_particle.md to .txt to avoid it showing up in the documentation 2020-02-28 14:44:51 +02:00
jpekkila
daa895d2fc Fixed an issue that prevented Ninja being used as an alternative build system to Make. There's no significant performance benefit to using Ninja though. Build times: 29-32 s (Make) and 27-28 s (Ninja) 2020-02-10 14:37:48 +02:00
jpekkila
7b39a6bb1d AC_multigpu_offset is now calculated with MPI. Should now work with forcing, but not tested 2020-02-03 15:45:23 +02:00
jpekkila
50af620a7b More accurate timing when benchmarking MPI. Also made GPU-GPU communication the default. Current version of UCX is bugged, must export 'UCX_MEMTYPE_CACHE=n' to work around memory errors when doing GPU-GPU comm 2020-02-03 15:27:36 +02:00
jpekkila
459d39a411 README.md edited online with Bitbucket 2020-01-28 17:10:52 +00:00
jpekkila
ade8b10e8f bitbucket-pipelines.yml edited online with Bitbucket. Removed an unnecessary compiler flag. 2020-01-28 17:09:35 +00:00
jpekkila
17c935ce19 Added padding to param name buffers to make them have NUM_*_PARAMS+1 elements. This should satisfy some strict compilation checks. 2020-01-28 18:53:09 +02:00
jpekkila
89f4d08b6c Fixed a possible out-of-bounds access in error checking when NUM_*_PARAMS is 0 2020-01-28 18:43:03 +02:00
jpekkila
7685d8a830 Astaroth 2.2 update complete. 2020-01-28 18:28:38 +02:00
jpekkila
67f2fcc88d Setting inv_dsx etc explicitly is no longer required as they are set to default values in acc/stdlib/stdderiv.h 2020-01-28 18:22:27 +02:00
jpekkila
0ccd4e3dbc Major improvement: uniforms can now be set to default values. The syntax is the same as for setting any other values, e.g. 'uniform Scalar a = 1; uniform Scalar b = 0.5 * a;'. Undefined uniforms are still allowed, but in this case the user should load a proper value into them during runtime. Default uniform values can be overwritten by calling any of the uniform loader functions (like acDeviceLoadScalarUniform). Also improved error checking. Now there are explicit warnings if the user tries to load an invalid value into a device constant. 2020-01-28 18:17:31 +02:00
jpekkila
6dfe3ed4d6 Added out-of-the-box support for MPI (though not enabled by default). Previously the user had to pass mpicxx explicitly as the cmake compiler in order to compile MPI code, but this was bad practice and it's better to let cmake handle the include and compilation flags. 2020-01-28 15:59:20 +02:00
jpekkila
85d4de24e3 Recompilation is now properly triggered when acc sources or the ac standard library are modified 2020-01-28 14:12:25 +02:00
jpekkila
07dd9ff024 Updated documentation with the changes made for Astaroth 2.2 2020-01-28 13:36:51 +02:00
jpekkila
5444c84cff Formatting 2020-01-27 18:24:46 +02:00
jpekkila
4c9523675c Might as well enable the C11 standard (separate from CXX11) 2020-01-27 18:16:02 +02:00
jpekkila
8464c1207d Set host compiler CXX standard explicitly to 11 2020-01-27 18:14:29 +02:00
jpekkila
fcd61180c8 Added more information to MULTIGPU_ENABLED cmake flag 2020-01-27 17:19:19 +02:00
jpekkila
9e7e67819f Turned MULTIGPU_ENABLED=ON to be equivalent with the master branch 2020-01-27 17:05:29 +02:00
jpekkila
927d4d31a5 Enabled CXX 11 support for CUDA code (required) 2020-01-27 17:04:52 +02:00
jpekkila
e751ee991b Math operators are now using consistent precision throughout the project 2020-01-27 17:04:14 +02:00
jpekkila
2bc3f9fedd Including when compiling Core seems to be unnecessary since we already include earlier 2020-01-24 07:31:51 +02:00
jpekkila
14ff619ba6 Merge branch 'master' into astaroth_2.2_cleanup 2020-01-24 07:18:30 +02:00
jpekkila
e27be3bdc8 CMakeLists.txt edited online with Bitbucket 2020-01-24 05:15:49 +00:00
jpekkila
2f7e4bf3a2 Enabled MPI compilation test in bitbucket-pipelines.yml. 2020-01-24 05:12:27 +00:00
jpekkila
f8cd571323 Now CMake and compilation flags are functionally equivalent with the current master branch, not taking into account the deprecated flags. Also various small improvements to building.
Deprecated flags:
        * BUILD_DEBUG. This was redundant since CMake also has such a flag. The build type can now be switched by passing -DCMAKE_BUILD_TYPE=<Release|Debug|RelWithDebInfo|...> to cmake. See CMake documentation on CMAKE_BUILD_TYPE on all av
        * BUILD_UTILS. The utility library is now always built along the core library. We can reintroduce this flag if needed when the library grows larger. Currently MPI functions depend on Utils and without the flag we don't have to worr
        * BUILD_RT_VISUALIZATION. RT visualization has been dormant for a while and I'm not even sure if it works any more. Eventually the RT library should be generalized and moved to Utils at some point. Disabled the build flag for the t
2020-01-24 07:00:49 +02:00
jpekkila
c7c2a3eea4 Simplified/rewrote the root CMakeLists.txt s.t. compilation bugs are easier to pinpoint. WIP, not all functionality is yet enabled (primarily compilation options like MPI_ENABLED and others) 2020-01-23 20:07:59 +02:00
jpekkila
a5b5e418d4 Moved all headers used throughout the library to src/common 2020-01-23 20:06:47 +02:00
jpekkila
78fbcc090d Reordered src/core to have a better division into host and device code (this is more likely to work when compiling with mpicxx). Disabled separate compilation of CUDA kernels as this complicates compilation and is a source of many cmake/cuda bugs. As a downside, GPU code takes longer to compile. 2020-01-23 20:06:20 +02:00
jpekkila
96389e9da6 Modified standalone includes to function with new astaroth headers 2020-01-23 20:03:25 +02:00
jpekkila
3adb0242a4 src/utils is now a real library. Includable with the astaroth_utils.h header and linkable with libastaroth_utils.a. The purpose of Astaroth Utils is to function as a generic utility library in contrast to Astaroth Standalone which is essentially hardcoded only for MHD. 2020-01-23 20:02:38 +02:00
jpekkila
fdd829b888 Cleaned up samples and removed old unused stuff. Simplified CMake files. 2020-01-23 20:00:19 +02:00
jpekkila
7215e842fc Simplified the include directory. Everything is now in only two headers: astaroth.h and astaroth_utils.h. Removed old and unused stuff. user.h is unused in standalone but might be used with Pencil Code, so left that intact. 2020-01-23 19:59:44 +02:00
jpekkila
ba899211ff Better code quality for ACC 2020-01-23 18:08:06 +02:00
jpekkila
5de163e8d1 Added commented out pragma unrolls to remind how packing could be improved. Though at the moment unrolls actually make the performance much worse, reasons unknown. 2020-01-22 19:27:45 +02:00
jpekkila
41f8e9aebb Removed an old inefficient function for MPI comm 2020-01-22 19:26:33 +02:00
jpekkila
caacf2b33c Removed --restrict flag from CUDA compilation for safety 2020-01-22 19:25:26 +02:00
jpekkila
868bf3ed5e Merge branch 'master' of https://bitbucket.org/jpekkila/astaroth 2020-01-22 15:21:10 +02:00
jpekkila
ba8960cd08 Formatting fixes to documentation 2020-01-20 19:28:00 +02:00
jpekkila
354cf81777 MPI_Request was saved to address pointing to local memory, fixed 2020-01-20 19:15:20 +02:00
jpekkila
54d91e7eeb Removed debug synchronization from packing.cu 2020-01-20 18:58:06 +02:00
jpekkila
993bfc4533 Better concurrency and some simplifications (MPI). 2020-01-20 18:45:24 +02:00