astaroth

Author	SHA1	Message	Date
jpekkila	5232d987c1	Added acStoreWithOffset to the revised interface	2019-08-05 16:18:22 +03:00
jpekkila	567ad61465	Multinode MPI implementation should be done later in its own branch. The focus of this branch is to revise the node and device layers. Commented out references to the Grid layer.	2019-08-02 13:54:54 +03:00
jpekkila	2b6bf10ae6	Dummy implementation of the Grid interface	2019-08-01 18:37:36 +03:00
jpekkila	5be775dbff	Various intermediate changes	2019-07-31 17:48:48 +03:00
jpekkila	efd9d54fef	Stashing WIP changes (interface revision) s.t. I can continue work on a different machine	2019-07-30 14:34:44 +03:00
jpekkila	1ceb6739ae	Merge branch 'master' into node_device_interface_revision_07-23	2019-07-30 14:31:33 +03:00
jpekkila	69deef66fe	Added sum reduction. NOTE: Scalar sum does not pass the automated test but vector sum does. I couldn't see anything wrong with the code itself and I strongly suspect that the failures are caused by loss of precision due to summing a huge amount of numbers of different magnitudes. However I'm not yet completely sure. Something like the Kahan summation algorithm might be useful if the errors are really caused by fp arithmetic.	2019-07-30 14:28:18 +03:00
jpekkila	b65454d523	Stashed some testing files used to make sure that the library can also be used from pure C projects (better compatibility). These changes will never go to master as-is.	2019-07-23 18:24:47 +03:00
jpekkila	323d4e3b31	Replaced all calls to AC_VTXBUF_IDX with acVertexBufferIdx etc. in all files	2019-07-23 14:37:28 +03:00
jpekkila	f74df5339f	Cleaned up the include directory: removed all unnecessary stuff and moved common definitions to a separate file	2019-07-22 19:46:45 +03:00
jpekkila	168b3c4d8b	Peer access to neighboring GPUs is now enabled during initialization	2019-07-22 13:02:19 +03:00
jpekkila	78aba6428e	Updated the copyright years throughout the project	2019-07-16 14:28:32 +03:00
jpekkila	93fc121f5c	Introduced versions of the asynchronous functions which take a stream as a parameter	2019-07-10 15:49:21 +03:00
jpekkila	bd98eaf9f7	Added a stream to loadDeviceConstant call.	2019-07-10 15:29:54 +03:00
jpekkila	0bda016e17	Reviewed the Astaroth interface. Now there's a clear distinction between synchronous and asynchronous functions. For basic usage, we provide a set of functions that are always safe to call (acIntegrate, acLoad, etc.), but because of this, they must be quite restricted in the sense that, for example, the whole mesh must be loaded at once and computations cannot be executed concurrently on multiple GPUs. For more advanced users we provide asynchronous functions (such as acLoadWithOffset). Since we cannot know how the asynchronous functions are called (for example, when the integration step has been fully completed and the halos of neighboring subgrids can be safely communicated between GPUs), the responsibility of synchronization must be left to the user. In the existing implementations we currently use only the basic "safe" set of functions (except in renderer.cc), so the existing functionality has not been changed with these latest commits. Autotests also pass.	2019-07-09 18:42:00 +03:00
jpekkila	1251f61570	Removed a stray acBoundcondStep() in acStore where it definitely shouldn't be. Removed code duplication: acBoundcondStep now uses the new acLocalBoundcondStep and acGlobalBoundcondStep functions.	2019-07-09 17:08:18 +03:00
jpekkila	a086821e7c	Added a function acAutoOptimize to the interface and removed rk3_step_async in kernels.cuh (moved into rkStep)	2019-07-09 14:21:22 +03:00
jpekkila	5fdfdeca9e	Multi-GPU optimizations: removed some unnecessary synchronization and divided the calculation of boundary conditions to local and global steps.	2019-07-05 18:21:44 +03:00
jpekkila	f1066a2c11	Added preliminary pragmas for dispatching commands simultaneously to multiple GPUs (commented out)	2019-07-05 17:16:12 +03:00
jpekkila	2092adc0f6	Preparations for multi-GPU optimizations	2019-07-05 15:44:30 +03:00
jpekkila	ce8fe53f91	Moved explanations and comments to the beginning of astaroth.cu. No code changes.	2019-07-05 15:39:52 +03:00
jpekkila	224b91b83a	Added more control for synchronizing streams and halos among the GPUs	2019-07-05 15:17:20 +03:00
jpekkila	332f1a4f40	Reordered some of the functions in astaroth.cu and introduced acExchangeHalos() for synchronizing the part of the grid that is independent from the chosen boundary conditions between subgrids.	2019-07-05 15:01:51 +03:00
jpekkila	d1a93b7d4e	acIntegrateStepWithOffset corrected and confirmed to work on 1-4 GPUs	2019-07-04 16:58:24 +03:00
jpekkila	01437411b6	Comment	2019-07-04 16:39:20 +03:00
jpekkila	91f119e8dd	Deprecated the old implementation of acIntegrateStep. acIntegrateStep now calls acIntegrateStepWithOffset instead of device.cuh functions.	2019-07-04 16:37:55 +03:00
jpekkila	5049dadc1c	Implemented acIntegrateStepWithOffset	2019-07-04 16:31:16 +03:00
jpekkila	a53e0a170d	Overloaded max/min for int3 and removed old comments	2019-07-04 16:24:08 +03:00
jpekkila	e1d545b0eb	Code readability and cleanup (remembered that int3 has + and - operators defined in math_utils.h)	2019-07-04 16:16:49 +03:00
jpekkila	30254d9abb	Removed a redundant and old gridIdxx function which I thought I already removed a long time ago.	2019-07-04 16:10:29 +03:00
jpekkila	0884c4bf38	Moved the definition of acForcingVec to host_forcing.cc since it depends on user parameters that may not be defined in all projects	2019-07-04 15:28:18 +03:00
jpekkila	7abb959828	Overhaul to the user-defined parameters done: All logical switches, parameters and vertex buffer handles are now defined in a single header file (the default location is acc/mhd_solver/stencil_defines.h). This header is used when preprocessing the DSL sources and is linked to the include/ directory when calling scripts/compile_acc.sh. astaroth.h is now used for configuring internal stuff only and should not be modified by users	2019-07-03 19:01:16 +03:00
jpekkila	08e9a32cb1	Added a comment about acForcingVec	2019-07-03 16:37:16 +03:00
jpekkila	d4d2680f40	Added a new generic function to the interface (astaroth.h) for loading arbitrary device constants. Also (unintended) autoformatting.	2019-07-03 16:19:25 +03:00
Miikka Vaisala	03689709df	Merge branch 'master' into forcing	2019-07-02 16:43:10 +08:00
Miikka Vaisala	9f0be0d9ff	Solved the forcing function boundary problem.	2019-07-01 11:06:42 +08:00
jpekkila	7e40889245	Grid and subgrid dimensions are now only printed if VERBOSE_PRINTING == 1	2019-06-27 12:54:36 +03:00
Miikka Vaisala	d30b866a21	Merge branch 'master' into forcing Now I need to test what works... Conflicts: acc/mhd_solver/stencil_process.sps	2019-06-27 11:22:31 +08:00
jpekkila	401172bb74	Formatting	2019-06-26 19:43:37 +03:00
jpekkila	ee075e6741	Set the default number of devices to 0 (this is updated at acInit())	2019-06-26 19:42:49 +03:00
jpekkila	cda17c9b08	VERBOSE_PRINTING flag is now globally used in the whole program and should be used to suppress development/debugging-related printing. Also added comments to the new interface function acCheckDeviceAvailability and made it free from side effects.	2019-06-26 18:50:15 +03:00
Matthias Rheinhardt	0bc8b7e827	MR: VTXBUF_DENSITY -> VTXBUF_LNRHO, minor	2019-06-26 17:14:24 +03:00
Matthias Rheinhardt	522da0041f	MR: new name for GetDevice	2019-06-26 16:53:56 +03:00
Miikka Vaisala	be0e46c814	Can now move forcing vector information from the host to the device. Next step is to generate random waves on the CPU with a chosen degree of helicity etc.	2019-06-26 17:41:39 +08:00
Miikka Vaisala	231a8aa06e	Trying to figure out how to upload values to GPU.	2019-06-26 15:23:46 +08:00
jpekkila	2310186c71	Added a skeleton function for updating an arbitrary block inside the computational domain instead of the whole mesh	2019-06-19 19:43:46 +03:00
jpekkila	2eacb98246	Now acBoundcondStep is applied after acIntegrate to ensure that the whole grid visible to the host, including boundaries, is always up to date	2019-06-19 14:29:07 +03:00
jpekkila	8864266042	Autoformatted all CUDA/C/C++ code	2019-06-18 16:42:56 +03:00
jpekkila	4ca4dbefdf	Added the machinery for implementing forcing with the DSL on multiple GPUs and a simple model solution	2019-06-18 16:13:32 +03:00
jpekkila	59086b3e79	Added multi-GPU reductions. Tested to work with 1-2 GPUs with power of two grid dimensions. Requires more testing in special cases (when using exotic grid dimensions and a large number of GPUs)	2019-06-17 14:45:41 +03:00
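
Commit 69deef66fe suggests that something like the Kahan summation algorithm might help if the scalar sum failures are really caused by floating-point round-off. The following is a minimal host-side sketch of that technique in plain C, not code from the Astaroth repository. The function names naive_sum and kahan_sum are hypothetical and chosen only for illustration, and single precision is assumed.

```c
#include <stddef.h>

/* Plain left-to-right accumulation, shown only for comparison.
 * (Hypothetical helper, not part of the Astaroth interface.) */
float naive_sum(const float* data, size_t count)
{
    float sum = 0.0f;
    for (size_t i = 0; i < count; ++i)
        sum += data[i];
    return sum;
}

/* Kahan (compensated) summation: carries a running estimate of the
 * low-order bits lost in each addition and feeds it back into the
 * next term. (Hypothetical helper, not part of the Astaroth interface.)
 * Note: options that let the compiler reassociate floating-point
 * arithmetic (for example -ffast-math) can optimize the compensation
 * away, so this assumes strict floating-point semantics. */
float kahan_sum(const float* data, size_t count)
{
    float sum          = 0.0f;
    float compensation = 0.0f; /* Estimate of the accumulated round-off */
    for (size_t i = 0; i < count; ++i) {
        const float corrected = data[i] - compensation;
        const float updated   = sum + corrected;
        /* (updated - sum) is what was actually added; subtracting the
         * intended term leaves the rounding error of this step. */
        compensation = (updated - sum) - corrected;
        sum          = updated;
    }
    return sum;
}
```

Compensated summation reduces the error that builds up when many terms of similar sign are accumulated, but it does not recover information lost to cancellation between very large terms of opposite sign. Whether it would make the scalar reduction pass the automated test is not established by the commit message.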

51 Commits