Commit Graph

24 Commits

Author SHA1 Message Date
jpekkila
f7bd84af46 Added macros for getting int3 and AcReal3 device constants from within kernels (and DSL). 2019-07-31 17:07:02 +08:00
jpekkila
323d4e3b31 Replaced all calls to AC_VTXBUF_IDX with acVertexBufferIdx etc. in all files 2019-07-23 14:37:28 +03:00
jpekkila
fee03b7149 Moved some device limits used only during auto-optimization from astaroth.h to device.cu 2019-07-22 19:54:46 +03:00
jpekkila
a950be99f2 Streams now created with priority (all streams have the same priority by default) 2019-07-22 13:04:04 +03:00
jpekkila
78aba6428e Updated the copyright years throughout the project 2019-07-16 14:28:32 +03:00
jpekkila
b08d5b26f5 cudaMemcpyToSymbol -> cudaMemcpyToSymbolAsync 2019-07-10 15:05:57 +03:00
jpekkila
976bf05c8d Wrong scope for num_iterations in the last commit, fixed 2019-07-10 14:37:32 +03:00
jpekkila
866ec8a192 Removed some old hack I used for benchmarking a while back 2019-07-10 14:34:05 +03:00
jpekkila
d0b95c39b6 Disabled writing out unnecessary files when auto-optimizing the code 2019-07-09 18:51:04 +03:00
jpekkila
10a98b01a9 Experimental change: now the integration function is automatically optimized during acInit 2019-07-09 14:46:24 +03:00
jpekkila
a086821e7c Added a function acAutoOptimize to the interface and removed rk3_step_async in kernels.cuh (moved into rkStep) 2019-07-09 14:21:22 +03:00
jpekkila
deebe570da Merge branch 'master' into multigpu_optimization_2019-07-05 2019-07-08 16:11:24 +03:00
Miikka Vaisala
6ba15c3a7c props.totalConstMem and props.sharedMemPerBlock cause assembler error while compiling on TIARA gp cluster. Therefore commented out. 2019-07-08 11:00:12 +08:00
jpekkila
5fdfdeca9e Multi-GPU optimizations: removed some unnecessary synchronization and divided the calculation of boundary conditions to local and global steps. 2019-07-05 18:21:44 +03:00
jpekkila
d87eb36f5a Formatting: brackets around a for loop for consistency 2019-07-05 15:26:19 +03:00
jpekkila
a3ca6cf132 Added skeletons for packing parts of the ghost zones into buffers to speed up data transfers 2019-07-01 13:56:05 +03:00
jpekkila
8864266042 Autoformatted all CUDA/C/C++ code 2019-06-18 16:42:56 +03:00
jpekkila
4ca4dbefdf Added the machinery for implementing forcing with the DSL on multiple GPUs and a simple model solution 2019-06-18 16:13:32 +03:00
jpekkila
57e2e48fb0 Added functions for loading device constants. Also introduced a new int3 constant that can be used to determine the global vertex index inside kernels 2019-06-18 14:11:55 +03:00
jpekkila
c9f26d6e58 Cleanup 2019-06-17 20:44:37 +03:00
jpekkila
ce6f453bc5 Rewrote reductions, now much simpler than before 2019-06-17 20:38:28 +03:00
jpekkila
5e6cc9b8cc Changed names of some parameters to better ones 2019-06-17 18:18:00 +03:00
jpekkila
59086b3e79 Added multi-GPU reductions. Tested to work with 1-2 GPUs with power of two grid dimensions. Requires more testing in special cases (when using exotic grid dimensions and a large number of GPUs) 2019-06-17 14:45:41 +03:00
jpekkila
0e48766a68 Added Astaroth 2.0 2019-06-14 14:19:07 +03:00