Commit Graph

150 Commits

Author SHA1 Message Date
jpekkila
53230c9b61 Added error checking and more flexibility to the new acDeviceLoadScalarArray function 2019-09-05 19:56:04 +03:00
jpekkila
263a1d23a3 Added a function for loading ScalarArrays to the GPU 2019-09-05 16:35:08 +03:00
jpekkila
9e57aba9b7 New feature: ScalarArray. ScalarArrays are read-only 1D arrays containing max(mx, max(my, mz)) elements. ScalarArray is a new type of uniform and can be used for storing e.g. forcing profiles. The DSL now also supports complex numbers and some basic arithmetic (exp, multiplication) 2019-09-02 21:26:57 +03:00
jpekkila
6ea02fa28e DSL now 'feature complete' with respect to what I had in mind before the summer. Users can now create multiple kernels and the library functions are generated automatically for them. The generated library functions are of the form acDeviceKernel_<name> and acNodeKernel_<name>. More features are needed though. The next features to be added at some point are 1D and 2D device constant arrays in order to support profiles for e.g. forcing. 2019-08-27 18:19:20 +03:00
jpekkila
20138263f4 The previous attempt (dsl_feature_completeness_2019-08-23) to enable arbitrary kernel functions was a failure: we get significant performance loss (25-100%) if step_number is not passed as a template parameter to the integration kernel. Apparently the CUDA compiler cannot perform some optimizations if there is an if/else construct in a performance-critical part which cannot be evaluated at compile time. This branch keeps step_number as a template parameter but takes the rest of the user parameters as uniforms (dt is no longer passed as a function parameter but as a uniform with the DSL instead). 2019-08-27 17:36:33 +03:00
jpekkila
022e46f2e7 Merge branch 'master' into dsl_parameter_overhaul_2019-08-19 2019-08-23 13:13:57 +03:00
jpekkila
f6040f89dc Added acPrintMeshInfo for printing all mesh parameters 2019-08-21 16:24:48 +03:00
jpekkila
39dcda4a04 Made warnings about unused functions go away (this is intended functionality: not all programs will use all types of device constants, so the warning is unnecessary) 2019-08-21 14:28:46 +03:00
jpekkila
51cf1f1068 The C header is now generated from the DSL. Stashing the changes just to be sure, since I might overwrite something when updating the compilation scripts to work with this new scheme 2019-08-19 18:19:28 +03:00
jpekkila
d801ebdd41 Now parameters and vertexbuffers (fields) can be declared with the DSL only. TODO: translation from the DSL header to C 2019-08-19 17:35:03 +03:00
jpekkila
bcdd827a4f Added proper declarations for all user-specified uniforms. Note: built-in uniforms are not correctly translated into CUDA 2019-08-19 17:05:56 +03:00
jpekkila
0208d55e4e Moved STENCIL_ORDER and NGHOST out of the user-defined parameters, as these are actually internal defines used to configure the built-in functions. Additionally, renamed all explicitly declared uniforms from dsx -> AC_dsx in the DSL in preparation for having a clear connection between DSL uniforms and the library parameter handles created by the user (AcRealParam etc) 2019-08-19 16:40:47 +03:00
jpekkila
787363226b Added functions for loading int, int3, scalar and vector constants to the device layer (acDeviceLoad...Constant) 2019-08-19 15:28:16 +03:00
jpekkila
41805dcb68 Added some error checking for the case where the user supplies an incomplete meshinfo to acDeviceLoadMeshInfo 2019-08-19 15:17:51 +03:00
jpekkila
598799d7c3 Added a new function to the device interface: acDeviceLoadMeshInfo 2019-08-19 15:14:00 +03:00
jpekkila
e89897985e Battled with math.h and cmath. We probably should move from C standard libraries to C++ ones internally (in places which are not visible via the interface) 2019-08-19 14:02:30 +03:00
jpekkila
6d4d53342e Removed old comments 2019-08-15 11:14:52 +03:00
jpekkila
36fea70560 Moved basic built-in functions for vector operations to math_utils.h from integration.cuh so that they are shared with the CPU and GPU 2019-08-15 11:04:22 +03:00
jpekkila
d5b2e5bb42 Added placeholders for new built-in variables in the DSL. Also added overloads to DCONST_INT etc. Naming is still pending and old DCONST_REAL etc calls still work. 2019-08-12 14:05:35 +03:00
jpekkila
b8c4d07de2 Removed unnecessary comments 2019-08-12 13:31:24 +03:00
jpekkila
e027f7e548 Removed grid_n in astaroth.cu and replaced it with the new acNodeQueryDeviceConfiguration call 2019-08-12 13:25:47 +03:00
jpekkila
bba9ec7c3b Implemented acNodeQueryDeviceConfiguration 2019-08-12 11:40:38 +03:00
jpekkila
b5daf22c26 Added interface function acSynchronizeMesh 2019-08-12 10:25:05 +03:00
jpekkila
8bbb2cd5df Now prints device info before trying to run the dummy kernel 2019-08-12 09:46:37 +03:00
jpekkila
b53cabbc44 Made the DSL syntax less confusing: Input and output arrays are now ScalarFields and VectorFields instead of scalars and vectors. C++ initializers are now also possible, removing the need to declare Fields as int or int3 which was very confusing, like "what, you assign an int value to a real, what the &^%@?" 2019-08-08 21:07:36 +03:00
jpekkila
5397495496 Added acLoadWithOffset 2019-08-08 20:43:01 +03:00
jpekkila
e79e1207f2 Added a function for checking whether CUDA-capable devices are available 2019-08-08 20:35:02 +03:00
jpekkila
8a9099d75e Added missing functions to fix backwards compatibility with the version interfaced with Pencil Code 2019-08-08 19:49:57 +03:00
jpekkila
322cdce52c Added some new comments + some helpful old comments from a time before the interface revision 2019-08-07 20:05:54 +03:00
jpekkila
1525e0603f Added some preliminary pragma omps and verified that acIntegrate works as it should. 2019-08-07 19:08:52 +03:00
jpekkila
c2bd5ae3e6 Simplified the optimized multi-GPU integration function 2019-08-07 18:17:03 +03:00
jpekkila
a930864f42 Merge branch 'master' into node_device_interface_revision_07-23 2019-08-07 07:43:28 +03:00
jpekkila
cf6b75f82a Merged in cmakelist_rewrite_and_C_API_conformity_07-26 (pull request #1) 2019-08-07 06:53:17 +03:00
jpekkila
6b53eb31ef Errors with forcing now down from 3 to 1 after switching from fast & inaccurate trig functions to more accurate ones 2019-08-06 19:29:40 +03:00
jpekkila
daee456660 Merge branch 'cmakelist_rewrite_and_C_API_conformity_07-26' into node_device_interface_revision_07-23 2019-08-06 17:57:30 +03:00
jpekkila
abf4815174 Merge branch 'master' into cmakelist_rewrite_and_C_API_conformity_07-26 2019-08-06 17:53:53 +03:00
jpekkila
5870081645 Split kernels.cuh into bounconds.cuh, integration.cuh and reductions.cuh 2019-08-06 17:50:41 +03:00
jpekkila
405fa4d6d6 Moved old kernels to kernels/deprecated 2019-08-06 17:46:52 +03:00
jpekkila
3726847683 Made globalGridN and d_multigpu_offsets built-in parameters. Note the renaming from globalGrid.n to globalGridN. 2019-08-06 16:39:15 +03:00
jpekkila
1dd9975528 Formatting 2019-08-06 15:44:51 +03:00
jpekkila
b2632c87b4 Merge branch 'cmakelist_rewrite_and_C_API_conformity_07-26' into node_device_interface_revision_07-23 2019-08-06 15:18:33 +03:00
jpekkila
280804a438 Merge branch 'master' into cmakelist_rewrite_and_C_API_conformity_07-26 2019-08-06 15:14:33 +03:00
jpekkila
5f4246fb42 Standalone now uses O2 optimization level instead of O3. Also removed -march=native since this causes issues if the program is compiled on a different architecture than it is run on. Since we do not do heavy arithmetic on the host side and the host code is not a performance-critical part of the code, -march=native is not very useful anyway 2019-08-06 14:46:13 +03:00
jpekkila
b73c2675e8 Added the optimized implementation of acNodeIntegrate where boundconds are done before integration instead of after 2019-08-05 20:10:13 +03:00
jpekkila
8df49370c8 Cleanup 2019-08-05 19:08:05 +03:00
jpekkila
fa6e1116cb The interface revision now actually works. The issue was an incorrect order of src and dst indices when storing the mesh. 2019-08-05 17:26:05 +03:00
jpekkila
5232d987c1 Added acStoreWithOffset to the revised interface 2019-08-05 16:18:22 +03:00
jpekkila
f3de2fa03c Made globalVertexIdx available during preprocessing. NOTE: potentially dangerous. globalVertexIdx should never be used for reading data from the vertex buffers. 2019-08-05 15:03:02 +03:00
jpekkila
6dfd03664d Still does not work. I'm starting to think that instead of this one huge revision, we should modify the existing interface step-by-step. 2019-08-02 15:31:24 +03:00
jpekkila
5f2378e91b Now compiles (does not work though) 2019-08-02 15:15:18 +03:00
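
Note on commit 20138263f4: that message attributes the slowdown to an if/else on step_number that cannot be evaluated at compile time. The distinction can be illustrated with a minimal host-side C++ sketch. All function names and the arithmetic below are hypothetical stand-ins and are not taken from the Astaroth integration kernel; the sketch only shows the pattern of passing step_number as a template parameter and dispatching on it once, outside the performance-critical code.

```cpp
// Runtime variant: step_number is an ordinary argument, so the branch
// is decided while the program runs.
double substep_runtime(int step_number, double prev, double rate, double dt)
{
    if (step_number == 0)
        return prev + dt * rate;       // hypothetical first sub-step
    else
        return prev + 0.5 * dt * rate; // hypothetical later sub-steps
}

// Template variant: step_number is a compile-time constant, so each
// instantiation contains a condition the compiler can resolve and
// remove during optimization.
template <int step_number>
double substep_static(double prev, double rate, double dt)
{
    if (step_number == 0)
        return prev + dt * rate;
    else
        return prev + 0.5 * dt * rate;
}

// Dispatch once, outside the hot path, to the matching instantiation.
double substep_dispatch(int step_number, double prev, double rate, double dt)
{
    switch (step_number) {
    case 0:
        return substep_static<0>(prev, rate, dt);
    default:
        return substep_static<1>(prev, rate, dt);
    }
}
```

Both variants compute the same values; the only difference is where the decision on step_number is made. Whether this yields a measurable gain depends on the compiler and the kernel, which is why the commit reports its figure from measurement.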