jpekkila
|
aa6c2b23d9
|
Built-in parameters are now added during compilation instead of being defined in the CUDA sources. IMPORTANT: the DCONST macro should no longer be used when accessing built-in variables. All uniforms are now consistently accessed with the handle only
|
2019-10-07 17:39:27 +03:00 |
|
jpekkila
|
6ed3b7978d
|
Updated the name of the generated header
|
2019-10-07 15:44:21 +03:00 |
|
jpekkila
|
9a16c79ce6
|
Renamed all references to constants so that they refer to uniforms, e.g. loadScalarConstant -> loadScalarUniform (for consistency with the DSL)
|
2019-10-01 17:12:20 +03:00 |
|
jpekkila
|
2c8c49ee24
|
Removed or updated some old .gitignore files
|
2019-09-24 17:50:41 +03:00 |
|
jpekkila
|
021e5f3774
|
Renamed NUM_STREAM_TYPES -> NUM_STREAMS
|
2019-09-12 15:48:38 +03:00 |
|
jpekkila
|
53230c9b61
|
Added error checking and more flexibility to the new acDeviceLoadScalarArray function
|
2019-09-05 19:56:04 +03:00 |
|
jpekkila
|
263a1d23a3
|
Added a function for loading ScalarArrays to the GPU
|
2019-09-05 16:35:08 +03:00 |
|
jpekkila
|
9e57aba9b7
|
New feature: ScalarArray. ScalarArrays are read-only 1D arrays containing max(mx, max(my, mz)) elements. ScalarArray is a new type of uniform and can be used for storing e.g. forcing profiles. The DSL now also supports complex numbers and some basic arithmetic (exp, multiplication)
|
2019-09-02 21:26:57 +03:00 |
|
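As a rough illustration of the ScalarArray size described above, the element count max(mx, max(my, mz)) could be computed as in the sketch below. The names `example_max` and `example_scalar_array_length` are hypothetical and not part of the library; mx, my and mz are assumed to be the mesh dimensions along each axis.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the larger of two sizes. */
size_t example_max(size_t a, size_t b)
{
    return a > b ? a : b;
}

/* Hypothetical helper: number of elements in a ScalarArray for a mesh of
 * mx * my * mz points, following max(mx, max(my, mz)) from the commit
 * message. Assumes all dimensions are non-zero. */
size_t example_scalar_array_length(size_t mx, size_t my, size_t mz)
{
    assert(mx > 0 && my > 0 && mz > 0);
    return example_max(mx, example_max(my, mz));
}
```

Sizing the array by the largest dimension presumably lets a single ScalarArray hold a 1D profile along any one axis of the mesh.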
jpekkila
|
022e46f2e7
|
Merge branch 'master' into dsl_parameter_overhaul_2019-08-19
|
2019-08-23 13:13:57 +03:00 |
|
jpekkila
|
f6040f89dc
|
Added acPrintMeshInfo for printing all mesh parameters
|
2019-08-21 16:24:48 +03:00 |
|
jpekkila
|
0208d55e4e
|
Moved STENCIL_ORDER and NGHOST out of the user-defined parameters, as these are actually internal defines used to configure the built-in functions. Additionally, renamed all explicitly declared uniforms from dsx -> AC_dsx in the DSL in preparation for having a clear connection between DSL uniforms and the library parameter handles created by the user (AcRealParam etc.)
|
2019-08-19 16:40:47 +03:00 |
|
jpekkila
|
787363226b
|
Added functions for loading int, int3, scalar and vector constants to the device layer (acDeviceLoad...Constant)
|
2019-08-19 15:28:16 +03:00 |
|
jpekkila
|
598799d7c3
|
Added a new function to the device interface: acDeviceLoadMeshInfo
|
2019-08-19 15:14:00 +03:00 |
|
jpekkila
|
3369d8efec
|
Added a missing include
|
2019-08-12 11:44:27 +03:00 |
|
jpekkila
|
bba9ec7c3b
|
Implemented acNodeQueryDeviceConfiguration
|
2019-08-12 11:40:38 +03:00 |
|
jpekkila
|
b5daf22c26
|
Added interface function acSynchronizeMesh
|
2019-08-12 10:25:05 +03:00 |
|
jpekkila
|
fdadd463b7
|
Included the user-defined header after the definition of AcReal to make it available if needed.
|
2019-08-09 17:11:21 +03:00 |
|
jpekkila
|
5397495496
|
Added acLoadWithOffset
|
2019-08-08 20:43:01 +03:00 |
|
jpekkila
|
e79e1207f2
|
Added a function for checking whether CUDA-capable devices are available
|
2019-08-08 20:35:02 +03:00 |
|
jpekkila
|
8a9099d75e
|
Added missing functions to fix backwards compatibility with the version interfaced with Pencil Code
|
2019-08-08 19:49:57 +03:00 |
|
jpekkila
|
322cdce52c
|
Added some new comments + some helpful old comments from a time before the interface revision
|
2019-08-07 20:05:54 +03:00 |
|
jpekkila
|
3726847683
|
Made globalGridN and d_multigpu_offsets built-in parameters. Note the renaming from globalGrid.n to globalGridN.
|
2019-08-06 16:39:15 +03:00 |
|
jpekkila
|
b73c2675e8
|
Added the optimized implementation of acNodeIntegrate where boundconds are done before integration instead of after
|
2019-08-05 20:10:13 +03:00 |
|
jpekkila
|
5232d987c1
|
Added acStoreWithOffset to the revised interface
|
2019-08-05 16:18:22 +03:00 |
|
jpekkila
|
5f2378e91b
|
Now compiles (does not work though)
|
2019-08-02 15:15:18 +03:00 |
|
jpekkila
|
567ad61465
|
Multinode MPI implementation should be done later in its own branch. The focus of this branch is to revise the node and device layers. Commented out references to the Grid layer.
|
2019-08-02 13:54:54 +03:00 |
|
jpekkila
|
2b6bf10ae6
|
Dummy implementation of the Grid interface
|
2019-08-01 18:37:36 +03:00 |
|
jpekkila
|
328b809efe
|
Added the revised node interface
|
2019-08-01 14:04:11 +03:00 |
|
jpekkila
|
fb0610c1ba
|
Intermediate changes to the revised node interface
|
2019-07-31 20:04:39 +03:00 |
|
jpekkila
|
49026bd26b
|
Revised device interface done
|
2019-07-31 18:46:41 +03:00 |
|
jpekkila
|
5be775dbff
|
Various intermediate changes
|
2019-07-31 17:48:48 +03:00 |
|
jpekkila
|
efd9d54fef
|
Stashing WIP changes (interface revision) so that I can continue work on a different machine
|
2019-07-30 14:34:44 +03:00 |
|
jpekkila
|
1ceb6739ae
|
Merge branch 'master' into node_device_interface_revision_07-23
|
2019-07-30 14:31:33 +03:00 |
|
jpekkila
|
69deef66fe
|
Added sum reduction. NOTE: Scalar sum does not pass the automated test but vector sum does. I couldn't see anything wrong with the code itself, and I strongly suspect that the failures are caused by loss of precision due to summing a huge number of values of different magnitudes. However, I'm not yet completely sure. Something like the Kahan summation algorithm might be useful if the errors are really caused by floating-point arithmetic.
|
2019-07-30 14:28:18 +03:00 |
|
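The Kahan summation mentioned above can be sketched on the host as follows. This is a generic single-precision illustration, not the library's reduction code; `example_naive_sum` and `example_kahan_sum` are hypothetical names. The sketch assumes strict IEEE floating-point evaluation (for example, no -ffast-math), since aggressive floating-point optimizations may remove the compensation term.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: plain left-to-right sum, kept for comparison. */
float example_naive_sum(const float* data, size_t count)
{
    assert(data != NULL || count == 0);
    float sum = 0.0f;
    for (size_t i = 0; i < count; ++i)
        sum += data[i];
    return sum;
}

/* Hypothetical helper: Kahan (compensated) summation. The variable
 * `compensation` carries the low-order bits that were lost when the
 * previous term was added to the running sum. */
float example_kahan_sum(const float* data, size_t count)
{
    assert(data != NULL || count == 0);
    float sum          = 0.0f;
    float compensation = 0.0f;
    for (size_t i = 0; i < count; ++i) {
        const float y = data[i] - compensation; /* Re-inject the lost bits */
        const float t = sum + y;                /* Low-order bits of y may be lost here */
        compensation  = (t - sum) - y;          /* Recover what was lost (negated) */
        sum           = t;
    }
    return sum;
}
```

For instance, when 1.0f is followed by a thousand terms of 1e-8f, each small term is below half a unit in the last place of the running sum, so on a platform that evaluates float expressions in single precision the naive loop stays at 1.0f, while the compensated loop ends up close to 1.00001f.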
jpekkila
|
f322bc8b37
|
Rewrote all CMakeLists. Now much cleaner and there's a clear separation during compilation between the core and standalone modules.
|
2019-07-23 20:50:37 +03:00 |
|
jpekkila
|
b65454d523
|
Stashed some testing files used to make sure that the library can also be used from pure C projects (better compatibility). These changes will never go to master as-is.
|
2019-07-23 18:24:47 +03:00 |
|
jpekkila
|
0282f45077
|
Forgot extern C
|
2019-07-23 16:11:17 +03:00 |
|
jpekkila
|
e5172e2a9a
|
Moved more stuff out of astaroth.h to astaroth_defines.h. I'm not sure what the best way to arrange the include files is. These changes are just for readability, so it's very safe to move things around.
|
2019-07-23 16:06:54 +03:00 |
|
jpekkila
|
c98e730397
|
Added extern C to the include headers
|
2019-07-23 15:02:54 +03:00 |
|
jpekkila
|
c0774bc3b8
|
Added overloads for getting and setting various parameters. However, the compiler mangles the names, which is not good for a cross-platform library, so the functions are commented out for now. Sadly, C11's _Generic, which would solve everything, is not available in C++.
|
2019-07-23 14:56:41 +03:00 |
|
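For reference, the C11 _Generic dispatch mentioned above might look like the sketch below in a pure C translation unit. The struct `ExampleConfig`, the two setters and the `example_set` macro are hypothetical and do not reflect the library's actual types or function names.

```c
#include <assert.h>

#define EXAMPLE_NUM_PARAMS 4

/* Hypothetical configuration struct with separate int and real tables. */
typedef struct {
    int    int_params[EXAMPLE_NUM_PARAMS];
    double real_params[EXAMPLE_NUM_PARAMS];
} ExampleConfig;

/* Plain C functions: their names are not mangled, so they can be exported
 * from a library as-is. */
void example_set_int(ExampleConfig* config, int idx, int value)
{
    assert(idx >= 0 && idx < EXAMPLE_NUM_PARAMS);
    config->int_params[idx] = value;
}

void example_set_real(ExampleConfig* config, int idx, double value)
{
    assert(idx >= 0 && idx < EXAMPLE_NUM_PARAMS);
    config->real_params[idx] = value;
}

/* C11 generic selection: picks the setter at compile time based on the
 * static type of `value`. This is C only; C++ has no _Generic. */
#define example_set(config, idx, value)                                        \
    _Generic((value),                                                          \
             int: example_set_int,                                             \
             float: example_set_real,                                          \
             double: example_set_real)((config), (idx), (value))
```

With this in place, `example_set(&config, 1, 42)` would resolve to the int setter and `example_set(&config, 2, 0.5)` to the real one, while the exported symbols remain ordinary unmangled C functions.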
jpekkila
|
97d5b2e04a
|
Formatting
|
2019-07-23 14:39:36 +03:00 |
|
jpekkila
|
323d4e3b31
|
Replaced all calls to AC_VTXBUF_IDX with acVertexBufferIdx etc. in all files
|
2019-07-23 14:37:28 +03:00 |
|
jpekkila
|
27f4d1e4ff
|
Added actual functions for getting the size of the vertex buffers etc. The previously used macros are now deprecated. Type safety is the major benefit of using functions instead of definitions.
|
2019-07-23 13:44:43 +03:00 |
|
jpekkila
|
f74df5339f
|
Cleaned up the include directory: removed all unnecessary stuff and moved common definitions to a separate file
|
2019-07-22 19:46:45 +03:00 |
|
jpekkila
|
78aba6428e
|
Updated the copyright years throughout the project
|
2019-07-16 14:28:32 +03:00 |
|
jpekkila
|
93fc121f5c
|
Introduced versions of the asynchronous functions which take a stream as a parameter
|
2019-07-10 15:49:21 +03:00 |
|
jpekkila
|
866ec8a192
|
Removed some old hack I used for benchmarking a while back
|
2019-07-10 14:34:05 +03:00 |
|
jpekkila
|
0bda016e17
|
Reviewed the Astaroth interface. There is now a clear distinction between synchronous and asynchronous functions. For basic usage, we provide a set of functions that are always safe to call (acIntegrate, acLoad, etc.), but because of this they must be quite restricted: e.g. the whole mesh must be loaded at once, and computations cannot be executed concurrently on multiple GPUs. For more advanced users we provide asynchronous functions (such as acLoadWithOffset). Since we cannot know how the asynchronous functions are called (for example, when the integration step has been fully completed and the halos of neighboring subgrids can be safely communicated between GPUs), the responsibility for synchronization must be left to the user. The existing implementations currently use only the basic "safe" set of functions (except in renderer.cc), so the existing functionality has not been changed by these latest commits. Autotests also pass.
|
2019-07-09 18:42:00 +03:00 |
|
jpekkila
|
10a98b01a9
|
Experimental change: now the integration function is automatically optimized during acInit
|
2019-07-09 14:46:24 +03:00 |
|
jpekkila
|
0884c4bf38
|
Moved the definition of acForcingVec to host_forcing.cc since it depends on user parameters that may not be defined in all projects
|
2019-07-04 15:28:18 +03:00 |
|