jpekkila
|
196edac46d
|
Added proper casts to modelsolver.c
|
2020-06-24 17:03:54 +03:00 |
|
jpekkila
|
9840b817d0
|
Added the (hopefully final) basic test case used for the benchmarks
|
2020-06-07 21:59:33 +03:00 |
|
jpekkila
|
17a4f31451
|
Added the latest setup used for benchmarks
|
2020-06-04 20:47:03 +03:00 |
|
jpekkila
|
78fbcc090d
|
Reordered src/core for a cleaner division into host and device code (this is more likely to work when compiling with mpicxx). Disabled separate compilation of CUDA kernels, as it complicates compilation and is a source of many cmake/cuda bugs. As a downside, GPU code takes longer to compile.
|
2020-01-23 20:06:20 +02:00 |
|
jpekkila
|
5e1500fe97
|
Happy new year! :)
|
2020-01-13 21:38:07 +02:00 |
|
jpekkila
|
316d44b843
|
Fixed an out-of-bounds error with auto-optimization (introduced in the last few commits)
|
2019-12-03 16:04:44 +02:00 |
|
jpekkila
|
5a6a3110df
|
Reformatted
|
2019-12-03 15:14:26 +02:00 |
|
jpekkila
|
f14e35620c
|
nvcc is now used to compile kernels only. All host code, including device.cc, MPI communication and others, is now compiled with the host C++ compiler. This should work around an nvcc/MPI bug on Puhti.
|
2019-12-03 15:12:17 +02:00 |
|
jpekkila
|
ab539a98d6
|
Replaced old deprecated instances of DCONST_INT with DCONST
|
2019-11-27 13:48:42 +02:00 |
|
jpekkila
|
08f155cbec
|
Fine-tuning some error checks
|
2019-10-07 20:40:32 +03:00 |
|
jpekkila
|
66cfcefb34
|
More error checks
|
2019-10-07 17:00:23 +03:00 |
|
jpekkila
|
0e1d1b9fb4
|
Some optimizations for DSL compilation. Also a new feature: in-place addition and subtraction (+= and -=) are now allowed
|
2019-10-07 16:33:24 +03:00 |
|
jpekkila
|
f7c079be2a
|
Removed everything unnecessary from integration.cuh. All derivatives etc. are now available in a standard library header (acc/stdlib/stdderiv.h)
|
2019-10-07 15:47:33 +03:00 |
|
jpekkila
|
9e57aba9b7
|
New feature: ScalarArray. ScalarArrays are read-only 1D arrays containing max(mx, max(my, mz)) elements. ScalarArray is a new type of uniform and can be used for storing e.g. forcing profiles. The DSL now also supports complex numbers and some basic arithmetic (exp, multiplication)
|
2019-09-02 21:26:57 +03:00 |
|
jpekkila
|
6ea02fa28e
|
DSL now 'feature complete' with respect to what I had in mind before the summer. Users can now create multiple kernels and the library functions are generated automatically for them. The generated library functions are of the form acDeviceKernel_<name> and acNodeKernel_<name>. More features are needed, though. The next features to be added at some point are 1D and 2D device constant arrays in order to support profiles for e.g. forcing.
|
2019-08-27 18:19:20 +03:00 |
|
jpekkila
|
e89897985e
|
Battled with math.h and cmath. We probably should move from C standard libraries to C++ ones internally (in places which are not visible via the interface)
|
2019-08-19 14:02:30 +03:00 |
|
jpekkila
|
36fea70560
|
Moved basic built-in functions for vector operations to math_utils.h from integration.cuh so that they are shared with the CPU and GPU
|
2019-08-15 11:04:22 +03:00 |
|
jpekkila
|
b53cabbc44
|
Made the DSL syntax less confusing: Input and output arrays are now ScalarFields and VectorFields instead of scalars and vectors. C++ initializers are now also possible, removing the need to declare Fields as int or int3, which was very confusing, like "what, you assign an int value to a real, what the &^%@?"
|
2019-08-08 21:07:36 +03:00 |
|
jpekkila
|
6b53eb31ef
|
Errors with forcing now down from 3 to 1 after switching from fast & inaccurate trig functions to more accurate ones
|
2019-08-06 19:29:40 +03:00 |
|
jpekkila
|
5870081645
|
Split kernels.cuh into bounconds.cuh, integration.cuh and reductions.cuh
|
2019-08-06 17:50:41 +03:00 |
|