jpekkila | cc933a0949 | README.md edited online with Bitbucket. Consistent headings and another attempt at linking. | 2020-01-13 16:26:06 +00:00
jpekkila | b6451c4b82 | Fixed hyperlinks in README.md | 2020-01-13 16:22:22 +00:00
jpekkila | 74f68d4371 | CONTRIBUTING.md created online with Bitbucket | 2020-01-13 16:16:55 +00:00
jpekkila | bd640a8ff5 | Removed unnecessary linebreaks from README.md. | 2020-01-13 15:31:05 +00:00
jpekkila | 785230053d | Rewrote README.md | 2020-01-13 15:27:24 +00:00
jpekkila | 92a6a1bdec | Added more professional run flags to ./ac_run | 2020-01-13 15:35:01 +02:00
jpekkila | 794e4393c3 | Added a new function to the legacy Astaroth layer: acGetNode(). This function returns a Node, which can be used to access acNode layer functions. | 2020-01-13 11:33:15 +02:00
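The hand-off might look roughly like this. A minimal sketch: only acGetNode() and the Node type are named by the commit; acInit(), acQuit(), AcMeshInfo, and acNodeSynchronizeStream() are assumptions about the surrounding API.

```c
#include "astaroth.h" /* assumed public header */

int
main(void)
{
    AcMeshInfo info = {0};   /* mesh configuration (contents elided) */
    acInit(info);            /* legacy-layer setup, signature assumed */

    /* New in this commit: fetch the Node behind the legacy layer... */
    Node node = acGetNode();

    /* ...and pass it to acNode layer functions directly (call assumed). */
    acNodeSynchronizeStream(node, STREAM_ALL);

    acQuit();
    return 0;
}
```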
jpekkila | 1d315732e0 | Giving up on 3D decomposition with CUDA-aware MPI. The MPI implementation on Puhti seems to be painfully bugged: device pointers are not tracked properly in some cases (e.g. when an array of structures contains CUDA pointers). Going to implement the 3D decomposition the traditional way for now (communicating via the CPU). It will be easy to switch back to CUDA-aware MPI once Mellanox/NVIDIA/CSC have fixed their software. | 2020-01-07 21:06:27 +02:00
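"Communicating via the CPU" typically means staging device halos through host buffers so MPI never sees a device pointer. A hedged sketch of one direction of such a transfer; all names and sizes are illustrative, not Astaroth's actual code.

```c
#include <cuda_runtime.h>
#include <mpi.h>

/* Stage a device-side halo through a host buffer before sending. */
static void
isend_halo_via_host(const double* d_halo, double* h_staging, const size_t count,
                    const int peer, MPI_Request* req)
{
    /* Device -> host first; only the host pointer is handed to MPI. */
    cudaMemcpy(h_staging, d_halo, count * sizeof(double),
               cudaMemcpyDeviceToHost);
    MPI_Isend(h_staging, (int)count, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, req);
}
```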
jpekkila | 299ff5cb67 | All fields are now packed to simplify communication | 2020-01-07 21:01:22 +02:00
jpekkila | 5d60791f13 | Current 3D decomp method still too complicated. Starting again from scratch. | 2020-01-07 14:40:51 +02:00
jpekkila | eaee81bf06 | Merge branch 'master' into 3d-decomposition-2020-01 | 2020-01-07 14:25:06 +02:00
jpekkila | f0208c66a6 | Now also compiles for P100 by default (this was accidentally removed in earlier commits) | 2020-01-07 10:29:44 +00:00
jpekkila | 1dbcc469fc | Allocations for packed data (MPI) | 2020-01-05 18:57:14 +02:00
jpekkila | bee930b151 | Merge branch 'master' into 3d-decomposition-2020-01 | 2020-01-05 16:48:26 +02:00
jpekkila | be7946c2af | Added the multiplication operator for int3 structures | 2020-01-05 16:47:28 +02:00
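A component-wise multiplication operator for CUDA's int3 could be defined along these lines; a sketch, the exact definition used in the commit may differ.

```cuda
#include <cuda_runtime.h> // int3, make_int3

// Component-wise multiplication for int3, usable on host and device.
static __host__ __device__ inline int3
operator*(const int3& a, const int3& b)
{
    return make_int3(a.x * b.x, a.y * b.y, a.z * b.z);
}
```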
jpekkila | 51b48a5a36 | Some intermediate MPI changes | 2020-01-05 16:46:40 +02:00
jpekkila | d6c81c89fb | This 3D blocking approach is getting too complicated; removed the code and trying again | 2019-12-28 16:38:10 +02:00
jpekkila | e86b082c98 | MPI transfer for the first corner with 3D blocking now complete. Disabled/enabled some error checking for development | 2019-12-27 13:43:22 +02:00
jpekkila | bd0cc3ee20 | There was some kind of mismatch between the CUDA and MPI (UCX) libraries when linking with cudart. Switching to the one provided by CMake fixed the issue. | 2019-12-27 13:41:18 +02:00
jpekkila | 6b5910f7df | Added allocations for the packed buffers | 2019-12-21 19:00:35 +02:00
jpekkila | 57a1f3e30c | Added a generic pack/unpack function | 2019-12-21 16:20:40 +02:00
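A generic pack function for this kind of halo exchange usually copies a 3D sub-block into a contiguous buffer. A hedged CUDA sketch: the field dimensions mm and the block offset/extent parameters are illustrative, not Astaroth's actual signature.

```cuda
// Copy the sub-block starting at `offset` with size `extent` out of a field
// of dimensions `mm` into a contiguous buffer. Unpacking is the same kernel
// with the src/dst roles swapped.
__global__ void
pack_kernel(const double* __restrict__ field, const int3 mm, const int3 offset,
            const int3 extent, double* __restrict__ packed)
{
    const int i = threadIdx.x + blockIdx.x * blockDim.x;
    const int j = threadIdx.y + blockIdx.y * blockDim.y;
    const int k = threadIdx.z + blockIdx.z * blockDim.z;
    if (i >= extent.x || j >= extent.y || k >= extent.z)
        return;

    const size_t src = (offset.x + i) + (offset.y + j) * (size_t)mm.x +
                       (offset.z + k) * (size_t)mm.x * mm.y;
    const size_t dst = i + j * (size_t)extent.x +
                       k * (size_t)extent.x * extent.y;
    packed[dst] = field[src];
}
```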
jpekkila | e4f7214b3a | benchmark.cc edited online with Bitbucket | 2019-12-21 11:26:54 +00:00
jpekkila | 3ecd47fe8b | Merge branch 'master' into 3d-decomposition-2020-01 | 2019-12-21 13:22:45 +02:00
jpekkila | 35b56029cf | The build failed with single precision; added the correct casts to modelsolver.c | 2019-12-21 13:21:56 +02:00
jpekkila | 4d873caf38 | Changed the utils CMakeLists.txt to modern CMake style | 2019-12-21 13:16:08 +02:00
jpekkila | bad64f5307 | Started the 3D decomposition branch. Four tasks: 1) determine how to distribute the work given n processes, 2) distribute and gather the mesh to/from these processes, 3) create packing/unpacking functions, and 4) transfer packed data blocks between neighbors. Tasks 1 and 2 done with this commit. | 2019-12-21 12:37:01 +02:00
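For task 1, MPI itself offers a balanced factorization of n processes into a 3D grid. A sketch assuming MPI_Dims_create's heuristic is acceptable; the commit does not say which heuristic Astaroth actually uses.

```c
#include <mpi.h>
#include <stdio.h>

int
main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);

    int nprocs;
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    int dims[3] = {0, 0, 0}; /* zeros let MPI choose each dimension */
    MPI_Dims_create(nprocs, 3, dims);
    printf("decomposition: %d x %d x %d\n", dims[0], dims[1], dims[2]);

    MPI_Finalize();
    return 0;
}
```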
jpekkila | ecff5c3041 | Added some final changes to benchmarking | 2019-12-15 21:47:41 +02:00
jpekkila | 8bd81db63c | Added CPU parallelization to make CPU integration and boundconds faster | 2019-12-14 15:45:42 +02:00
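The commit does not name the mechanism, but CPU parallelization of grid loops like these is typically done with OpenMP. A hedged sketch with placeholder dimensions and a placeholder update, not Astaroth's actual integrator.

```c
#include <stddef.h>

/* Parallelize the outer grid loops across CPU threads with OpenMP. */
static void
integrate_cpu(double* field, const int nx, const int ny, const int nz)
{
#pragma omp parallel for collapse(2)
    for (int k = 0; k < nz; ++k)
        for (int j = 0; j < ny; ++j)
            for (int i = 0; i < nx; ++i) {
                const size_t idx = i + j * (size_t)nx + k * (size_t)nx * ny;
                field[idx] += 1.0; /* stand-in for the real stencil update */
            }
}
```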
jpekkila | ff35d78509 | Rewrote the MPI benchmark-verification function | 2019-12-14 15:26:19 +02:00
jpekkila | f0e77181df | Benchmark fine-tuning | 2019-12-14 14:52:06 +02:00
jpekkila | b8a997b0ab | Added code for doing a proper verification run with MPI. Passes nicely with full MHD + upwinding when using the new utility stuff introduced in the previous commits. Note: forcing is not enabled in the utility library by default. | 2019-12-14 07:37:59 +02:00
jpekkila | 277905aafb | Added a model integrator to the utility library (written in pure C). Requires support for AVX vector instructions. | 2019-12-14 07:34:33 +02:00
jpekkila | 22a3105068 | Finished the latest version of autotesting (utility library). Uses ULPs to determine the acceptable error instead of the relative error used previously. | 2019-12-14 07:27:11 +02:00
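ULP-based comparison reinterprets the IEEE-754 bit patterns as ordered integers and bounds their distance in representable values. A sketch of the idea, not necessarily the utility library's exact implementation.

```c
#include <math.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Map a double's bit pattern to an integer whose ordering matches the
 * ordering of the floats (the usual sign-magnitude flip). */
static int64_t
ordered_bits(const double x)
{
    int64_t bits;
    memcpy(&bits, &x, sizeof(bits));
    return bits < 0 ? INT64_MIN - bits : bits;
}

/* Accept a and b if they are at most max_ulps representable doubles apart.
 * (Integer overflow for wildly differing values is ignored in this sketch.) */
static int
within_ulps(const double a, const double b, const int64_t max_ulps)
{
    if (isnan(a) || isnan(b))
        return 0;
    return llabs(ordered_bits(a) - ordered_bits(b)) <= max_ulps;
}
```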
jpekkila | 5ec2f6ad75 | Better wording in config_loader.c | 2019-12-14 07:23:25 +02:00
jpekkila | 164d11bfca | Removed flush-to-zero flags from kernel compilation. No significant effect on performance but may affect accuracy in some cases | 2019-12-14 07:22:14 +02:00
jpekkila | 6b38ef461a | GPUDirect on Puhti fails for some reason if the cuda library is linked instead of cudart | 2019-12-11 17:26:21 +02:00
jpekkila | a1a2d838ea | Merge branch 'master' of https://bitbucket.org/jpekkila/astaroth | 2019-12-08 23:22:51 +02:00
jpekkila | 752f44b0a7 | Second attempt at getting the Bitbucket build to compile | 2019-12-08 23:22:33 +02:00
jpekkila | 420f8b9e06 | The MPI benchmark now writes out the 95th percentile instead of the average running time | 2019-12-08 23:12:23 +02:00
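A 95th-percentile reduction over the timing samples can be as simple as sorting and indexing. A sketch; the nearest-rank index convention used by the actual benchmark is an assumption.

```c
#include <stdlib.h>

static int
cmp_double(const void* a, const void* b)
{
    const double x = *(const double*)a;
    const double y = *(const double*)b;
    return (x > y) - (x < y);
}

/* Sort the timing samples and pick the 95th percentile. */
static double
percentile95(double* samples, const size_t n)
{
    qsort(samples, n, sizeof(double), cmp_double);
    return samples[(size_t)(0.95 * (double)(n - 1))];
}
```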
jpekkila | 90f85069c6 | The Bitbucket Pipelines build fails because the CUDA include dir does not seem to be included for some reason. This is an attempted fix | 2019-12-08 23:08:45 +02:00
jpekkila | 2ab605e125 | Added the default test case for MPI benchmarks | 2019-12-05 18:14:36 +02:00
jpekkila | d136834219 | Re-enabled and updated MPI integration with the proper synchronization from earlier commits; removed obsolete code. Should now work and be ready for benchmarks | 2019-12-05 16:48:45 +02:00
jpekkila | f16826f2cd | Removed old code | 2019-12-05 16:40:48 +02:00
jpekkila | 9f4742bafe | Fixed the UCX warning from the last commit. The indexing of MPI_Waitall was wrong, and UCX also requires that the MPI_Isend requests are waited on explicitly, even though they should complete implicitly at the same time as the matching MPI_Irecv. | 2019-12-05 16:40:30 +02:00
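The fix described here amounts to keeping the send requests alongside the receive requests and waiting on all of them. A hedged sketch: the neighbor count, ranks, buffers, and counts are placeholders, not Astaroth's actual halo-exchange code.

```c
#include <mpi.h>

#define NUM_NEIGHBORS 26 /* assumption: full 3D halo, 26 neighbors */

/* Collect BOTH the MPI_Irecv and MPI_Isend requests and pass every one of
 * them to MPI_Waitall, instead of waiting on the receives only. */
static void
exchange_halos(double* send_buf[], double* recv_buf[],
               const int neighbor[], const int count)
{
    MPI_Request reqs[2 * NUM_NEIGHBORS];
    int nreqs = 0;

    for (int n = 0; n < NUM_NEIGHBORS; ++n) {
        MPI_Irecv(recv_buf[n], count, MPI_DOUBLE, neighbor[n], 0,
                  MPI_COMM_WORLD, &reqs[nreqs++]);
        MPI_Isend(send_buf[n], count, MPI_DOUBLE, neighbor[n], 0,
                  MPI_COMM_WORLD, &reqs[nreqs++]);
    }
    /* Waiting on the sends as well returns their requests to UCX's pool. */
    MPI_Waitall(nreqs, reqs, MPI_STATUSES_IGNORE);
}
```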
jpekkila | e47cfad6b5 | MPI now compiles and runs on Puhti; the basic verification test with boundary transfers is OK. Gives an "UCX WARN object 0x2fa7780 was not returned to mpool ucp_requests" warning, though, which seems to indicate that not all asynchronous MPI calls finished before MPI_Finalize | 2019-12-05 16:17:17 +02:00
jpekkila | 9d70a29ae0 | The minimum CMake version is now 3.9, which is required for proper CUDA & MPI support. Older versions of CMake are very buggy when compiling CUDA, and working around all their quirks is a pain in the neck. | 2019-12-05 15:35:51 +02:00
jpekkila | e99a428dec | OpenMP is now properly linked with the standalone without propagating it to nvcc (which would cause an error) | 2019-12-05 15:30:48 +02:00
jpekkila | 9adb9dc38a | Disabled MPI integration temporarily and enabled verification for MPI tests | 2019-12-04 15:11:40 +02:00
jpekkila | 6a250f0572 | Rewrote the core CMakeLists.txt for CMake versions with proper CUDA & MPI support (3.9+) | 2019-12-04 15:09:38 +02:00
jpekkila | 0ea2fa9337 | Cleaner MPI linking with the core library. Requires CMake 3.9+, though; might have to modify it later to work with older versions. | 2019-12-04 13:49:38 +02:00