Various small improvements to the website (navigation panel, better headings, formatting, etc.)

jpekkila
2020-01-14 14:44:06 +02:00
parent d947bdccb8
commit 37cafd26aa
6 changed files with 47 additions and 72 deletions

@@ -1,6 +1,3 @@
Contributing
============
# Contributing
Contributions to Astaroth are very welcome!

@@ -1,9 +1,4 @@
Astaroth Documentation {#mainpage}
============
![Astaroth Sigil](doc/astaroth_logo.svg)
# Astaroth - A Multi-GPU Library for Generic Stencil Computations
# Astaroth - A Multi-GPU Library for Generic Stencil Computations {#mainpage}
[Specification](doc/Astaroth_API_specification_and_user_manual/API_specification_and_user_manual.md) | [Contributing](CONTRIBUTING.md) | [Licence](LICENCE.md) | [Issue Tracker](https://bitbucket.org/jpekkila/astaroth/issues?status=new&status=open) | [Wiki](https://bitbucket.org/jpekkila/astaroth/wiki/Home)

@@ -1,6 +1,3 @@
Astaroth Specification and User Manual
============
# Astaroth Specification and User Manual
Copyright (C) 2014-2020, Johannes Pekkila, Miikka Vaisala.
@@ -52,17 +49,17 @@ usable via the Astaroth API. While the Astaroth library is written in C++/CUDA,
the C99 standard.
# Publications
## Publications
The foundational work was done in (Väisälä, Pekkilä, 2017) and the library, API and DSL described
in this document were introduced in (Pekkilä, 2019). We kindly ask users of Astaroth to cite
these publications in their work.
> J. Pekkilä, Astaroth: A Library for Stencil Computations on Graphics Processing Units. Master's thesis, Aalto University School of Science, Espoo, Finland, 2019.
> [J. Pekkilä, Astaroth: A Library for Stencil Computations on Graphics Processing Units. Master's thesis, Aalto University School of Science, Espoo, Finland, 2019.](http://urn.fi/URN:NBN:fi:aalto-201906233993)
> M. S. Väisälä, Magnetic Phenomena of the Interstellar Medium in Theory and Observation. PhD thesis, University of Helsinki, Finland, 2017.
> [M. S. Väisälä, Magnetic Phenomena of the Interstellar Medium in Theory and Observation. PhD thesis, University of Helsinki, Finland, 2017.](http://urn.fi/URN:ISBN:978-951-51-2778-5)
> J. Pekkilä, M. S. Väisälä, M. Käpylä, P. J. Käpylä, and O. Anjum, “Methods for compressible fluid simulation on GPUs using high-order finite differences,” Computer Physics Communications, vol. 217, pp. 11–22, Aug. 2017.
> [J. Pekkilä, M. S. Väisälä, M. Käpylä, P. J. Käpylä, and O. Anjum, “Methods for compressible fluid simulation on GPUs using high-order finite differences,” Computer Physics Communications, vol. 217, pp. 11–22, Aug. 2017.](https://doi.org/10.1016/j.cpc.2017.03.011)
@@ -218,9 +215,10 @@ AcResult acDeviceLoadMeshInfo(const Device device, const Stream stream,
const AcMeshInfo device_config);
```
### Integration, Reductions and Boundary Conditions
### Computation
The library provides the following functions for integration, reductions and computing periodic
boundary conditions.
```C
AcResult acDeviceIntegrateSubstep(const Device device, const Stream stream, const int step_number,
const int3 start, const int3 end, const AcReal dt);
@@ -248,7 +246,16 @@ AcResult acNodeReduceVec(const Node node, const Stream stream_type, const Reduct
const VertexBufferHandle vtxbuf2, AcReal* result);
```
### Stream Synchronization
Finally, a library function is automatically generated for each user-specified `Kernel`
function written with the Astaroth DSL:
```C
AcResult acDeviceKernel_##identifier(const Device device, const Stream stream,
const int3 start, const int3 end);
```
Here `##identifier` is replaced with the name of the user-specified kernel. For example, a device
function `Kernel solve()` can be called with `acDeviceKernel_solve()` via the API.
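The naming scheme above can be illustrated with the C preprocessor's token-pasting operator. This is only a sketch under assumptions: the macro `GEN_DEVICE_KERNEL`, the stub definitions of `Device`, `Stream`, `int3` and `AcResult`, and the function body are hypothetical stand-ins, and the actual Astaroth code generator may produce these functions differently.

```C
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for the real Astaroth types. */
typedef struct { int x, y, z; } int3;
typedef struct { int id; } Device;
typedef enum { STREAM_0, STREAM_1, NUM_STREAMS } Stream;
typedef enum { AC_SUCCESS = 0, AC_FAILURE = 1 } AcResult;

/* Hypothetical generator macro: pastes the kernel name onto the
 * acDeviceKernel_ prefix, mirroring the ##identifier notation. */
#define GEN_DEVICE_KERNEL(identifier)                                        \
    AcResult acDeviceKernel_##identifier(const Device device,                \
                                         const Stream stream,                \
                                         const int3 start, const int3 end)   \
    {                                                                        \
        (void)stream; /* unused in this sketch */                            \
        printf("device %d: %s over z in [%d, %d)\n", device.id, #identifier, \
               start.z, end.z);                                              \
        return AC_SUCCESS;                                                   \
    }

/* Expands to the definition of acDeviceKernel_solve(). */
GEN_DEVICE_KERNEL(solve)
```

With this in place, `acDeviceKernel_solve(device, STREAM_0, start, end)` resolves to the generated function.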
## Stream Synchronization
All library functions that take a `Stream` as a parameter are asynchronous. When calling these
functions, control returns immediately back to the host even if the called device function has not
@@ -270,13 +277,20 @@ synchronized at once by passing the alias `STREAM_ALL` to the synchronization fu
Usage of streams is demonstrated with the following example.
```C
funcA(STREAM_0);
funcB(STREAM_0); // Blocks until funcA has completed
funcC(STREAM_1); // May execute in parallel with funcB
funcB(STREAM_0); // Blocks until funcA has completed
funcC(STREAM_1); // May execute in parallel with funcB
barrierSynchronizeStream(STREAM_ALL); // Blocks until functions in all streams have completed
funcD(STREAM_2); // Is started when control returns from barrierSynchronizeStream()
funcD(STREAM_2); // Is started when control returns from barrierSynchronizeStream()
```
### Data Synchronization
The Astaroth API provides the following functions for barrier synchronization.
```C
AcResult acSynchronize(void);
AcResult acNodeSynchronizeStream(const Node node, const Stream stream);
AcResult acDeviceSynchronizeStream(const Device device, const Stream stream);
```
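The effect of synchronizing one stream versus `STREAM_ALL` can be modelled on the host without a GPU. The sketch below is a mock, not the library: `DeviceState`, `mockEnqueue`, `mockDeviceSynchronizeStream` and `mockPendingWork` are hypothetical names, and the enum values are assumed for illustration. It only counts queued operations per stream to show which work a barrier waits for.

```C
#include <assert.h>

/* Hypothetical stand-ins; the real library defines these types itself. */
typedef enum { AC_SUCCESS = 0, AC_FAILURE = 1 } AcResult;
typedef enum { STREAM_0, STREAM_1, STREAM_2, NUM_STREAMS, STREAM_ALL } Stream;
typedef struct { int pending[NUM_STREAMS]; } DeviceState;
typedef DeviceState* Device;

/* Mock asynchronous call: only records that work was queued on a stream. */
AcResult mockEnqueue(const Device device, const Stream stream)
{
    const int s = (int)stream;
    if (s < 0 || s >= NUM_STREAMS)
        return AC_FAILURE;
    device->pending[s] += 1;
    return AC_SUCCESS;
}

/* Mock of a stream barrier: "waits" by clearing the queued work of one
 * stream, or of every stream when STREAM_ALL is passed. */
AcResult mockDeviceSynchronizeStream(const Device device, const Stream stream)
{
    const int s = (int)stream;
    if (stream == STREAM_ALL) {
        for (int i = 0; i < NUM_STREAMS; ++i)
            device->pending[i] = 0;
    } else if (s >= 0 && s < NUM_STREAMS) {
        device->pending[s] = 0;
    } else {
        return AC_FAILURE;
    }
    return AC_SUCCESS;
}

/* Returns the total number of operations still queued on the device. */
int mockPendingWork(const Device device)
{
    int total = 0;
    for (int i = 0; i < NUM_STREAMS; ++i)
        total += device->pending[i];
    return total;
}
```

In this model, synchronizing `STREAM_0` leaves work queued on `STREAM_1` untouched, while `STREAM_ALL` drains every stream.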
## Data Synchronization
Stream synchronization works in the same fashion on node and device layers. However, on the node
layer, one has to take into account that a portion of the mesh is shared between devices and that the
@@ -296,7 +310,7 @@ AcResult acNodeSynchronizeVertexBuffer(const Node node, const Stream stream,
> **NOTE**: Local halos must be up to date before synchronizing the data. Local halos are the grid points outside the computational domain that are used only by a single device. The mesh is distributed to multiple devices by blocking along the z axis. If there are *n* devices and the z-dimension of the computational domain is *nz*, then each device is assigned *nz / n* two-dimensional planes. For example, with two devices, the data block that has to be up to date ranges from *(0, 0, nz)* to *(mx, my, nz + 2 * NGHOST)*.
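The z-axis blocking described in the note can be sketched as a small helper that returns the planes owned by device *i*. The type `ZRange` and the function `deviceZRange` are hypothetical and not part of the Astaroth API. The sketch assumes that *nz* is divisible by *n*, because the text does not say how a remainder would be handled, and the indices refer to the computational domain without ghost zones.

```C
#include <assert.h>

/* Half-open range [z0, z1) of xy-planes, in computational-domain indices
 * (ghost zones excluded). Hypothetical helper type. */
typedef struct { int z0, z1; } ZRange;

/* Splits nz planes evenly among n devices along the z axis and returns
 * the range owned by device i. Assumes nz is divisible by n. */
ZRange deviceZRange(const int nz, const int n, const int i)
{
    assert(n > 0 && nz % n == 0);
    assert(i >= 0 && i < n);
    const int planes = nz / n; /* planes per device */
    const ZRange range = {i * planes, (i + 1) * planes};
    return range;
}
```

For instance, `deviceZRange(128, 2, 1)` describes the second of two devices sharing a domain with 128 planes in z.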
### Input and Output Buffers
## Input and Output Buffers
The mesh is duplicated to input and output buffers for performance reasons. The input buffers are
read-only in user-specified compute kernels, which allows us to read them via the texture cache
@@ -357,14 +371,14 @@ Meshes are the primary structures for passing information to the library and ker
of a `Mesh` is declared as
```C
typedef struct {
int int_params[NUM_INT_PARAMS];
int3 int3_params[NUM_INT3_PARAMS];
AcReal real_params[NUM_REAL_PARAMS];
int int_params[NUM_INT_PARAMS];
int3 int3_params[NUM_INT3_PARAMS];
AcReal real_params[NUM_REAL_PARAMS];
AcReal3 real3_params[NUM_REAL3_PARAMS];
} AcMeshInfo;
typedef struct {
AcReal* vertex_buffer[NUM_VTXBUF_HANDLES];
AcReal* vertex_buffer[NUM_VTXBUF_HANDLES];
AcMeshInfo info;
} AcMesh;
```
@@ -415,45 +429,7 @@ Let *i* be the device id. The portion of the halos shared by neighboring devices
`acNodeSynchronizeVertexBuffer` and `acNodeSynchronizeMesh` communicate these shared areas among
the devices in the node.
## Integration, Reductions and Boundary Conditions
The library provides the following functions for integration, reductions and computing periodic
boundary conditions.
```C
AcResult acDeviceIntegrateSubstep(const Device device, const Stream stream, const int step_number,
const int3 start, const int3 end, const AcReal dt);
AcResult acDevicePeriodicBoundcondStep(const Device device, const Stream stream,
const VertexBufferHandle vtxbuf_handle, const int3 start,
const int3 end);
AcResult acDevicePeriodicBoundconds(const Device device, const Stream stream, const int3 start,
const int3 end);
AcResult acDeviceReduceScal(const Device device, const Stream stream, const ReductionType rtype,
const VertexBufferHandle vtxbuf_handle, AcReal* result);
AcResult acDeviceReduceVec(const Device device, const Stream stream_type, const ReductionType rtype,
const VertexBufferHandle vtxbuf0, const VertexBufferHandle vtxbuf1,
const VertexBufferHandle vtxbuf2, AcReal* result);
AcResult acNodeIntegrateSubstep(const Node node, const Stream stream, const int step_number,
const int3 start, const int3 end, const AcReal dt);
AcResult acNodeIntegrate(const Node node, const AcReal dt);
AcResult acNodePeriodicBoundcondStep(const Node node, const Stream stream,
const VertexBufferHandle vtxbuf_handle);
AcResult acNodePeriodicBoundconds(const Node node, const Stream stream);
AcResult acNodeReduceScal(const Node node, const Stream stream, const ReductionType rtype,
const VertexBufferHandle vtxbuf_handle, AcReal* result);
AcResult acNodeReduceVec(const Node node, const Stream stream_type, const ReductionType rtype,
const VertexBufferHandle vtxbuf0, const VertexBufferHandle vtxbuf1,
const VertexBufferHandle vtxbuf2, AcReal* result);
```
Finally, a library function is automatically generated for each user-specified `Kernel`
function written with the Astaroth DSL:
```C
AcResult acDeviceKernel_##identifier(const Device device, const Stream stream,
const int3 start, const int3 end);
```
Here `##identifier` is replaced with the name of the user-specified kernel. For example, a device
function `Kernel solve()` can be called with `acDeviceKernel_solve()` via the API.
> **NOTE:** The decomposition scheme is subject to change.
# Astaroth Domain-Specific Language

BIN
doc/astaroth_logo_small.png Normal file

Binary file not shown.


@@ -38,7 +38,7 @@ PROJECT_NAME = "Astaroth"
# could be handy for archiving the generated documentation or if some version
# control system is used.
PROJECT_NUMBER =
PROJECT_NUMBER = 2.1
# Using the PROJECT_BRIEF tag one can provide an optional one line description
# for a project that appears at the top of each page and should give viewer a
@@ -51,7 +51,7 @@ PROJECT_BRIEF =
# pixels and the maximum width should not exceed 200 pixels. Doxygen will copy
# the logo to the output directory.
PROJECT_LOGO =
PROJECT_LOGO = doc/astaroth_logo_small.png
# The OUTPUT_DIRECTORY tag is used to specify the (relative or absolute) path
# into which the generated documentation will be written. If a relative path is
@@ -242,7 +242,7 @@ TCL_SUBST =
# members will be omitted, etc.
# The default value is: NO.
OPTIMIZE_OUTPUT_FOR_C = NO
OPTIMIZE_OUTPUT_FOR_C = YES
# Set the OPTIMIZE_OUTPUT_JAVA tag to YES if your project consists of Java or
# Python sources only. Doxygen will then generate output that is more tailored
@@ -1187,7 +1187,7 @@ HTML_TIMESTAMP = NO
# The default value is: NO.
# This tag requires that the tag GENERATE_HTML is set to YES.
HTML_DYNAMIC_SECTIONS = NO
HTML_DYNAMIC_SECTIONS = YES
# With HTML_INDEX_NUM_ENTRIES one can control the preferred number of entries
# shown in the various tree structured indices initially; the user can expand
@@ -1416,7 +1416,7 @@ DISABLE_INDEX = NO
# The default value is: NO.
# This tag requires that the tag GENERATE_HTML is set to YES.
GENERATE_TREEVIEW = NO
GENERATE_TREEVIEW = YES
# The ENUM_VALUES_PER_LINE tag can be used to set the number of enum values that
# doxygen will group on one line in the generated HTML documentation.

@@ -16,6 +16,13 @@
You should have received a copy of the GNU General Public License
along with Astaroth. If not, see <http://www.gnu.org/licenses/>.
*/
/**
* @file Single-Device Interface
* \brief Provides functions for controlling a single device.
*
* Detailed info.
*
*/
#pragma once
#ifdef __cplusplus