# perfect

| Branch | Status |
|-|-|
| master |[![Build Status](https://img.shields.io/endpoint.svg?url=https%3A%2F%2Factions-badge.atrox.dev%2Fcwpearson%2Fperfect%2Fbadge%3Fref%3Dmaster&style=flat)](https://actions-badge.atrox.dev/cwpearson/perfect/goto?ref=master)|

CPU/GPU performance control library for benchmarking

* x86
* POWER
* Nvidia

## Features

- [x] GPU power/utilization/temperature monitoring (nvidia)
- [x] Disable CPU turbo (linux)
- [x] Set OS CPU performance mode to maximum (linux)
- [x] Set GPU clocks (nvidia)
- [x] Disable GPU turbo (nvidia)
- [x] Flush addresses from cache (amd64, POWER)
- [x] CUDA not required (GPU functions will not be compiled)
- [x] Flush file system caches (linux)

## Installing

### CMake

Ensure you have CMake 3.13+.

Add the source tree to your project and then use `add_subdirectory`.

```
git submodule add git@github.com:cwpearson/perfect.git thirdparty/perfect
```

`CMakeLists.txt`
```
...
add_subdirectory(thirdparty/perfect)
...
target_link_libraries(your-target perfect)
```

### Without CMake

Download the source **AND**

* for compiling with a non-CUDA compiler:
  * add the include directory to your includes
  * add `nvidia-ml` to your link flags
  * add `-DPERFECT_HAS_CUDA` to your compile definitions
* with a CUDA compiler, just compile normally (`PERFECT_HAS_CUDA` is defined for you)

```
g++ code_using_perfect.cpp -DPERFECT_HAS_CUDA -Iperfect/include -lnvidia-ml
nvcc code_using_perfect.cu -Iperfect/include -lnvidia-ml
```

If you don't have CUDA, you can just do

```
g++ code_using_perfect.cpp -I perfect/include
```

## Usage

The `perfect` functions all return a `perfect::Result`, which is defined in [include/perfect/result.hpp](include/perfect/result.hpp).
When things are working, it will be `perfect::Result::SUCCESS`.
A `PERFECT` macro is also defined, which will terminate with an error message unless the `perfect::Result` is `perfect::Result::SUCCESS`.

```c++
perfect::CpuTurboState state;
PERFECT(perfect::get_cpu_turbo_state(&state));
```

## Monitoring

`perfect` can monitor and record GPU activity.

See [examples/gpu_monitor.cu](examples/gpu_monitor.cu).

```c++
#include "perfect/gpu_monitor.hpp"
```

* `Monitor(std::ostream *stream)`: create a monitor that will write to `stream`.
* `void Monitor::start()`: start the monitor
* `void Monitor::stop()`: terminate the monitor
* `void Monitor::pause()`: pause the monitor thread
* `void Monitor::resume()`: resume the monitor thread

### Flush file system caches

`perfect` can drop various filesystem caches.

See [tools/sync_drop_caches.cpp](tools/sync_drop_caches.cpp).

```c++
#include "perfect/drop_caches.hpp"
```

* `Result sync()`: flush filesystem caches to disk
* `Result drop_caches(DropCaches_t mode)`: remove file system caches
  * `mode = PAGECACHE`: drop page caches
  * `mode = ENTRIES`: drop dentries and inodes
  * `mode = PAGECACHE | ENTRIES`: both

### CPU Turbo

`perfect` can enable and disable CPU boost through the Intel p-state mechanism or the ACPI cpufreq mechanism.

See [examples/cpu_turbo.cpp](examples/cpu_turbo.cpp).

```c++
#include "perfect/cpu_turbo.hpp"
```

* `Result get_cpu_turbo_state(CpuTurboState *state)`: save the current CPU turbo state
* `Result set_cpu_turbo_state(CpuTurboState *state)`: restore a saved CPU turbo state
* `Result disable_cpu_turbo()`: disable CPU turbo
* `Result enable_cpu_turbo()`: enable CPU turbo
* `bool is_turbo_enabled(CpuTurboState state)`: check if turbo is enabled

### OS Performance

`perfect` can control the OS governor on linux.

See [examples/os_perf.cpp](examples/os_perf.cpp).

```c++
#include "perfect/os_perf.hpp"
```

* `Result get_os_perf_state(OsPerfState *state, const int cpu)`: Save the current OS governor mode for CPU `cpu`.
* `Result os_perf_state_maximum(const int cpu)`: Set the OS governor to its maximum performance mode.
* `Result set_os_perf_state(const int cpu, OsPerfState state)`: Restore a previously-saved OS governor mode.

### GPU Turbo

`perfect` can enable/disable GPU turbo boost.

See [examples/gpu_turbo.cu](examples/gpu_turbo.cu).

```c++
#include "perfect/gpu_turbo.hpp"
```

* `Result get_gpu_turbo_state(GpuTurboState *state, unsigned int idx)`: Get the current turbo state for GPU `idx`, useful to restore later.
* `bool is_turbo_enabled(GpuTurboState state)`: Check if turbo is enabled.
* `Result set_gpu_turbo_state(GpuTurboState state, unsigned int idx)`: Set a previously saved turbo state.
* `Result disable_gpu_turbo(unsigned int idx)`: Disable GPU `idx` turbo.
* `Result enable_gpu_turbo(unsigned int idx)`: Enable GPU `idx` turbo.

### GPU Clocks

`perfect` can lock GPU clocks to their maximum values.

See [examples/gpu_clocks.cu](examples/gpu_clocks.cu).

```c++
#include "perfect/gpu_clocks.hpp"
```

* `Result set_max_gpu_clocks(unsigned int idx)`: Set GPU `idx` clocks to their maximum reported values.
* `Result reset_gpu_clocks(unsigned int idx)`: Unset GPU `idx` clocks.

### CPU Cache

`perfect` can flush data from CPU caches. Unlike the other APIs, these do not return a `Result` because they do not fail.

See [examples/cpu_cache.cpp](examples/cpu_cache.cpp).

```c++
#include "perfect/cpu_cache.hpp"
```

* `void flush_all(void *p, const size_t n)`: Flush all cache lines starting at `p` for `n` bytes.

## Changelog

* v0.3.0
  * Add filesystem cache interface
* v0.2.0
  * add GPU monitoring
  * Make CUDA optional
* v0.1.0
  * cache control
  * Intel P-State control
  * linux governor control
  * POWER cpufreq control
  * Nvidia GPU boost control
  * Nvidia GPU clock control

## Wish List

- [ ] only monitor certain GPUs
- [ ] A wrapper utility
- [ ] disable hyperthreading
- [ ] reserve cores
- [ ] set process priority
- [ ] disable ASLR

## Related

* [LLVM benchmarking instructions](https://llvm.org/docs/Benchmarking.html#linux) covering ASLR, Linux governor, cpuset shielding, SMT, and Intel turbo.
* [easyperf.net](https://easyperf.net/blog/2019/08/02/Perf-measurement-environment-on-Linux#2-disable-hyper-threading) blog post discussing ACPI/Intel turbo, SMT, Linux governor, CPU affinity, process priority, file system caches, and ASLR.
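
## Example: scoped save/disable/restore

The CPU Turbo functions documented above are naturally used as a save, disable, restore sequence around the code being measured. The sketch below shows one way that sequence could be wrapped in an RAII guard so the saved state is restored even on an early return. It is not part of `perfect`: the `stub` namespace is a hypothetical stand-in that only imitates the documented signatures so the example compiles without the library, and `ScopedTurboOff`, `run_with_turbo_off`, `turbo_flag`, `Result::UNKNOWN`, and the `enabled` field are all assumptions made for illustration.

```c++
#include <cassert>

// Hypothetical stand-in for namespace `perfect`. It mirrors the signatures
// listed in the CPU Turbo section but only toggles an in-memory flag.
namespace stub {

enum class Result { SUCCESS, UNKNOWN }; // UNKNOWN is an assumed error value

struct CpuTurboState {
  bool enabled; // assumed field; the real struct's contents are not shown here
};

// Simulated machine-wide turbo setting (the real library queries the OS).
inline bool &turbo_flag() {
  static bool flag = true;
  return flag;
}

inline Result get_cpu_turbo_state(CpuTurboState *state) {
  if (state == nullptr) return Result::UNKNOWN;
  state->enabled = turbo_flag();
  return Result::SUCCESS;
}

inline Result set_cpu_turbo_state(CpuTurboState *state) {
  if (state == nullptr) return Result::UNKNOWN;
  turbo_flag() = state->enabled;
  return Result::SUCCESS;
}

inline Result disable_cpu_turbo() {
  turbo_flag() = false;
  return Result::SUCCESS;
}

} // namespace stub

// Saves the turbo state and disables turbo on construction; restores the
// saved state on destruction, provided the save itself succeeded.
class ScopedTurboOff {
public:
  ScopedTurboOff() {
    saved_ok_ = stub::get_cpu_turbo_state(&saved_) == stub::Result::SUCCESS;
    disabled_ = saved_ok_ && stub::disable_cpu_turbo() == stub::Result::SUCCESS;
  }
  ~ScopedTurboOff() {
    if (saved_ok_) stub::set_cpu_turbo_state(&saved_);
  }
  ScopedTurboOff(const ScopedTurboOff &) = delete;
  ScopedTurboOff &operator=(const ScopedTurboOff &) = delete;

  // True if turbo was successfully disabled for this scope.
  bool ok() const { return disabled_; }

private:
  stub::CpuTurboState saved_{};
  bool saved_ok_ = false;
  bool disabled_ = false;
};

// Runs a placeholder benchmark body with turbo off. Returns true if the
// original turbo setting is back in place once the guard goes out of scope.
inline bool run_with_turbo_off() {
  const bool before = stub::turbo_flag();
  {
    ScopedTurboOff guard;
    assert(guard.ok());
    assert(!stub::turbo_flag()); // turbo is off inside the guarded scope
    // ... code under measurement would go here ...
  }
  return stub::turbo_flag() == before;
}
```

With the real library, the `stub::` calls would presumably be replaced by their `perfect::` counterparts and the results checked against `perfect::Result::SUCCESS`. The same guard shape should carry over to the OS governor and GPU turbo functions, which follow the same get/set pattern.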