heavy: reduce by 256 threads default intensity to all -i 20
cuda: put static thread init bools outside the code (made once)
api: fix nvml header to build without
nvml.dll doesnt exists for 32bit binaries! use nvapi to get infos
seems to have more/different features than NVML... like pstate etc..
This is nvapi r343 : https://developer.nvidia.com/nvapi
Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
Based on mwhite73 <marvin.white@gmail.com> implementation
Linked to the api system
Also fix Makefile to support standard c++ files
This prevent nvcc use without device code
Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>