Vector Addition with Kokkos

Performs vector addition in parallel, utilizing Kokkos for within-node and
Charm++ to run multiple processes (i.e. logical nodes) that can be executed
in a distributed memory environment.

Default Kokkos execution is OpenMP, but can be changed to use the available
GPU devices instead (with CUDA) by providing '-g' as a command line argument.

Requires Kokkos to be built for both OpenMP and CUDA.
e.g. From Kokkos source folder,
> mkdir build-cuda
> cd build-cuda
> generate_makefile.bash --prefix=<absolute path to build-cuda>
                         --with-cuda=<path to CUDA toolkit>
                         --with-cuda-options=enable_lambda
                         --with-openmp --arch=BDW,Pascal60
                         --compiler=<path to included NVCC wrapper>
> make -j kokkoslib
> make install

Path to OpenMP + CUDA build of Kokkos should be set in Makefile.common.
