Setting Up HIP with CUDA
Some notes on setting up CUDA and HIP on a system with either an NVIDIA GPU or an AMD GPU.
Understanding sparse matrix formats with simple Python code. Representing sparse matrices in the ELLPACK format lets us optimize the memory access pattern of the HPCG benchmark.
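As a rough illustration of the ELLPACK idea, here is a minimal pure-Python sketch. Every row is padded to the same number of stored entries, so the values and column indices form two regular rectangular arrays. The function names `dense_to_ell` and `ell_spmv`, the choice of padding column 0, and the small test matrix are assumptions made for this example, not code from HPCG.

```python
def dense_to_ell(dense, pad_col=0):
    """Convert a dense matrix (list of rows) into ELLPACK arrays.

    Returns (values, cols), each with one fixed-width row per matrix row.
    Padded slots hold value 0.0 and the hypothetical padding column pad_col.
    """
    # The ELLPACK width is the largest number of nonzeros in any row.
    width = max(sum(1 for v in row if v != 0) for row in dense)
    values, cols = [], []
    for row in dense:
        nonzeros = [(j, v) for j, v in enumerate(row) if v != 0]
        pad = width - len(nonzeros)
        cols.append([j for j, _ in nonzeros] + [pad_col] * pad)
        values.append([v for _, v in nonzeros] + [0.0] * pad)
    return values, cols


def ell_spmv(values, cols, x):
    """Compute y = A @ x for a matrix stored in ELLPACK form."""
    # Every row does the same amount of work; padded slots contribute 0.
    return [
        sum(v * x[j] for v, j in zip(row_values, row_cols))
        for row_values, row_cols in zip(values, cols)
    ]


# Small tridiagonal test matrix (an assumption for illustration only).
A = [
    [4.0, -1.0, 0.0, 0.0],
    [-1.0, 4.0, -1.0, 0.0],
    [0.0, -1.0, 4.0, -1.0],
    [0.0, 0.0, -1.0, 4.0],
]
values, cols = dense_to_ell(A)
y = ell_spmv(values, cols, [1.0, 2.0, 3.0, 4.0])
print(y)  # [2.0, 4.0, 6.0, 13.0]
```

Because every row has the same length, a GPU implementation can lay the arrays out so that neighboring threads read neighboring memory locations. The price is the storage and work spent on padding when row lengths vary a lot.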
In the reference implementation of HPCG, the SYMGS (symmetric Gauss-Seidel) computation cannot be fully parallelized. However, we can use a multi-coloring technique to parallelize it, at the cost of relaxing the algorithm's sequential update order.
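The multi-coloring idea can be sketched in a few lines of Python. Rows are colored so that no two coupled rows share a color, which means all rows of one color can be updated independently of each other. This is a simplified sketch, not the HPCG implementation: the names `greedy_coloring` and `colored_gs_sweep` are hypothetical, the matrix is dense and assumed to have a symmetric nonzero pattern, and only a forward sweep is shown (a symmetric sweep would follow it with a pass over the colors in reverse order).

```python
def greedy_coloring(A):
    """Assign each row the smallest color not used by a coupled row.

    Assumes A (a list of rows) has a symmetric nonzero pattern.
    """
    n = len(A)
    colors = [-1] * n
    for i in range(n):
        used = {
            colors[j]
            for j in range(n)
            if j != i and A[i][j] != 0 and colors[j] >= 0
        }
        c = 0
        while c in used:
            c += 1
        colors[i] = c
    return colors


def colored_gs_sweep(A, b, x, colors):
    """One forward Gauss-Seidel sweep, processed color by color."""
    n = len(A)
    x = list(x)
    for c in range(max(colors) + 1):
        rows = [i for i in range(n) if colors[i] == c]
        # Rows of the same color are not coupled, so these updates are
        # independent; on a GPU each row could be handled by its own thread.
        updates = {}
        for i in rows:
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            updates[i] = (b[i] - s) / A[i][i]
        for i, value in updates.items():
            x[i] = value
    return x


# Small diagonally dominant test system (an assumption for illustration).
A = [
    [4.0, -1.0, 0.0, 0.0],
    [-1.0, 4.0, -1.0, 0.0],
    [0.0, -1.0, 4.0, -1.0],
    [0.0, 0.0, -1.0, 4.0],
]
b = [2.0, 4.0, 6.0, 13.0]  # chosen so the exact solution is [1, 2, 3, 4]
colors = greedy_coloring(A)
x = [0.0] * len(A)
for _ in range(50):
    x = colored_gs_sweep(A, b, x, colors)
```

For this tridiagonal matrix the greedy coloring alternates between two colors, which is the classic red-black ordering. The rows are visited in a different order than in a plain sequential sweep, so the iterates differ from the sequential ones.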