Two novel cache management mechanisms on CPU-GPU heterogeneous processors

Authors: Huijing Yang; Tingwen Yu


Abstract

Heterogeneous multicore processors that integrate CPUs and GPUs on the same chip pose an emerging challenge for sharing on-chip resources, particularly the Last-Level Cache (LLC). Because GPU cores offer high parallelism and tolerate memory latency well, GPU applications occupy the majority of the LLC space. Under current cache management policies, the LLC share of CPU applications can be markedly reduced by co-running GPU workloads, seriously degrading overall performance. To alleviate the unfair contention between CPUs and GPUs for cache capacity, we propose two novel cache management mechanisms: a static cache partitioning scheme based on an adaptive replacement policy (SARP) and a dynamic cache partitioning scheme based on GPU miss awareness (DGMA). The SARP scheme first partitions the cache ways between CPUs and GPUs and then applies an adaptive cache replacement policy according to the type of the request. The DGMA scheme monitors the GPU's cache performance metrics at run time and uses an appropriate threshold to dynamically adjust how the shared LLC is divided across different kernels. Experimental results show that the SARP mechanism improves CPU performance by up to 32.6% and by 8.4% on average. The DGMA scheme improves CPU performance by up to 18.1% and by 7.7% on average while ensuring that GPU performance is not affected.
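As an illustration only (the paper does not publish its implementation), the following minimal C++ sketch shows one way a DGMA-style controller could be organized: it samples the GPU's LLC miss rate each interval and compares it against a threshold to shift LLC ways between the GPU and CPU partitions. The class and member names, the sampling interface, and the threshold logic are all assumptions made for exposition.

```cpp
// Hypothetical sketch of a DGMA-style dynamic way-partitioning controller.
// All names, thresholds, and intervals are illustrative assumptions,
// not the mechanism published in the paper.
#include <cstdint>
#include <algorithm>

struct LLCStats {
    uint64_t gpuAccesses = 0;   // GPU requests reaching the LLC this interval
    uint64_t gpuMisses   = 0;   // GPU misses in the LLC this interval
};

class DynamicWayPartitioner {
public:
    DynamicWayPartitioner(int totalWays, int minGpuWays, double missThreshold)
        : totalWays_(totalWays), gpuWays_(totalWays / 2),
          minGpuWays_(minGpuWays), missThreshold_(missThreshold) {}

    // Called once per sampling interval with the GPU's LLC counters.
    // If the GPU miss rate stays below the threshold, the GPU is tolerating
    // memory latency well, so one way is handed back to the CPU partition;
    // otherwise the GPU regains a way, within the total way budget.
    void onIntervalEnd(const LLCStats& s) {
        if (s.gpuAccesses == 0) return;
        double missRate = static_cast<double>(s.gpuMisses) / s.gpuAccesses;
        if (missRate < missThreshold_)
            gpuWays_ = std::max(minGpuWays_, gpuWays_ - 1);
        else
            gpuWays_ = std::min(totalWays_ - 1, gpuWays_ + 1);
    }

    int gpuWays() const { return gpuWays_; }               // ways reserved for GPU lines
    int cpuWays() const { return totalWays_ - gpuWays_; }  // remaining ways for CPU lines

private:
    int totalWays_;
    int gpuWays_;
    int minGpuWays_;
    double missThreshold_;
};
```

In hardware, such a controller would drive per-partition way masks in the LLC replacement logic rather than a software counter, but the sampling-and-threshold control loop sketched above conveys the general idea.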

Keywords: heterogeneous; multicore; CPU-GPU; cache partitioning.

 

Research Briefs on Information & Communication Technology Evolution (ReBICTE)
Vol. 7, No. 1, pp. 1-8, June 15, 2021

DOI: 10.22667/ReBiCTE.2021.06.15.001