Evaluating the soft error sensitivity of a GPU-based SoC for matrix multiplication

Authors: León, Germán; Badía, José; Belloch, Jose A.; Lindoso, Almudena; Entrena, Luis
Subject:
GPU; Soft Errors; Sensitivity; Fault injection
Record type:
article in journal/newspaper
Language:
English

Additional information
- Publication data:
  Elsevier
- Publication year:
  2021
- Collection:
  Repositori Universitat Jaume I (Repositorio UJI)
- Abstract:
  System-on-Chip (SoC) devices can combine low-power multicore processors with a small graphics accelerator (GPU), offering a trade-off between computational capacity and low power consumption. In this work we use the LLFI-GPU fault injection tool on one of these devices to compare the soft-error sensitivity of two different CUDA versions of a matrix multiplication benchmark. Specifically, we perform fault injection campaigns on a Jetson TK1 development kit, a board equipped with a SoC that includes an NVIDIA "Kepler" Graphics Processing Unit (GPU). We evaluate how the problem size and the thread-block size affect the behaviour of the algorithms. Our results show that the block version of the matrix multiplication benchmark, which leverages the shared memory of the GPU, is not only faster than the element-wise version but also much more resilient to soft errors. We also use the cuda-gdb debugger to analyze the main causes of the crashes that soft errors produce in the code. Our experiments show that most of the errors are due to accesses to invalid positions in the different memories of the GPU, with the block version suffering a higher percentage of this kind of error.
- File Description:
  application/pdf
- ISSN:
  0026-2714
- Relation:
  Microelectronics Reliability, 2020, vol. 114.; https://www.sciencedirect.com/science/article/pii/S0026271420304558; LEÓN, Germán, et al. Evaluating the soft error sensitivity of a GPU-based SoC for matrix multiplication. Microelectronics Reliability, 2020, vol. 114, p. 113856.; http://hdl.handle.net/10234/192517; https://doi.org/10.1016/j.microrel.2020.113856
- Identifier:
  10.1016/j.microrel.2020.113856
- Online access:
  https://doi.org/10.1016/j.microrel.2020.113856
  http://hdl.handle.net/10234/192517
- Identifier:
  edsbas.2E5EDE5C
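
The abstract describes injecting soft errors into a matrix multiplication and finding that many crashes come from accesses to invalid memory positions. As a rough illustration of that idea only, the following CPU-side Python sketch flips a single bit in either an operand or an array index and compares the result with a fault-free ("golden") run. It does not reproduce LLFI-GPU, CUDA, or the paper's benchmarks; the helper names (`flip_bit`, `matmul`, `outcome`, `with_data_fault`), the tiny 2x2 matrices, and the outcome labels ("masked", "sdc" for silent data corruption, "crash") are assumptions made for this example.

```python
import struct


def flip_bit(value, bit):
    """Flip one bit (0 = least significant) of the float32 encoding of value."""
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    (flipped,) = struct.unpack("<f", struct.pack("<I", raw ^ (1 << bit)))
    return flipped


def matmul(a, b, bad_index=None):
    """Element-wise matrix product c = a * b.

    bad_index = (i, j, p, bit) is a hypothetical hook: at step (i, j, p) one
    bit of the row index used to read b is flipped, mimicking a corrupted address.
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for p in range(k):
                q = p
                if bad_index is not None and bad_index[:3] == (i, j, p):
                    q = p ^ (1 << bad_index[3])
                c[i][j] += a[i][p] * b[q][j]
    return c


def outcome(golden, run):
    """Classify one faulty run against the golden output."""
    try:
        observed = run()
    except IndexError:  # stands in for an invalid memory access
        return "crash"
    return "masked" if observed == golden else "sdc"


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [0.0, 0.0]]  # zero row: faults in column 1 of a can be masked
golden = matmul(a, b)


def with_data_fault(row, col, bit):
    """Return a copy of a with one bit flipped in a[row][col]."""
    faulty = [r[:] for r in a]
    faulty[row][col] = flip_bit(faulty[row][col], bit)
    return faulty


results = {
    "sign bit of a[0][0]": outcome(golden, lambda: matmul(with_data_fault(0, 0, 31), b)),
    "low mantissa bit of a[0][1]": outcome(golden, lambda: matmul(with_data_fault(0, 1, 0), b)),
    "bit 3 of an index into b": outcome(golden, lambda: matmul(a, b, bad_index=(0, 0, 1, 3))),
}
print(results)
```

In this toy setting the flipped sign bit changes the output silently, the mantissa flip is masked because the affected operand is multiplied by zero, and the corrupted index leaves the bounds of the array. A real campaign on a GPU would inject many faults at random locations and report the share of each outcome.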
