Parallel Architecture Design for OpenVX Kernel Image Processing Functions

  • Additional information
    • Publication information:
      Journal of Computer Engineering and Applications Beijing Co., Ltd., Science Press, 2022.
    • Subject:
      2022
    • Collection:
      LCC:Electronic computers. Computer science
    • Abstract:
      Although traditional programmable processors are highly flexible, their processing speed and performance are inferior to those of application-specific integrated circuits (ASICs). Image processing is often diverse, intensive, and repetitive, so the processor must balance speed, performance, and flexibility. OpenVX is an open source standard for preprocessing or auxiliary processing in image processing, graph computing, and deep learning applications. Targeting the kernel visual function library of the OpenVX 1.3 standard, this paper designs and implements a programmable and extensible OpenVX parallel processor. The architecture adopts an application-specific instruction processor (ASIP). After analyzing and comparing the topological characteristics of various interconnection networks, the hierarchically cross-connected Mesh+ (HCCM+), which has outstanding performance, is chosen as the backbone of the ASIP, and processing elements (PEs) are placed at the network nodes. The PE array is constructed to support dynamic configuration, and the parallel processor is designed to realize programmable image processing based on efficient routing and communication. The proposed architecture is suitable for both data-parallel computing and emerging graph computing; the two computing modes can be configured separately or mixed. The kernel visual functions and a graph computing model are each mapped to the parallel processor to verify the two modes and to compare image processing speed under different numbers of PEs. The results show that the OpenVX parallel processor can complete the mapping of kernel functions and high-complexity graph computing models and achieve linear speedup. The average speedup when scheduling 16 PEs across the various functions is approximately 15.0375. When implemented on an FPGA board with a 20 nm XCVU440 device, the prototype can run at a frequency of 125 MHz.
    • File Description:
      electronic resource
    • ISSN:
      1673-9418
    • Relation:
      http://fcst.ceaj.org/fileup/1673-9418/PDF/2012085.pdf; https://doaj.org/toc/1673-9418
    • Identifier:
      10.3778/j.issn.1673-9418.2012085
    • Identifier:
      edsdoj.62bf2d46df7348489a75aefb29ecff5d
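
The speedup figure quoted in the abstract can be read as a parallel efficiency, that is, speedup divided by the number of PEs. The sketch below works through that arithmetic using only the two numbers stated in the abstract (15.0375 and 16). The function name `parallel_efficiency` is a hypothetical helper introduced for illustration; it is not taken from the paper, and the paper may define or report efficiency differently.

```python
def parallel_efficiency(speedup: float, num_pes: int) -> float:
    """Return parallel efficiency: measured speedup divided by the PE count.

    A value of 1.0 would mean perfectly linear speedup.
    """
    if num_pes <= 0:
        raise ValueError("num_pes must be a positive integer")
    return speedup / num_pes


# Figures quoted in the abstract: average speedup of about 15.0375 on 16 PEs.
efficiency = parallel_efficiency(15.0375, 16)
print(f"{efficiency:.4f}")  # prints 0.9398
```

Under this reading, the 16-PE configuration reaches roughly 94% of ideal linear scaling on average, which is consistent with the abstract's description of the speedup as linear.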