Violence Recognition on Videos Using Two-stream 3D CNN with Custom Spatiotemporal Crop

  • Additional Information
    • Publication Information:
      Research Square Platform LLC, 2022.
    • Subject:
      2022
    • Abstract:
      Violence can happen anywhere. One way to monitor a location for violence is to install closed-circuit television (CCTV), and the recorded footage can be used as evidence in a court of law. Violence video classification is also an active topic in deep learning. The most recent violence video dataset is RWF-2000, which contains 2000 violent and non-violent videos, each 5 seconds long at 30 frames per second. The publication that introduced the dataset reports a best accuracy of 87.25% with its proposed method. In this study, we use a Residual Network, which is known to mitigate the vanishing gradient problem. In addition, we apply transfer learning from models pre-trained on Kinetics and on Kinetics + Moments in Time. We also vary the number of frames and the position of the sampled frame range. RGB and optical flow inputs are trained separately with different configurations. The best accuracy for the RGB input is 89.25%, obtained with Kinetics + Moments in Time pre-training and frames 49-149. The best accuracy for the optical flow input is 88.5%, obtained with Kinetics pre-training and 74 frames. Summing the outputs of the two streams yields an accuracy of 90.5%.
    • Identifier:
      10.21203/rs.3.rs-1947129/v2
    • Rights:
      OPEN
    • Identifier:
      edsair.doi.dedup.....dc1dcb85b89d8b42e94ee534afea9dd9
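
The abstract mentions two mechanical steps that a short example may make concrete: selecting a frame range from a clip, and summing the outputs of the RGB and optical flow streams before choosing a class. The following is a minimal sketch only, not the authors' implementation. The function names, the 0-based inclusive indexing, the class order (0 = non-violent, 1 = violent), and the score values are all assumptions made for illustration.

```python
from typing import List, Sequence


def sample_frame_range(num_frames: int, start: int, end: int) -> List[int]:
    """Return frame indices from start to end (inclusive), clipped to the clip.

    Assumption: indices are 0-based and the range is inclusive; the abstract
    does not say how "frame location 49-149" is indexed.
    """
    if start > end:
        raise ValueError("start must not exceed end")
    first = max(start, 0)
    last = min(end, num_frames - 1)
    return list(range(first, last + 1))


def fuse_by_sum(rgb_scores: Sequence[float], flow_scores: Sequence[float]) -> List[float]:
    """Late fusion: add the per-class scores of the two streams element-wise."""
    if len(rgb_scores) != len(flow_scores):
        raise ValueError("both streams must score the same number of classes")
    return [r + f for r, f in zip(rgb_scores, flow_scores)]


def predict(scores: Sequence[float]) -> int:
    """Return the index of the highest-scoring class."""
    return max(range(len(scores)), key=lambda i: scores[i])


# A 5-second clip at 30 frames per second has 150 frames.
frames = sample_frame_range(num_frames=150, start=49, end=149)

# Hypothetical per-class scores (index 0 = non-violent, 1 = violent).
rgb = [1.5, 0.5]     # the RGB stream alone would pick class 0
flow = [0.25, 2.0]   # the optical flow stream alone would pick class 1
fused = fuse_by_sum(rgb, flow)
label = predict(fused)
```

In this made-up case the two streams disagree, and the summed scores follow the more confident one. Whether the streams are summed before or after a softmax is not stated in the abstract, so the sketch leaves the scores unnormalized.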