METHOD FOR TRAINING A PEDESTRIAN DETECTION MODEL, PEDESTRIAN DETECTION METHOD, ELECTRONIC DEVICE, AND COMPUTER-READABLE STORAGE MEDIUM

  • Publication Date:
    December 12, 2024
  • Additional Information
    • Document Number:
      20240412490
    • Appl. No:
      18/810992
    • Application Filed:
      August 21, 2024
    • Abstract:
      This application provides a pedestrian detection method, a method for training a pedestrian detection model, an electronic device, and a computer-readable storage medium. The pedestrian detection method includes: acquiring an image to be recognized; inputting the image into a multi-task recognition network to acquire a predicted pedestrian location and a predicted pedestrian attribute in parallel; and correlating the predicted pedestrian location and the predicted pedestrian attribute to output a detection result. The pedestrian detection method utilizes the multi-task recognition network to acquire the predicted pedestrian location and the predicted pedestrian attribute in parallel, and the detection result containing the predicted pedestrian location and the predicted pedestrian attribute is acquired only by inputting the image to be recognized once, thereby improving detection efficiency, quickly obtaining the detection result and saving device resources.
    • Claim:
      1. A pedestrian detection method, comprising: acquiring an image to be recognized; inputting the image to be recognized into a pre-trained multi-task recognition network, the multi-task recognition network comprising a backbone network, a pedestrian detection network and an attribute recognition network; acquiring a backbone feature map according to the input image to be recognized based on the backbone network; acquiring a predicted pedestrian location according to the backbone feature map based on the pedestrian detection network; acquiring a predicted pedestrian attribute according to the backbone feature map based on the attribute recognition network; and correlating the predicted pedestrian location and the predicted pedestrian attribute to output a detection result.
    • Claim:
      2. The pedestrian detection method according to claim 1, wherein the acquiring a predicted pedestrian location according to the backbone feature map based on the pedestrian detection network, comprises: converging at least two backbone feature maps with different resolutions based on a first converged network of the pedestrian detection network to acquire a plurality of first converged feature maps, the plurality of first converged feature maps having different resolutions; and acquiring the predicted pedestrian location according to the plurality of first converged feature maps input based on a first task head network of the pedestrian detection network.
    • Claim:
      3. The pedestrian detection method according to claim 1, wherein the acquiring a predicted pedestrian attribute according to the backbone feature map based on the attribute recognition network, comprises: converging at least two backbone feature maps with different resolutions based on a second converged network of the attribute recognition network to acquire a plurality of second converged feature maps, the plurality of second converged feature maps having different resolutions; and acquiring the predicted pedestrian attribute according to the plurality of second converged feature maps input based on a second task head network of the attribute recognition network.
    • Claim:
      4. The pedestrian detection method according to claim 1, wherein the correlating the predicted pedestrian location and the predicted pedestrian attribute to output a detection result, comprises: outputting the detection result having a detection box if a predicted pedestrian score of the predicted pedestrian location is greater than a preset pedestrian score; and outputting the detection result having a visible attribute if a predicted attribute score of the predicted pedestrian attribute is greater than a preset attribute score.
    • Claim:
      5. A method for training a pedestrian detection model, comprising: constructing a multi-task recognition network, the multi-task recognition network comprising a backbone network, a pedestrian detection network and an attribute recognition network; acquiring a training set image; and inputting the training set image into the multi-task recognition network for training to obtain the pedestrian detection model; wherein: the backbone network is configured for acquiring a backbone feature map according to the training set image; the pedestrian detection network is configured for acquiring a predicted pedestrian location according to the backbone feature map; and the attribute recognition network is configured for acquiring a predicted pedestrian attribute according to the backbone feature map.
    • Claim:
      6. The method for training a pedestrian detection model according to claim 5, wherein the inputting the training set image into the multi-task recognition network for training to obtain the pedestrian detection model, comprises: acquiring a training location feature map and a training attribute feature map based on the training set image; calculating total loss according to the training location feature map and the training attribute feature map; and updating preset parameters of the pedestrian detection model according to the total loss to obtain the pedestrian detection model.
    • Claim:
      7. The method for training a pedestrian detection model according to claim 6, wherein the calculating total loss according to the training location feature map and the training attribute feature map, comprises: determining a positive sample in the training location feature map, the positive sample comprising a pedestrian location box; calculating a pedestrian classification loss and a pedestrian localization loss according to the positive sample; calculating pedestrian attribute loss according to the pedestrian location box, the training attribute feature map and the positive sample; and calculating the total loss according to the pedestrian classification loss, the pedestrian localization loss and the pedestrian attribute loss.
    • Claim:
      8. The method for training a pedestrian detection model according to claim 7, wherein the calculating pedestrian attribute loss according to the pedestrian location box, the training attribute feature map and the positive sample, comprises: acquiring a predicted result value, a target score value, a target intersection over union and a target proportion according to the pedestrian location box, the training attribute feature map and the positive sample, the predicted result value representing a pedestrian attribute probability, the target score value representing the pedestrian attribute probability in a label of the training set image, the target intersection over union being an intersection over union of the pedestrian location box and a label location box of the training set image, and the target proportion representing a proportion occupied by the positive sample corresponding to the jth pedestrian attribute in the training set image; and calculating the pedestrian attribute loss by utilizing a pedestrian attribute loss function based on preset hyperparameters, a preset activation function, a preset number of human attribute classes, the predicted result value, the target score value and the target intersection over union; wherein the pedestrian attribute loss function comprises: [mathematical expression included] [mathematical expression included] [mathematical expression included] [mathematical expression included] [mathematical expression included] where LPAR is the pedestrian attribute loss, M is the preset number of human attribute classes, targets is the target score value, σ is the preset activation function, pred is the predicted result value, α and γ are the preset hyperparameters, IOU is the target intersection over union, and rj is the target proportion.
    • Claim:
      9. The method for training a pedestrian detection model according to claim 7, wherein the preset parameters comprise a weight of a detection task and a weight of a pedestrian attribute recognition task, and a loss function corresponding to the total loss comprises: [mathematical expression included] where Lcls represents the pedestrian classification loss, Lciou represents the pedestrian localization loss, LPAR represents the pedestrian attribute loss, Wdet represents the weight of the detection task, and WPAR represents the weight of the pedestrian attribute recognition task.
    • Claim:
      10. The method for training a pedestrian detection model according to claim 7, wherein the determining a positive sample in the training location feature map, comprises: acquiring a training location box according to the training location feature map; acquiring a label location box according to a verification set image; calculating an intersection over union of the training location box and the label location box; determining a positive sample quantity value according to the intersection over union; determining a positive sample candidate region based on a center prior, and calculating a cost matrix corresponding to the positive sample candidate region; and determining the positive sample according to the cost matrix and the positive sample quantity value.
    • Claim:
      11. An electronic device, comprising: a memory configured to store a plurality of computer program instructions; and a processor coupled to the memory and configured to execute the computer program instructions stored in the memory to cause the electronic device to: acquire an image to be recognized; input the image to be recognized into a pre-trained multi-task recognition network, the multi-task recognition network comprising a backbone network, a pedestrian detection network and an attribute recognition network; acquire a backbone feature map according to the input image to be recognized based on the backbone network; acquire a predicted pedestrian location according to the backbone feature map based on the pedestrian detection network; acquire a predicted pedestrian attribute according to the backbone feature map based on the attribute recognition network; and correlate the predicted pedestrian location and the predicted pedestrian attribute to output a detection result.
    • Claim:
      12. The electronic device according to claim 11, wherein the processor further executes the instructions to cause the electronic device to: converge at least two backbone feature maps with different resolutions based on a first converged network of the pedestrian detection network, to acquire a plurality of first converged feature maps, the plurality of first converged feature maps having different resolutions; and acquire the predicted pedestrian location according to the plurality of first converged feature maps input based on a first task head network of the pedestrian detection network.
    • Claim:
      13. The electronic device according to claim 11, wherein the processor further executes the instructions to cause the electronic device to: converge at least two backbone feature maps with different resolutions based on a second converged network of the attribute recognition network, to acquire a plurality of second converged feature maps, the plurality of second converged feature maps having different resolutions; and acquire the predicted pedestrian attribute according to the plurality of second converged feature maps input based on a second task head network of the attribute recognition network.
    • Claim:
      14. The electronic device according to claim 11, wherein the processor further executes the instructions to cause the electronic device to: output the detection result having a detection box if a predicted pedestrian score of the predicted pedestrian location is greater than a preset pedestrian score; and output the detection result having a visible attribute if a predicted attribute score of the predicted pedestrian attribute is greater than a preset attribute score.
    • Claim:
      15. An electronic device, comprising: a memory configured to store a plurality of computer program instructions; and a processor coupled to the memory and configured to execute the computer program instructions stored in the memory to perform the method for training a pedestrian detection model according to claim 5.
    • Claim:
      16. A non-transitory computer-readable storage medium having a plurality of computer program instructions stored thereon, wherein the program instructions, when executed by one or more processors, cause the one or more processors to implement the pedestrian detection method according to claim 1.
    • Claim:
      17. A non-transitory computer-readable storage medium having a plurality of computer program instructions stored thereon, wherein the program instructions, when executed by one or more processors, cause the one or more processors to implement the method for training a pedestrian detection model according to claim 5.
    • Current International Class:
      06; 06; 06; 06; 06; 06
    • Accession Number:
      edspap.20240412490
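
The correlation step of claims 1 and 4 can be illustrated with a minimal sketch. It is not the applicant's implementation. It assumes that each candidate from the multi-task recognition network already carries a box, a pedestrian score, and per-attribute scores. The names `Prediction`, `Detection`, and `correlate`, the attribute labels, and the 0.5 thresholds are hypothetical and do not appear in the application.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed layout


@dataclass
class Prediction:
    """Hypothetical container for one candidate from the network."""
    box: Box
    pedestrian_score: float
    attribute_scores: Dict[str, float]


@dataclass
class Detection:
    """Hypothetical detection result: a box plus its visible attributes."""
    box: Box
    pedestrian_score: float
    attributes: List[str] = field(default_factory=list)


def correlate(
    predictions: List[Prediction],
    preset_pedestrian_score: float = 0.5,  # assumed value
    preset_attribute_score: float = 0.5,   # assumed value
) -> List[Detection]:
    """Keep boxes and attributes whose scores exceed the preset scores."""
    results = []
    for p in predictions:
        # Claim 4: output a detection box only if the predicted pedestrian
        # score is greater than the preset pedestrian score.
        if p.pedestrian_score <= preset_pedestrian_score:
            continue
        # Claim 4: output an attribute as visible only if its predicted
        # score is greater than the preset attribute score.
        visible = [
            name
            for name, score in p.attribute_scores.items()
            if score > preset_attribute_score
        ]
        results.append(Detection(p.box, p.pedestrian_score, visible))
    return results


if __name__ == "__main__":
    preds = [
        Prediction((10, 20, 50, 120), 0.9, {"backpack": 0.8, "hat": 0.2}),
        Prediction((200, 30, 240, 130), 0.3, {"backpack": 0.9}),
    ]
    for det in correlate(preds):
        print(det)
```

Attaching the attributes to the box inside one pass mirrors the claimed idea that a single forward pass yields both outputs. How the application actually pairs locations with attributes is not stated in this record.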