Systems and methods for reconstructing body shape and pose

  • Publication Date:
    February 20, 2024
  • Additional Information
    • Patent Number:
      11,908,071
    • Appl. No:
      17/495960
    • Application Filed:
      October 07, 2021
    • Abstract:
      The present disclosure is generally directed to reconstructing representations of bodies from images. An example method of the present disclosure includes inputting, into a machine-learned reconstruction model, input data descriptive of an image depicting a body; predicting, using a machine-learned marker prediction component of the reconstruction model, a set of surface marker locations on the body; and outputting, using a machine-learned marker poser component of the reconstruction model, an output representation of the body that corresponds to the set of surface marker locations. In the example method, one or more parameters of the reconstruction model were learned at least in part based on a consistency loss corresponding to a distance between relaxed-constraint representations generated from a prior set of surface marker locations predicted according to the one or more parameters and parametric representations generated from the prior set using kinematic constraints associated with the body.
    • Inventors:
      Google LLC (Mountain View, CA, US)
    • Assignees:
      GOOGLE LLC (Mountain View, CA, US)
    • Claim:
      1. A computer-implemented method for reconstructing representations of bodies from images, the method comprising: inputting, by one or more computing devices into a machine-learned reconstruction model, input data descriptive of an image depicting a body; predicting, by the one or more computing devices and using a machine-learned marker prediction component of the reconstruction model, a set of surface marker locations on the body; and outputting, by the one or more computing devices and using a machine-learned marker poser component of the reconstruction model, an output representation of the body that corresponds to the set of surface marker locations; wherein one or more parameters of the reconstruction model were learned at least in part based on a consistency loss corresponding to a distance between (i) a relaxed-constraint representation generated from a prior set of surface marker locations predicted according to the one or more parameters and (ii) a parametric representation generated from the prior set using kinematic constraints associated with the body.
    • Claim:
      2. The computer-implemented method of claim 1, wherein the body is a human body and the kinematic constraints correspond to anthropometric constraints.
    • Claim:
      3. The computer-implemented method of claim 1, wherein the output representation is a relaxed-constraint representation.
    • Claim:
      4. The computer-implemented method of claim 1, wherein the marker prediction component comprises one or more encoder layers.
    • Claim:
      5. The computer-implemented method of claim 4, wherein the one or more encoder layers respectively comprise self-attention models.
    • Claim:
      6. The computer-implemented method of claim 5, wherein predicting the set of surface marker locations comprises: encoding, by the one or more computing devices and using the one or more encoder layers, a surface marker embedding along with the input data; and updating, by the one or more computing devices, the set of surface marker locations based at least in part on the encoded surface marker embedding.
    • Claim:
      7. The computer-implemented method of claim 6, wherein an output of each of the one or more encoder layers is used to iteratively refine the set of surface marker locations, the output corresponding to the surface marker embedding.
    • Claim:
      8. The computer-implemented method of claim 7, wherein the one or more encoder layers comprise a plurality of encoder layers that share one or more machine-learned weights.
    • Claim:
      9. The computer-implemented method of claim 1, comprising: transforming, by the one or more computing devices and using a capture model, the input data; and wherein the output representation is obtained in a capture space corresponding to the capture model.
    • Claim:
      10. The computer-implemented method of claim 9, wherein the capture model is based at least in part on a perspective model.
    • Claim:
      11. A system for reconstructing representations of bodies from images, comprising: one or more processors; and one or more memory devices storing computer-readable instructions that, when implemented, cause the one or more processors to perform operations, the operations comprising: inputting, into a machine-learned reconstruction model, input data descriptive of an image depicting a body; predicting, using a machine-learned marker prediction component of the reconstruction model, a set of surface marker locations on the body; and outputting, using a machine-learned marker poser component of the reconstruction model, an output representation of the body that corresponds to the set of surface marker locations; wherein one or more parameters of the reconstruction model were learned at least in part based on a consistency loss corresponding to a distance between (i) a relaxed-constraint representation generated from a prior set of surface marker locations predicted according to the one or more parameters and (ii) a parametric representation generated from the prior set using kinematic constraints associated with the body.
    • Claim:
      12. The system of claim 11, wherein the body is a human body and the kinematic constraints correspond to anthropometric constraints.
    • Claim:
      13. The system of claim 11, wherein the output representation is a relaxed-constraint representation.
    • Claim:
      14. The system of claim 11, wherein the marker prediction component comprises one or more encoder layers.
    • Claim:
      15. The system of claim 14, wherein the one or more encoder layers respectively comprise self-attention models.
    • Claim:
      16. The system of claim 15, wherein predicting the set of surface marker locations comprises: encoding, using the one or more encoder layers, a surface marker embedding along with the input data; and updating the set of surface marker locations based at least in part on the encoded surface marker embedding.
    • Claim:
      17. The system of claim 16, wherein an output of each of the one or more encoder layers is used to iteratively refine the set of surface marker locations, the output corresponding to the surface marker embedding.
    • Claim:
      18. The system of claim 17, wherein the one or more encoder layers comprise a plurality of encoder layers that share one or more machine-learned weights.
    • Claim:
      19. A system for reconstructing representations of bodies from images, comprising: one or more processors; and one or more memory devices storing computer-readable instructions that, when implemented, cause the one or more processors to perform operations, the operations comprising: inputting, into a machine-learned marker prediction model, input data descriptive of an image depicting a body; predicting, using the marker prediction model, a set of surface marker locations on the body; outputting, using a machine-learned marker poser model, a parametric representation of the body that corresponds to the set of surface marker locations; and updating one or more parameters of the marker prediction model based at least in part on a consistency loss corresponding to a distance between the parametric representation and a relaxed-constraint representation associated with the predicted set of surface marker locations.
    • Claim:
      20. The system of claim 19, wherein the output representation is a relaxed-constraint representation.
    • Patent References Cited:
      10849585 December 2020 Teixeira
      20110123088 May 2011 Sebok
      20210227152 July 2021 Zhang
      20220189056 June 2022 Li
    • Other References:
      Xu et al., “GHUM & GHUML: Generative 3D Human Shape and Articulated Pose Models”, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Jun. 13-19, 2020, Seattle, Washington, pp. 6184-6193. cited by applicant
    • Primary Examiner:
      Hsu, Joni
    • Attorney, Agent or Firm:
      Dority & Manning, P.A.
    • Accession Number:
      edspgr.11908071
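
Illustrative note: the consistency loss recited in claims 1, 11, and 19 is a distance between a relaxed-constraint representation and a parametric representation generated from the same markers under kinematic constraints. The record does not disclose how either representation is computed, so the following is only a toy sketch under stated assumptions: the body is reduced to a chain of 3D joints, the kinematic constraint is modeled as fixed bone lengths, and the distance is a mean squared Euclidean distance. The names `enforce_bone_lengths` and `consistency_loss` are hypothetical and do not come from the patent.

```python
import math


def enforce_bone_lengths(joints, bone_lengths):
    """Assumed stand-in for a kinematic constraint: rebuild a joint chain so
    each bone keeps its direction but takes a fixed, prescribed length."""
    constrained = [joints[0]]
    for (parent, child), length in zip(zip(joints, joints[1:]), bone_lengths):
        direction = [c - p for p, c in zip(parent, child)]
        # Guard against zero-length bones to avoid division by zero.
        norm = math.sqrt(sum(d * d for d in direction)) or 1.0
        prev = constrained[-1]
        constrained.append(
            tuple(q + length * d / norm for q, d in zip(prev, direction))
        )
    return constrained


def consistency_loss(relaxed, parametric):
    """Mean squared Euclidean distance between corresponding points
    (an assumed choice of distance; the claims only say 'a distance')."""
    total = sum(
        sum((a - b) ** 2 for a, b in zip(p, q))
        for p, q in zip(relaxed, parametric)
    )
    return total / len(relaxed)


# Relaxed representation: joints placed freely, with no length constraint.
relaxed = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 3.0, 0.0)]
# Parametric representation: the same chain forced to bone lengths 1 and 3.
parametric = enforce_bone_lengths(relaxed, [1.0, 3.0])
print(consistency_loss(relaxed, parametric))
```

In this sketch the loss is zero exactly when the freely placed joints already satisfy the bone lengths, which is the sense in which minimizing it would pull relaxed predictions toward kinematically valid ones.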