نبذة مختصرة : This paper presents a theoretical perspective on modeling ventral stream processing by revisiting the computational abstraction of simple and complex cells. In parallel to David Marr's vision theory, we organize the new perspective into three levels. At the computational level, we abstract simple and complex cells into space partitioning and composition in a topological space based on the redundancy exploitation hypothesis of Horace Barlow. At the algorithmic level, we present a hierarchical extension of sparse coding by exploiting the manifold constraint in high-dimensional space (i.e., the blessing of dimensionality). The resulting over-parameterized models for object recognition differ from existing hierarchical models by disentangling the objectives of selectivity and invariance computation. It is possible to interpret our hierarchical construction as a computational implementation of cortically local subspace untangling for object recognition and face representation, which are closely related to exemplar-based and axis-based coding in the medial temporal lobe. At the implementation level, we briefly discuss two possible implementations based on asymmetric sparse autoencoders and divergent spiking neural networks.
No Comments.