Large Kernel Sparse ConvNet Weighted by Multi-Frequency Attention for Remote Sensing Scene Understanding

Authors: Wang, Junjie; Li, Wei; Zhang, Mengmeng; Chanussot, Jocelyn
Source:
ISSN: 0196-2892 ; IEEE Transactions on Geoscience and Remote Sensing ; https://hal.science/hal-04473702 ; IEEE Transactions on Geoscience and Remote Sensing, 2023, 61, pp.5626112. ⟨10.1109/TGRS.2023.3333401⟩.
Subject:
Remote sensing; scene understanding; large kernel convolution; adaptive sparse optimization; multi-frequency attention; [INFO.INFO-TI]Computer Science [cs]/Image Processing [eess.IV]; [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]
Record type:
article in journal/newspaper
Language:
English

Additional information
- Contributors:
  Beijing Institute of Technology (BIT); GIPSA - Signal Images Physique (GIPSA-SIGMAPHY); Observatoire des Sciences de l'Univers de Grenoble (OSUG)-GIPSA Pôle Sciences des Données (GIPSA-PSD); Grenoble Images Parole Signal Automatique (GIPSA-lab); Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes (UGA)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP); Université Grenoble Alpes (UGA)-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes (UGA)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP); Université Grenoble Alpes (UGA)-Grenoble Images Parole Signal Automatique (GIPSA-lab); Université Grenoble Alpes (UGA)-Observatoire des Sciences de l'Univers de Grenoble (OSUG); Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP)-Institut national des sciences de l'Univers (INSU - CNRS)-Institut national de recherche en sciences et technologies pour l'environnement et l'agriculture (IRSTEA)-Université Savoie Mont Blanc (USMB Université de Savoie Université de Chambéry)-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes 2016-2019 (UGA 2016-2019)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP)-Institut national des sciences de l'Univers (INSU - CNRS)-Institut national de recherche en sciences et technologies pour l'environnement et l'agriculture (IRSTEA)-Université Savoie Mont Blanc (USMB Université de Savoie Université de Chambéry)-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes 2016-2019 (UGA 2016-2019); Apprentissage de modèles à partir de données massives (Thoth); Inria Grenoble - Rhône-Alpes; Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Laboratoire Jean Kuntzmann (LJK); Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes (UGA)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP); Université Grenoble Alpes (UGA); ANR-19-P3IA-0003,MIAI,MIAI @ Grenoble Alpes(2019)
- Publication information:
  HAL CCSD
  Institute of Electrical and Electronics Engineers
- Publication year:
  2023
- Collection:
  Institut National de la Recherche Agronomique: ProdINRA
- Abstract:
  International audience ; Remote sensing scene understanding is a highly challenging task, and has gradually emerged as a research hotspot in the field of intelligent interpretation of remote sensing data. Recently, the use of convolutional neural networks (CNNs) has been proven to be a fruitful advancement. However, with the emergence of visual transformers (ViTs), the limitations of traditional small convolutional kernels in directly capturing a large receptive field have posed significant challenges to their dominant role. Additionally, the fixed neuron connections between different convolutional layers have weakened the practicality and adaptability of the models. Furthermore, the global average pooling (GAP) also leads to the loss of effective information in the acquired features. In this work, a large kernel sparse ConvNet (LSCNet) weighted by multi-frequency attention (MFA) is proposed. First, unlike traditional CNNs, it utilizes two parallel rectangular convolutional kernels to approximate a large kernel, achieving comparable or even better results than ViTs-based methods. Second, an adaptive sparse optimization strategy is employed to dynamically optimize the fixed neuron connections between different convolutional layers, achieving a favorable connectivity pattern for capturing abstract features more accurately. Finally, a novel MFA module is used to replace GAP, so as to preserve more useful information while weighting the recognition features, thereby enhancing the discriminative and learning abilities of the model. In the conducted experiments, LSCNet achieves the best recognition results on three well-known remote sensing aerial datasets when compared to the state-of-the-art methods (including ViTs-based methods).
- Relation:
  hal-04473702; https://hal.science/hal-04473702; https://hal.science/hal-04473702/document; https://hal.science/hal-04473702/file/LSCNet_final_proof.pdf
- DOI:
  10.1109/TGRS.2023.3333401
- Online access:
  https://hal.science/hal-04473702
  https://hal.science/hal-04473702/document
  https://hal.science/hal-04473702/file/LSCNet_final_proof.pdf
  https://doi.org/10.1109/TGRS.2023.3333401
- Rights:
  info:eu-repo/semantics/OpenAccess
- Accession number:
  edsbas.203D7C0F
