Publication:
Edge devices friendly multi-human parsing with lightweight encoding and multi-scale self-attention based decoding

Date

2025

Publisher

Springer

Abstract

Multi-human parsing has received considerable research attention in recent years, and deep learning-based multi-human parsing methods have demonstrated promising results. In practice, however, most methods struggle on edge devices because of their large network architectures and low inference speed. Moreover, inadequate modeling of long-range feature dependencies has led to suboptimal representations of discriminative features across semantic classes. To address these challenges and facilitate real-time implementation on edge devices, we design a deep yet lightweight Encoder and a Multi-Scale Self-Attention based Decoder to capture long-range dependencies and spatial relationships. Furthermore, we optimize our model through half-precision quantization, improving its efficiency on edge devices. Experiments on the publicly available Crowd Instance-level Human Parsing (CIHP) and Look into Person (LIP) datasets show the efficacy of our framework in parsing multiple humans at a high inference speed of 55.6 FPS. Additionally, real-world testing on Jetson Nano edge devices shows competitive performance. An extensive ablation study on the different modules validates our network. © 2025 Elsevier B.V., All rights reserved.
