Multiscale human activity recognition and anticipation network

Date published

2022-05-27

Publisher

IEEE

Type

Article

ISSN

2162-237X

Citation

Xing Y, Golodetz S, Everitt A, et al., (2024) Multiscale human activity recognition and anticipation network. IEEE Transactions on Neural Networks and Learning Systems, Volume 35, Issue 1, January 2024, pp. 451-465

Abstract

Deep convolutional neural networks have been leveraged to achieve huge improvements in video understanding and human activity recognition performance in the past decade. However, most existing methods focus on activities that have similar time scales, leaving the task of action recognition on multiscale human behaviors less explored. In this study, a two-stream multiscale human activity recognition and anticipation (MS-HARA) network is proposed, which is jointly optimized using a multitask learning method. The MS-HARA network fuses the two streams of the network using an efficient temporal-channel attention (TCA)-based fusion approach to improve the model's representational ability for both temporal and spatial features. We investigate the multiscale human activities from two basic categories, namely, midterm activities and long-term activities. The network is designed to function as part of a real-time processing framework to support interaction and mutual understanding between humans and intelligent machines. It achieves state-of-the-art results on several datasets for different tasks and different application domains. The midterm and long-term action recognition and anticipation performance, as well as the network fusion, are extensively tested to show the efficiency of the proposed network. The results show that the MS-HARA network can easily be extended to different application domains.

Keywords

Activity recognition and anticipation, multiscale behavior modeling, multitask learning, two-stream network fusion

Rights

Attribution-NonCommercial 4.0 International
