A refined non-driving activity classification using a two-stream convolutional neural network

Citation

Yang L, Yang T, Liu H, et al. (2021) A refined non-driving activity classification using a two-stream convolutional neural network. IEEE Sensors Journal, Volume 21, Number 14, July 2021, pp. 15574-15583.

Abstract

Monitoring the driver’s status is essential for an intelligent and safe take-over transition in a Level 3 automated driving vehicle. We present a camera-based system that uses a fusion of spatial and temporal information to recognise non-driving activities (NDAs), which may lead to different cognitive capabilities for take-over. The region of interest (ROI) is automatically selected based on the extracted masks of the driver and the object/device the driver is interacting with. Then, the RGB image of the ROI (the spatial stream) and its associated current and historical optical flow frames (the temporal stream) are fed into a two-stream convolutional neural network (CNN) for the classification of NDAs. Such an approach is able to identify not only the object/device but also the interaction mode between the object and the driver, which enables a refined NDA classification. In this paper, we evaluated the performance of classifying 10 NDAs with two types of devices (tablet and phone) and 5 types of tasks (emailing, reading, watching videos, web-browsing and gaming) for 10 participants. Results show that the proposed system improves the averaged classification accuracy from 61.0% when using a single spatial stream to 90.5%.
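
As an illustration of the ROI selection step and the label space described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the ROI is the bounding box of the union of the two binary masks, which the abstract does not state. The function name `roi_from_masks`, the `margin` parameter, and the `device:task` label format are hypothetical.

```python
from itertools import product

# Devices and tasks as listed in the abstract.
DEVICES = ("tablet", "phone")
TASKS = ("emailing", "reading", "watching videos", "web-browsing", "gaming")

# 2 devices x 5 tasks gives the 10 NDA classes (label format is hypothetical).
NDA_CLASSES = [f"{device}:{task}" for device, task in product(DEVICES, TASKS)]


def roi_from_masks(driver_mask, object_mask, margin=0):
    """Bounding box of the union of two binary masks (assumed ROI rule).

    Masks are equally sized lists of rows holding 0/1 values.
    Returns (top, left, bottom, right) with inclusive indices,
    or None if both masks are empty.
    """
    height = len(driver_mask)
    width = len(driver_mask[0])

    def occupied(r, c):
        return bool(driver_mask[r][c] or object_mask[r][c])

    rows = [r for r in range(height) if any(occupied(r, c) for c in range(width))]
    cols = [c for c in range(width) if any(occupied(r, c) for r in range(height))]
    if not rows:
        return None

    # Expand by the margin, clamped to the image borders.
    top = max(rows[0] - margin, 0)
    bottom = min(rows[-1] + margin, height - 1)
    left = max(cols[0] - margin, 0)
    right = min(cols[-1] + margin, width - 1)
    return (top, left, bottom, right)


# Toy 4x5 example: one driver pixel and one device pixel.
driver = [[0, 0, 0, 0, 0],
          [0, 1, 0, 0, 0],
          [0, 0, 0, 0, 0],
          [0, 0, 0, 0, 0]]
device = [[0, 0, 0, 0, 0],
          [0, 0, 0, 0, 0],
          [0, 0, 0, 1, 0],
          [0, 0, 0, 0, 0]]
roi = roi_from_masks(driver, device)
```

In a full pipeline, the RGB crop of this ROI would feed the spatial stream and the optical flow frames of the same region would feed the temporal stream; how the two streams are fused is not specified in the abstract and is left out of the sketch.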

Keywords

2-stream CNN, optical flow, Level 3 automation, NDA classification

Rights

Attribution-NonCommercial 4.0 International
