Experiments on deep-learning-based human activity recognition from video. This repository provides implementations of various deep learning models for video processing. In particular, it includes the code for our Convolutional-Attentional 3D (CA3D) model, based on the Convolutional-Attentional Spatio Temporal (CAST) block, and for the Spatio-temporal Chi-stream Network (SChi-Net) model, based on the Chi-Stream block. Code for training and evaluating the models on various datasets is also included. CA3D and SChi-Net are lightweight architectures, designed for computational efficiency and suited to edge applications and resource-constrained or consumer hardware.
Launch an experiment with:
python runexp.py --config <config> --mode <train|test|traintest> --device <device> --restart
Where:
- <config> is the name of a configuration dictionary, in dotted notation, defined anywhere in your code. For example: configs.base.config_base
- <mode> can be one of train, test, or traintest, depending on whether you want to perform model training, testing, or both.
- <device> can be cpu, cuda:0, or any device you wish to use for the experiment.
- The flag --restart is optional. If you omit it, a previously suspended experiment can be resumed from a checkpoint, if one is available.
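As a rough illustration of how these arguments fit together, the sketch below parses the same flags, resolves a dotted configuration name to a dictionary, and decides whether to resume. It is not the code in runexp.py: the helper names (build_parser, resolve_config, plan_start), the stand-in module demo_configs, and its keys are all hypothetical, and the assumption that --restart means "ignore any checkpoint and start from scratch" is inferred from the description above rather than confirmed by the source.

```python
import argparse
import importlib
import sys
import types


def build_parser():
    """Parser mirroring the flags described above (illustrative only)."""
    parser = argparse.ArgumentParser(description="Launch an experiment (sketch).")
    parser.add_argument("--config", required=True,
                        help="dotted name of a configuration dictionary")
    parser.add_argument("--mode", required=True,
                        choices=["train", "test", "traintest"])
    parser.add_argument("--device", required=True,
                        help="e.g. cpu or cuda:0")
    parser.add_argument("--restart", action="store_true",
                        help="assumed: start over instead of resuming")
    return parser


def resolve_config(dotted_name):
    """Import 'package.module' and return its attribute 'name' as a dict."""
    module_path, _, attr_name = dotted_name.rpartition(".")
    if not module_path:
        raise ValueError(f"expected 'module.attribute', got {dotted_name!r}")
    module = importlib.import_module(module_path)
    config = getattr(module, attr_name)
    if not isinstance(config, dict):
        raise TypeError(f"{dotted_name} is not a dictionary")
    return config


def plan_start(restart, checkpoint_exists):
    """Resume only when --restart is absent and a checkpoint is available."""
    if restart or not checkpoint_exists:
        return "scratch"
    return "resume"


# Stand-in module so the sketch runs without the repository on the path.
# The module name and its keys are invented for illustration.
demo = types.ModuleType("demo_configs")
demo.config_base = {"epochs": 10}
sys.modules["demo_configs"] = demo

args = build_parser().parse_args(
    ["--config", "demo_configs.config_base", "--mode", "train", "--device", "cpu"]
)
config = resolve_config(args.config)
print(args.mode, args.device, config, plan_start(args.restart, checkpoint_exists=True))
```

With the real repository on the Python path, the same lookup would apply to a name such as configs.base.config_base, provided that module exposes a dictionary under that attribute.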
Requirements:
- Python 3.10
- PyTorch 2.0.1
Research supported by project INAROS (INtelligenza ARtificiale per il mOnitoraggio e Supporto agli anziani), aimed at the development of deep learning technologies for video processing for elderly assistance in smart home and smart healthcare applications.
Gabriele Lagani: gabriele.lagani@isti.cnr.it