ASBAR: an Animal Skeleton-Based Action Recognition framework. Recognizing great ape behaviors in the wild using pose estimation
Abstract
The study and classification of animal behaviors have traditionally relied on direct human observation or video analysis, processes that are labor-intensive, time-consuming, and prone to human bias. Advances in machine learning for computer vision, particularly in pose estimation and action recognition, offer transformative potential to enhance the understanding of animal behaviors. However, the integration of these technologies for behavior recognition remains underexplored, particularly in natural settings. We introduce ASBAR (Animal Skeleton-Based Action Recognition), a novel framework that integrates pose estimation and behavior recognition into a cohesive pipeline. To demonstrate its utility, we tackled the challenging task of classifying natural behaviors of great apes in the wild. Our approach leverages the OpenMonkeyChallenge dataset, one of the largest open-source primate pose datasets, to train a robust pose estimation model using DeepLabCut. Subsequently, we extracted skeletal motion data from the PanAf500 dataset, a collection of in-the-wild videos of gorillas and chimpanzees annotated with nine behavior categories. Using PoseConv3D from MMAction2, we trained a skeleton-based action recognition model, achieving a Top-1 accuracy of 75.3%. This performance is comparable to previous video-based methods while reducing input data size by approximately 20-fold, offering significant advantages in computational efficiency and storage. To support further research, we provide an open-source, terminal-based GUI for training and evaluation, along with a dataset of 5,440 annotated keypoints for replication and extension to other species and behaviors. All models, code, and data are publicly available at: https://github.com/MitchFuchs/asbar.
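For readers unfamiliar with the reported metric, Top-1 accuracy is the fraction of video clips for which the highest-scoring predicted class matches the annotated behavior. The following is a minimal sketch of that computation, not code from the ASBAR repository; the function name `top1_accuracy` and the toy scores are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return the fraction of samples whose highest-scoring class equals the label.

    scores: one row of per-class scores per clip (hypothetical model outputs).
    labels: the annotated class index for each clip.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("at least one sample is required")

    correct = 0
    for row, label in zip(scores, labels):
        # Index of the class with the largest score (first one wins on ties).
        predicted = max(range(len(row)), key=lambda i: row[i])
        if predicted == label:
            correct += 1
    return correct / len(labels)


# Toy example: 4 clips, 3 behavior classes (illustrative values only).
toy_scores = [
    [0.7, 0.2, 0.1],  # predicted class 0
    [0.1, 0.8, 0.1],  # predicted class 1
    [0.3, 0.3, 0.4],  # predicted class 2
    [0.6, 0.3, 0.1],  # predicted class 0
]
toy_labels = [0, 1, 2, 1]  # the last clip is misclassified
accuracy = top1_accuracy(toy_scores, toy_labels)
```

In this toy case three of the four clips are classified correctly, so `accuracy` is 0.75. The 75.3% figure above is the same ratio computed over the evaluation clips of PanAf500.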