Relational Prototypical Network for Weakly Supervised Temporal Action Localization
Published in Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2020
Recommended citation: Linjiang Huang, Yan Huang, Wanli Ouyang, Liang Wang. "Relational Prototypical Network for Weakly Supervised Temporal Action Localization." Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2020 (Oral).
Abstract
In this paper, we propose a weakly supervised temporal action localization method for untrimmed videos based on prototypical networks. We observe two challenges posed by weak supervision, namely action-background separation and action relation construction. Unlike previous work, we propose to achieve action-background separation using only the original videos. To this end, a clustering loss is adopted to separate actions from backgrounds and to learn compact intra-class features, which helps in detecting complete action instances. In addition, a similarity weighting module is devised to further separate actions from backgrounds. To identify actions effectively, we propose to construct relations among actions for prototype learning: a GCN-based prototype embedding module is introduced to generate relational prototypes. Experiments on the THUMOS14 and ActivityNet1.2 datasets show that our method outperforms state-of-the-art methods.
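The three ingredients named in the abstract (prototype similarity, a clustering-style loss, and graph-based relations among prototypes) can be illustrated with a minimal sketch. This is not the paper's formulation: the cosine similarity, the softmax temperature `scale`, the "one minus best similarity" loss, and the single weight-free propagation step are all assumptions chosen for brevity, and every function name below is hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)

def similarity_matrix(snippets, prototypes):
    """Rows: video snippets; columns: class prototypes."""
    return [[cosine(x, p) for p in prototypes] for x in snippets]

def temporal_weights(sims, class_idx, scale=5.0):
    """Softmax over time of one class's similarities.

    Snippets close to the prototype receive larger weights; `scale`
    is an assumed temperature, not a value from the paper.
    """
    logits = [scale * row[class_idx] for row in sims]
    peak = max(logits)
    exps = [math.exp(l - peak) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def clustering_loss(sims):
    """Illustrative clustering-style loss: pull each snippet toward
    its most similar prototype (mean of 1 - best similarity)."""
    return sum(1.0 - max(row) for row in sims) / len(sims)

def propagate_prototypes(prototypes, adjacency):
    """One graph-propagation step with a row-normalized adjacency.

    A full GCN layer would also apply a learned weight matrix and a
    nonlinearity; both are omitted in this sketch.
    """
    relational = []
    for row in adjacency:
        degree = sum(row)
        mixed = [
            sum(w * p[d] for w, p in zip(row, prototypes)) / degree
            for d in range(len(prototypes[0]))
        ]
        relational.append(mixed)
    return relational
```

For example, with toy snippets `[[1, 0], [0, 1], [1, 1]]` and prototypes `[[1, 0], [0, 1]]`, the first two snippets match a prototype almost exactly and contribute close to zero loss, while the third sits between both prototypes and contributes the remainder.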