
The CRCV team ‘knights’ won first place in the Visual Inductive Priors for Data-Efficient Action Recognition challenge at ICCV 2021. The team includes graduate students Ishan Dave, Brandon Clark, and Rohit Gupta, and undergraduate intern Naman Biyani from the Indian Institute of Technology Kanpur.

Data fuels deep learning, yet it is costly to gather and annotate. Training on massive datasets consumes a great deal of energy, adding to our carbon footprint. In addition, only a select few deep learning behemoths have billions of data points and thousands of expensive GPUs at their disposal. This challenge focuses on how to pre-wire deep networks with generic visual inductive priors: innate knowledge structures that let a model incorporate hard-won existing knowledge. Visual inductive priors are data-efficient, because what is built in no longer has to be learned, which saves valuable training data.

Our proposed solution achieves 73% on the Kinetics400ViPriors test set, the best result among all entries. The approach has three main components: state-of-the-art TCLR self-supervised pretraining, video transformer models, and the optical flow modality.