Center for Research in Computer Vision

Data Sets

UCF YouTube Action Data Set


  1. It contains 11 action categories: basketball shooting, biking/cycling, diving, golf swinging, horse back riding, soccer juggling, swinging, tennis swinging, trampoline jumping, volleyball spiking, and walking with a dog.
  2. This data set is very challenging due to large variations in camera motion, object appearance and pose, object scale, viewpoint, cluttered background, illumination conditions, etc.
  3. For each category, the videos are grouped into 25 groups, each with more than 4 action clips. The video clips in the same group share some common features, such as the same actor, similar background, similar viewpoint, and so on.
  4. The videos are in MS MPEG-4 format. You need to install the right codec (e.g. the K-Lite Codec Pack, which contains a collection of codecs) to access them.
  5. If you use this data set, please refer to the following paper:
    J. Liu, J. Luo and M. Shah, Recognizing realistic actions from videos "in the wild", CVPR 2009, Miami, FL. (For the biking and walking classes, we select all the videos; for the rest of the action classes, we select only the videos numbered from 01 to 04 in each group.)
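
The grouping and the clip-selection rule described above can be sketched in Python. This is a minimal illustration, not the authors' tooling. It assumes a hypothetical file naming scheme of the form v_<category>_<group>_<clip>.<ext>, and the category labels "biking" and "walk_dog" are placeholders for the two classes where all videos are kept. Holding out a whole group at a time is shown as one plausible way to avoid mixing clips that share an actor or background between training and test sets; it is not stated here as the required protocol.

```python
import re

# ASSUMPTION: file names look like v_<category>_<group>_<clip>.<ext>.
# This pattern is hypothetical; adjust it to the names in your copy of the data.
NAME_RE = re.compile(r"^v_(?P<category>.+)_(?P<group>\d+)_(?P<clip>\d+)\.\w+$")


def parse_name(filename):
    """Split a clip file name into (category, group number, clip number)."""
    m = NAME_RE.match(filename)
    if m is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    return m["category"], int(m["group"]), int(m["clip"])


def select_subset(filenames, keep_all=("biking", "walk_dog"), max_clip=4):
    """Keep every clip of the `keep_all` categories (hypothetical labels)
    and only clips numbered 01..max_clip of every other category."""
    selected = []
    for name in filenames:
        category, _group, clip = parse_name(name)
        if category in keep_all or clip <= max_clip:
            selected.append(name)
    return selected


def leave_one_group_out(filenames, held_out_group):
    """Put all clips of one group number in the test set, the rest in training."""
    train, test = [], []
    for name in filenames:
        _category, group, _clip = parse_name(name)
        (test if group == held_out_group else train).append(name)
    return train, test
```

Looping `held_out_group` over 1 to 25 would give one train/test split per group.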


  1. YouTube Action Data Set [about 424M]
  2. UCF11* (updated on October 31, 2011)
*Note: "YouTube Action Data Set" is now called "UCF11". In the updated UCF11, all the videos have been converted to 29.97 fps (mpg) and the annotations have been redone accordingly. In the previous release of the annotations, some files were missing and some annotations were incorrect.
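
If the annotations are indexed by frame (an assumption; the annotation format is not described here), the stated frame rate of 29.97 fps is what relates a frame index to a time offset. A minimal sketch with hypothetical helper names, assuming 0-based frame indices:

```python
FPS = 29.97  # frame rate stated for the updated UCF11 videos


def frame_to_seconds(frame_index, fps=FPS):
    """Time offset in seconds of a 0-based frame index."""
    return frame_index / fps


def seconds_to_frame(seconds, fps=FPS):
    """Nearest 0-based frame index for a time offset in seconds."""
    return round(seconds * fps)
```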

Related Publications

  1. Jingen Liu, Jiebo Luo and Mubarak Shah, Recognizing Realistic Actions from Videos "in the Wild", IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), 2009.
  2. Jingen Liu, Yang Yang and Mubarak Shah, Learning Semantic Visual Vocabularies using Diffusion Distance, IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), 2009.