UCF50 – Action Recognition Data Set



Kishore K. Reddy and Mubarak Shah, Recognizing 50 Human Action Categories of Web Videos, Machine Vision and Applications Journal (MVAP), September 2012.


UCF50 is an action recognition data set of realistic videos collected from YouTube, covering 50 action categories. It extends the YouTube Action data set (UCF11), which has 11 action categories.

Most available action recognition data sets are not realistic because they are staged by actors. Our primary goal is to provide the computer vision community with an action recognition data set of realistic videos taken from YouTube. The data set is very challenging due to large variations in camera motion, object appearance and pose, object scale, viewpoint, cluttered background, illumination conditions, etc. For each of the 50 categories, the videos are divided into 25 groups, and each group contains more than 4 action clips. The clips in a group may share common features, such as the same person, similar background, or similar viewpoint.

Data Set Details

The action categories in UCF50, collected from YouTube, are: Baseball Pitch, Basketball Shooting, Bench Press, Biking, Billiards Shot, Breaststroke, Clean and Jerk, Diving, Drumming, Fencing, Golf Swing, Playing Guitar, High Jump, Horse Race, Horse Riding, Hula Hoop, Javelin Throw, Juggling Balls, Jump Rope, Jumping Jack, Kayaking, Lunges, Military Parade, Mixing Batter, Nunchucks, Playing Piano, Pizza Tossing, Pole Vault, Pommel Horse, Pull Ups, Punch, Push Ups, Rock Climbing Indoor, Rope Climbing, Rowing, Salsa Spins, Skate Boarding, Skiing, Skijet, Soccer Juggling, Swing, Playing Tabla, TaiChi, Tennis Swing, Trampoline Jumping, Playing Violin, Volleyball Spiking, Walking with a dog, and Yo Yo.

Results (updated September 12, 2012)

If you use UCF50, send us an email with the following details and we will update our web page with your results.

  • Performance (%)
  • Experimental Setup (To keep the reported results consistent, please follow “Leave One Group Out Cross Validation”, which leads to 25 cross-validations and eliminates randomness in the experimental setup. Note that some action categories may have more than 25 groups; in this setup, please consider only the first 25 groups in each action category.)
  • Paper details

Note: It is very important to keep videos belonging to the same group separate in training and testing. Since the videos in a group are obtained from a single long video, sharing videos from the same group between the training and testing sets would give artificially high performance.
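
The “Leave One Group Out Cross Validation” setup described above can be sketched in Python as follows. This is a minimal illustration, not an official evaluation script. It assumes clip file names of the form v_<Action>_g<group>_c<clip>.avi; that pattern, along with the names CLIP_RE, parse_clip and leave_one_group_out, is an assumption made for this example and should be adapted to your copy of the data. Each fold holds out one group for testing and trains on the rest, so clips from the same group never appear on both sides, and groups beyond the first 25 are ignored.

```python
import re
from collections import defaultdict

# ASSUMPTION: clips are named v_<Action>_g<group>_c<clip>.avi.
# Adjust this pattern if your files are named differently.
CLIP_RE = re.compile(r"^v_(?P<action>[A-Za-z]+)_g(?P<group>\d+)_c(?P<clip>\d+)\.avi$")
NUM_GROUPS = 25  # only the first 25 groups of each category are used


def parse_clip(name):
    """Return (action, group) parsed from a clip file name."""
    match = CLIP_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected clip name: {name!r}")
    return match.group("action"), int(match.group("group"))


def leave_one_group_out(names, num_groups=NUM_GROUPS):
    """Yield (held_out_group, train_names, test_names), one fold per group."""
    by_group = defaultdict(list)
    for name in names:
        _, group = parse_clip(name)
        if 1 <= group <= num_groups:  # drop groups beyond the first num_groups
            by_group[group].append(name)
    for held_out in range(1, num_groups + 1):
        test = sorted(by_group.get(held_out, []))
        train = sorted(
            name
            for group, clips in by_group.items()
            if group != held_out
            for name in clips
        )
        yield held_out, train, test


if __name__ == "__main__":
    # Hypothetical file names, for demonstration only.
    demo = [
        "v_Biking_g01_c01.avi",
        "v_Biking_g02_c01.avi",
        "v_Diving_g01_c01.avi",
        "v_Diving_g02_c01.avi",
    ]
    for held_out, train, test in leave_one_group_out(demo, num_groups=2):
        print(held_out, len(train), len(test))
```

With the full data set, the loop produces 25 folds; the reported performance would then be the average of the per-fold test accuracies.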