Center for Research in Computer Vision



Data Sets

UCF Sports Action Data Set



The UCF Sports dataset consists of a set of actions collected from various sports that are typically featured on broadcast television channels such as the BBC and ESPN. The video sequences were obtained from a wide range of stock footage websites, including BBC Motion Gallery and GettyImages.

The dataset includes a total of 150 sequences at a resolution of 720 x 480. The collection represents a natural pool of actions featured in a wide range of scenes and viewpoints. By releasing the data set, we hope to encourage further research into action recognition in unconstrained environments. Since its introduction, the dataset has been used for numerous applications, such as action recognition, action localization, and saliency detection.



Dataset Actions


The dataset includes the following 10 actions. The figure above shows a sample frame for each of the ten actions, with the bounding box annotations of the humans shown in yellow.

Diving (14 videos)
Golf Swing (18 videos)
Kicking (20 videos)
Lifting (6 videos)
Riding Horse (12 videos)
Running (13 videos)
SkateBoarding (12 videos)
Swing-Bench (20 videos)
Swing-Side (13 videos)
Walking (22 videos)
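
The per-class counts listed above can be checked against the stated total of 150 sequences. The following is a minimal sketch in Python; the names `CLIPS_PER_ACTION` and `class_fractions` are illustrative assumptions and are not part of the dataset's distribution.

```python
# Clip counts per action class, copied from the list above.
CLIPS_PER_ACTION = {
    "Diving": 14,
    "Golf Swing": 18,
    "Kicking": 20,
    "Lifting": 6,
    "Riding Horse": 12,
    "Running": 13,
    "SkateBoarding": 12,
    "Swing-Bench": 20,
    "Swing-Side": 13,
    "Walking": 22,
}


def class_fractions(counts):
    """Return each class's share of the total number of clips."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


if __name__ == "__main__":
    total = sum(CLIPS_PER_ACTION.values())
    print(f"{len(CLIPS_PER_ACTION)} classes, {total} clips")
    for name, frac in class_fractions(CLIPS_PER_ACTION).items():
        print(f"{name:>14}: {frac:.1%}")
```

The counts sum to the 150 sequences stated in the description, with Lifting the smallest class (6 clips) and Walking the largest (22 clips).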


Dataset Summary


The following table summarizes the characteristics of the dataset.


Figure: Summary of the characteristics of UCF Sports.


Statistics


The number of clips is not the same in every class. The following figure shows the distribution of the number of clips per action.


Figure: Number of clips per action class.


The following figure illustrates the total duration of clips (blue) and the average clip length (green) for every action class. Certain actions, such as kicking, are short in nature compared to walking or running, which are relatively longer and more periodic. However, the chart shows that the average duration of action clips is very similar across classes. Therefore, the duration of a clip alone is not enough to identify the action.


Figure: The total time of video clips for each action class is shown in blue. Average length of clips for each action is shown in green.
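
The two quantities plotted in this figure, total duration and average clip length per class, could be computed along the following lines. This is only a sketch: the function name `duration_stats` and the input format (pairs of action label and duration in seconds) are assumptions, and the durations in the usage example are placeholders, not values measured from UCF Sports.

```python
from collections import defaultdict


def duration_stats(clips):
    """Compute (total, mean) clip duration in seconds per action class.

    `clips` is an iterable of (action, duration_seconds) pairs.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for action, seconds in clips:
        totals[action] += seconds
        counts[action] += 1
    return {a: (totals[a], totals[a] / counts[a]) for a in totals}


if __name__ == "__main__":
    # Placeholder durations for illustration only; NOT taken from the dataset.
    example = [("Kicking", 2.0), ("Kicking", 4.0), ("Walking", 6.0)]
    for action, (total, mean) in duration_stats(example).items():
        print(f"{action}: total={total:.1f}s mean={mean:.1f}s")
```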


Recommended Experimental Setup



Download


The data set can be downloaded by clicking here.

Human gaze annotations can be downloaded by clicking here.

Train/Test splits for Action localization can be downloaded by clicking here.


Related Publications


If you use this data set, please cite the following papers: