
Final Oral Examination for Doctor of Philosophy (Computer Science)

Kevin Duarte

Thursday, November 18, 2021
3:00PM – 5:00PM
Zoom Meeting
[Bifold] [Thesis] [Video]

Dissertation

With the ever-increasing number of videos being uploaded to social media and video sharing websites, it has become vitally important to process and understand video content. Given the sheer amount of data, human analysis of these videos is impractical. It is therefore imperative to develop robust automated systems to analyze and categorize them. This makes it a very exciting time to be studying Computer Vision and Machine Learning, which attempt to solve the general problem of learning from visual data. Specifically, the field of video understanding attempts to solve various video-related problems like video retrieval, tracking, action classification, and video object segmentation. Solving such problems can lead to improvements in many real-world applications like self-driving cars, security, and medical imaging.

This dissertation makes several contributions to the field of video understanding, spanning video object segmentation, video action detection, temporal action localization, and text-to-video retrieval, by proposing a generalization of capsule networks to the video domain that allows for improved representation learning in videos.