CAP6412 – Spring 2025

Advanced Computer Vision (3 Credit Hours)

Course Content

This is an advanced computer vision course that exposes graduate students to cutting-edge research in computer vision. We will focus on two families of generative models, visual-language models and diffusion models, which have revolutionized computer vision research. ChatGPT, a generative model, has been a big success and is used by millions of people every day. It is a language model: the input is text and the output is text. In visual-language generative models, by contrast, the input and output can be an image, video, audio, etc. Diffusion models, another class of generative models, have revolutionized the generation of realistic images and videos. Both visual-language and diffusion models are widely used in daily life and will have an impact on society. An important current research topic is the study of bias, privacy, security, robustness, toxicity, legality, and truthfulness in these models.
The first part of this course will deal with the basics and foundational concepts of visual-language and diffusion models. I will cover this part through my lectures. The second part, on topics such as bias, privacy, and toxicity, will be covered through discussion of recent research papers. In each class we will discuss one research paper, which will be presented by a group of students. Students will read the paper before the class meeting and write a report. Students will also complete assignments and work on group projects.

Grading Policy

  • Assignments: 45%
  • Presentation: 10%
  • Reports: 10%
  • Projects: 35%

Late Policy

  • 0 for late Reports/Presentations
  • 20% off per day, up to 4 days, for Assignments/Projects
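
As a rough illustration of how the grading weights and late policy above combine, here is a minimal Python sketch. It is not the official grade calculation. The helper names `late_adjusted` and `final_grade` are hypothetical, and the sketch assumes that scores are on a 0-100 scale, that the 20% deduction is taken from the earned score, and that work more than 4 days late receives 0.

```python
# Category weights in percent, taken from the Grading Policy list.
WEIGHTS = {"assignments": 45, "presentation": 10, "reports": 10, "projects": 35}

# Categories that receive 0 when late, per the Late Policy list.
ZERO_IF_LATE = {"reports", "presentation"}


def late_adjusted(score, days_late, category):
    """Return a 0-100 score after the late penalty (hypothetical helper)."""
    if days_late <= 0:
        return float(score)
    if category in ZERO_IF_LATE:
        return 0.0
    # Assumption: 20% of the earned score is deducted per late day,
    # and work more than 4 days late receives 0.
    if days_late > 4:
        return 0.0
    return score * (100 - 20 * days_late) / 100


def final_grade(scores):
    """Weighted course grade from per-category scores on a 0-100 scale."""
    return sum(WEIGHTS[c] * scores[c] for c in WEIGHTS) / 100
```

Under these assumptions, an assignment scored at 90 and submitted 2 days late would count as `late_adjusted(90, 2, "assignments")`, which is 54.0.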

Student Learning Outcomes

After completing the course:

  • Students should be knowledgeable in visual-language and diffusion models
  • Students should be able to:
    • Read and understand a research paper.
    • Write a comprehensive review of the paper.
    • Identify the strong and weak points of the paper.
    • Generate their own ideas to solve the same problem.
    • Work on a research project and write a research paper.
  • Slide review and rehearsal meetings for PowerPoint presentations:
    • For a Monday presentation
      • Slide review: Wednesday at 4:15, a week before the scheduled presentation
      • Rehearsal: Friday at 1:00 PM, a week before the scheduled presentation, during office hours
    • For a Wednesday presentation
      • Slide review: Friday at 1:00 PM, a week before the scheduled presentation, during office hours
      • Rehearsal: Monday at 2:00 PM, the week of the presentation, during office hours

Important Dates:

See https://www.crcv.ucf.edu/courses/cap6412-spring-2025/schedule/

Statement on Academic Integrity:

The UCF Golden Rule will be observed in this class. Plagiarism or cheating of any kind on an examination, quiz, or assignment will result in at least an “F” for that assignment (and may, depending on the severity of the case, lead to an “F” for the entire course) and may be subject to appropriate referral to the Office of Student Conduct for further action. I will assume for this course that you will adhere to the academic creed of this University and will maintain the highest standards of academic integrity. In other words, don’t cheat by giving answers to others or taking them from anyone else. I will also adhere to the highest standards of academic integrity, so please do not ask me to change (or expect me to change) your grade illegitimately or to bend or break rules for one person that will not apply to everyone.