REU 2023 – Participants
I am a rising senior completing my Bachelor's in Computer Science and Minor in Physics at Florida State University. The University of Central Florida's REU program in Computer Vision was a wonderful opportunity to explore Computer Vision through a research-oriented lens. This summer, I worked with Dr. Tanvir Ahmed on Post-Disaster Building Damage Classification from Aerial-View Images. Existing models utilize MobileNet, a lightweight architecture that uses depth-wise separable convolutions, to classify building damage based on a masked image. After attempting to replicate these results through Transfer Learning on the DoriaNet dataset, which assesses building damage from Hurricane Dorian, we experimented with different architectures, including but not limited to ResNet50, Vision Transformer B-32, and Swin Transformer, to improve on the performance of existing models. We also investigated different loss functions and segmentation methods (such as the state-of-the-art Mask2Former) to further improve performance. Further information about my project may be found below in the weekly presentations and report provided. If you have any questions, you can email me at email@example.com or Dr. Tanvir Ahmed at Tanvir.Ahmed@ucf.edu.
I am a rising senior at the University of Central Florida majoring in Computer Science. I had already been introduced to Computer Vision through working on projects with AI@UCF. The REU program helped me solidify my skills in CV and gave me an extensive introduction to academic research. Over the summer, I studied the robustness of Visual Large Language Models with Ser-Nam Lim and Young Kyun Jang. We evaluated a VLLM (LLaVA) under the effects of data augmentation and adversarial attacks. To view my progress, please refer to the weekly presentations below. If you have any further questions, please feel free to reach out to me at firstname.lastname@example.org.
Born and raised in Central New Jersey, Akhil is a rising sophomore and Robert W. Woodruff Scholar at Emory University. He is a Computer Science major and Finance minor at Emory, having completed coursework in Data Structures & Algorithms, Mathematical Foundations of Computer Science, and Business Data & Decision Analytics thus far. Prior to the REU, Akhil had experience with CS fundamentals (Object-Oriented Programming, Data Structures, Algorithmic thinking) and Web Development but this research experience has expanded his knowledge greatly on the Computer Vision and Deep Learning aspects of the field. This summer, Akhil worked with Dr. Gaurav Kumar Nayak, Parth Parag Kulkarni, and Dr. Mubarak Shah on a project in Robust Image Geolocalization. Existing research has explored the usage of Transformers for Cross-View Image Geolocalization with the TransGeo model on various datasets. This summer, Akhil investigated how training the TransGeo model on various noisy and clean data combinations improves the Retrieval-Based model's performance and makes it more robust. The next steps for the project involve developing and implementing a new model that can identify the type of noise added to a query image (if noise is present), denoise the image (if necessary), and evaluate both the reference and generated query image on the original pre-trained TransGeo model. For more details on his project and progress, please refer to his weekly presentations, report, and poster below. Please feel free to contact him at email@example.com or at firstname.lastname@example.org.
Hey queen (or king)! My name is Philomina Ekezie and I am a rising junior at Mercer University in Georgia where I am currently pursuing a B.S. degree in Computer Science. Prior to this internship, I had built a good understanding of CS fundamentals such as object-oriented programming, data and file structures, and computer algorithms through classes like Programming 1-2 and Complex Data Structures and Algorithms. I also gained experience in other relevant topics in the world of CS such as Database Systems and Computer Assembly. However, I did not have any experience related to Artificial Intelligence or Machine Learning, just a lot of curiosity and an eagerness to learn more about the fields! Lucky for me, this internship gave me extremely valuable experience within both domains; I learned about deep learning fundamentals and had hands-on practice with machine learning tactics. During the summer, I worked on Simultaneous Classification of Subjects and Actions within Video Sequences, where our goal was to train a model to both recognize a subject and detect the action the subject was performing given a video (a task that hasn’t been done before). I created a novel dataset that specifically catered to the complex nature of our model; this dataset featured 3,000 videos, 10 subject classes, and 6 action classes, as well as per-frame annotations for subject localization. My time spent in the REU gifted me with lasting personal, social, and academic skills that I am eager to take with me as I continue to build my career and chase my dreams. If you’re interested in my work, please look at all of my previous presentations (they’re really pretty), glance through my paper, or just look at my poster (it’s also pretty). For any questions, email me (11026268@live.Mercer.edu) or my mentor (email@example.com). Bye queen and/or king <33
I'm a rising senior at the University of Central Florida, majoring in Mechanical Engineering. I had very little Computer Vision knowledge prior to the REU, but I gained much more knowledge working alongside my mentor Jyoti Kini. My interest in Computer Vision stemmed from its applications in robotics, as machine learning is advancing the field with the creation of new cutting-edge products. My project this summer was to create a multimodal action recognition model that identifies human actions using the Meccano dataset, which comprises RGB, Depth, and Gaze egocentric-view input data. We set out to improve on the base SlowFast convolutional model provided by the ICIAP for an action recognition competition, and went with a transformer-based approach instead. If you have any questions related to this work, feel free to email me at firstname.lastname@example.org or email@example.com.
I am a rising junior majoring in Computer Science here at the University of Central Florida. I had some experience with Artificial Intelligence prior to the REU through the course Algorithms for Machine Learning and AI@UCF, and some small experience with Computer Vision. However, this REU gave me an incredible opportunity by rapidly broadening my knowledge of more low-level CV and AI concepts such as Convolutional Neural Networks, Transformers, and Diffusion. Over the summer, I worked with Dr. Chen Chen and Umar Khalid on Video Content Generation using Diffusion models. Because novel Text-to-Video (T2V) models require large video datasets to train and are as of now inefficient for mass-deployment, we explored modifying Text-to-Image (T2I) models such as Stable Diffusion to utilize cross-frame attention for zero-shot novel video generation and guided video editing. Our approach involved utilizing state-of-the-art attention processors for greater video generation efficiency, testing different Stable Diffusion models for the greatest frame and textual prompt consistency, and beginning to explore methods for utilizing optical flow to assist in frame consistency and generating realistic videos. If you have any questions or would like to discuss my research more, you can contact me at firstname.lastname@example.org.
I am a senior at Ana G. Mendez University (Gurabo Campus), majoring in computer engineering. While I’ve had experience in some computer science courses and topics, this summer has been my first introduction to the fields of computer vision and machine learning. Thus, this REU has significantly increased my understanding of computer vision by offering a short course on its fundamentals and giving me the opportunity to work with assistant professor Dr. Mengjie Li on a graduate-level research project. With the assistance of my mentor, I worked to compare deep learning-based algorithms for the segmentation of photovoltaic (PV) modules from satellite and aerial imagery of different types of PVs. This included the use of a recent state-of-the-art universal segmentation model named Mask2Former. For information about my progress, check out my weekly presentations below and contact me at email@example.com.
My name is Nathan Labiosa and I am a rising junior at the University of Wisconsin - Madison majoring in Biomedical Engineering and Computer Science. I’ve been exposed to computer vision in previous classes, including topics in neural networks, linear algebra in machine learning, and general AI principles. I worked with Dr. Ser Nam Lim on investigating the impact of visual information on large language models. We worked with LLMs like LLaMA65B and a visual large language model called LLaVA. Through our work with LLaVA, we established that images truly enhance an LLM’s capabilities. We also benchmarked LLaMA65B and showed that it possesses strong instructional capabilities that visual LLMs lack. If you have any questions or would like to discuss my research more, please contact me at firstname.lastname@example.org or email@example.com.
I am a rising senior at the University of Central Florida, majoring in Computer Science for my bachelor’s degree. As someone interested in familiarizing myself with recent computer science research, the REU program provided an excellent opportunity for learning and gaining invaluable experience in the computer vision field. Before the REU, I had limited computer vision knowledge, but, through the program's initial workshops and working with my mentors Gaurav Nayak and Dr. Shah, I gained a deeper understanding of the field. Throughout this past summer, I worked on a project to develop a novel algorithm that can perform dataset condensation on videos. The goal of the project is to obtain a small synthetic dataset that yields video classification accuracy similar to that of a model trained on the full dataset. Our approach involved leveraging existing video classification backbone architectures to implement coreset techniques (e.g., random) as a baseline, followed by generating small synthetic datasets through distribution matching. I validated the efficacy of our approach for video dataset condensation on a ResNet-18 3D architecture with the UCF-101 dataset. If you have any questions, you can email me at firstname.lastname@example.org.
I am a rising junior at Case Western Reserve University majoring in Data Science and Analytics and minoring in Economics. I have done some computer science research prior to this program, but this was my first exposure to computer vision and an amazing learning opportunity. I worked with Dr. Ser Nam Lim on evaluating and improving the performance of a visual large language model called LLaVA with zero-shot classification. Through experimentation with different prompting methods, inference styles, and model temperatures, I successfully raised the classification accuracy by an average of 10 percentage points across all datasets used. Please feel free to check out my weekly presentations, poster, and report, and reach out to me at email@example.com with any questions.
*All times are Eastern Standard Time (EST)