Publications

COPYRIGHT: The copyright of the following materials belongs to the corresponding publishers. They are provided only for research and educational use that does not conflict with the interests of the publishers.

(Selected) Publications by Research Topic


Multimodal Learning

     

Seq2Time: Sequential Knowledge Transfer for Video LLM Temporal Grounding
Andong Deng, Zhongpai Gao, Anwesa Choudhuri, Benjamin Planche, Meng Zheng, Bin Wang, Terrence Chen, Chen Chen, Ziyan Wu
arXiv:2411.16932
[Paper]


     

Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level
Andong Deng, Tongjia Chen, Shoubin Yu, Taojiannan Yang, Lincoln Spencer, Yapeng Tian, Ajmal Saeed Mian, Mohit Bansal, Chen Chen
arXiv:2411.09921
[Paper] [Project Website] [Dataset]


     

3D Vision-Language Gaussian Splatting
Qucheng Peng, Benjamin Planche, Zhongpai Gao, Meng Zheng, Anwesa Choudhuri, Terrence Chen, Chen Chen, Ziyan Wu
arXiv:2410.07577
[Paper]


     

Mixture of Experts Made Personalized: Federated Prompt Learning for Vision-Language Models
Jun Luo, Chen Chen, Shandong Wu
arXiv:2410.10114
[Paper]


     

A Generalist Vision-Language Foundation Model for Diverse Biomedical Tasks
Kai Zhang, Rong Zhou, Eashan Adhikarla, Zhiling Yan, Yixin Liu, Jun Yu, Zhengliang Liu, Xun Chen,
Brian Davison, Hui Ren, Jing Huang, Chen Chen, Yuyin Zhou, Sunyang Fu, Wei Liu, Tianming Liu,
Xiang Li, Yong Chen, Lifang He, James Zou, Quanzheng Li, Hongfang Liu, Lichao Sun
Nature Medicine, 2024 (Impact Factor: 58.7)
[Paper] [BiomedGPT Model and Code]


     

Multi-Modality Co-Learning for Efficient Skeleton-based Action Recognition
Jinfu Liu, Chen Chen, Mengyuan Liu
ACM Multimedia (ACM MM), 2024
[Paper] [Code]


     

Towards Multi-modal Transformers in Federated Learning
Guangyu Sun, Matias Mendieta, Aritra Dutta, Xin Li, Chen Chen
European Conference on Computer Vision (ECCV), 2024
[Paper] [Code]


     

Towards Surveillance Video-and-Language Understanding: New Dataset, Baselines, and Challenges
Tongtong Yuan, Xuange Zhang, Kun Liu, Bo Liu, Chen Chen, Jian Jin, Zhenzhen Jiao
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024
[Paper] [Project Website] [Dataset & Code]


     

OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition
Tongjia Chen, Hongshan Yu, Zhengeng Yang, Zechuan Li, Wei Sun, Chen Chen
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024
[Paper] [Project Website] [Code]


     

Towards Geospatial Foundation Models via Continual Pretraining
Matias Mendieta, Boran Han, Xingjian Shi, Yi Zhu, Chen Chen, Mu Li
IEEE/CVF International Conference on Computer Vision (ICCV), 2023
[Paper] [Code]