Publications

You can also find my papers on my Google Scholar and Semantic Scholar profiles.

LSTP: Language-guided Spatial-Temporal Prompt Learning for Long-form Video-Text Understanding

Yuxuan Wang, Yueqian Wang, Pengfei Wu, Jianxin Liang, Dongyan Zhao, Zilong Zheng

Published in arXiv:2402.16050, 2024

Integrating optical flow for relevant content selection, improving video-text LLMs’ abilities on video question answering.

Download here

HawkEye: Training Video-Text LLMs for Grounding Text in Videos

Yueqian Wang, Xiaojun Meng, Jianxin Liang, Yuxuan Wang, Qun Liu, Dongyan Zhao

Published in arXiv:2403.10228, 2024

One of the first video-text LLMs that can perform temporal video grounding in a fully text-to-text manner, together with InternVid-G, a large-scale video-text dataset for video grounding training.

Download here

STAIR: Spatial-Temporal Reasoning with Auditable Intermediate Results for Video Question Answering

Yueqian Wang, Yuxuan Wang, Kai Chen, Dongyan Zhao

Published in AAAI Conference on Artificial Intelligence, 2024

A neural module network (NMN)-based method for video question answering on long videos with complex questions.

Download here

Overview of the NLPCC 2023 Shared Task 10: Learn to Watch TV: Multimodal Dialogue Understanding and Response Generation

Yueqian Wang, Yuxuan Wang, Dongyan Zhao

Published in Natural Language Processing and Chinese Computing, 2023

An overview of the shared task on video dialogue understanding and response generation that we hosted at NLPCC 2023.

Download here

VSTAR: A Video-grounded Dialogue Dataset for Situated Semantic Understanding with Scene and Topic Transitions

Yuxuan Wang, Zilong Zheng, Xueliang Zhao, Jinpeng Li, Yueqian Wang, Dongyan Zhao

Published in Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

A large-scale video dialogue corpus collected from TV series, annotated with scene and segment transitions.

Download here

SMASH: Improving SMAll Language Models’ Few-SHot Ability with Prompt-Based Distillation

Yueqian Wang, Chang Liu, Kai Chen, Xi Wang, Dongyan Zhao

Published in Findings of the Association for Computational Linguistics: EMNLP, 2022

Prompt-based learning and distillation for small transformer encoder-based language models.

Download here