This repository contains the model from the paper "ViSpeak: Visual Instruction Feedback in Streaming Videos".
Code: https://github.com/HumanMLLM/ViSpeak