OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding? Paper • 2501.05510 • Published Jan 9, 2025 • 39
SVGFusion: Scalable Text-to-SVG Generation via Vector Space Diffusion Paper • 2412.10437 • Published Dec 11, 2024 • 4
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions Paper • 2412.09596 • Published Dec 12, 2024 • 94
IPDreamer: Appearance-Controllable 3D Object Generation with Image Prompts Paper • 2310.05375 • Published Oct 9, 2023 • 3
Scaffold-BPE: Enhancing Byte Pair Encoding with Simple and Effective Scaffold Token Removal Paper • 2404.17808 • Published Apr 27, 2024
MaskMoE: Boosting Token-Level Learning via Routing Mask in Mixture-of-Experts Paper • 2407.09816 • Published Jul 13, 2024 • 1
Efficient Region-Aware Neural Radiance Fields for High-Fidelity Talking Portrait Synthesis Paper • 2307.09323 • Published Jul 18, 2023
TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting Paper • 2404.15264 • Published Apr 23, 2024
FuzzCoder: Byte-level Fuzzing Test via Large Language Model Paper • 2409.01944 • Published Sep 3, 2024 • 45
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering Paper • 2408.09174 • Published Aug 17, 2024 • 52