Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning Paper • 2412.11974 • Published 10 days ago • 8
M-Longdoc: A Benchmark For Multimodal Super-Long Document Understanding And A Retrieval-Aware Tuning Framework Paper • 2411.06176 • Published Nov 9 • 44
Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse Paper • 2409.11242 • Published Sep 17 • 5 • 2
Language Model Unalignment: Parametric Red-Teaming to Expose Hidden Harms and Biases Paper • 2310.14303 • Published Oct 22, 2023 • 1
Video2Music: Suitable Music Generation from Videos using an Affective Multimodal Transformer model Paper • 2311.00968 • Published Nov 2, 2023
PuzzleVQA: Diagnosing Multimodal Reasoning Challenges of Language Models with Abstract Visual Patterns Paper • 2403.13315 • Published Mar 20
CM-TTS: Enhancing Real Time Text-to-Speech Synthesis Efficiency through Weighted Samplers and Consistency Models Paper • 2404.00569 • Published Mar 31 • 1
DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling Paper • 2406.11617 • Published Jun 17 • 8
Ruby Teaming: Improving Quality Diversity Search with Memory for Automated Red Teaming Paper • 2406.11654 • Published Jun 17 • 6
Ferret: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique Paper • 2408.10701 • Published Aug 20 • 11