hysts HF staff committed on
Commit
2880539
1 Parent(s): 759a7b0

commit files to HF hub

Files changed (1)
  1. papers.csv +12 -12
papers.csv CHANGED
@@ -575,7 +575,7 @@ FeatEnHancer: Enhancing Hierarchical Features for Object Detection and Beyond Un
575
  DetZero: Rethinking Offboard 3D Object Detection with Long-term Sequential Point Clouds,"Ma, Tao*; Yang, Xuemeng; Zhou, Hongbin; Li, Xin; Shi, Botian; Liu, Junjie; Yang, Yuchen; Liu, Zhizheng; He, Liang; Li, Hongsheng; Li, Yikang; Qiao, Yu",poster,2306.06023,https://arxiv.org/abs/2306.06023,,https://huggingface.co/papers/2306.06023,,,,12,0
576
  DETRs with Collaborative Hybrid Assignments Training,"Zong, Zhuofan*; Song, Guanglu; Liu, Yu",poster,2211.12860,https://arxiv.org/abs/2211.12860,https://github.com/Sense-X/Co-DETR,https://huggingface.co/papers/2211.12860,,,,3,0
577
  Open Vocabulary Object Detection With an Open Corpus,"Wang, Jiong*; zhang, huiming; Hong, Haiwen; Jin, Xuan; He, Yuan; xue, hui; Zhao, Zhou",poster,,,,,,,,,
578
- SparseDet: Improving Sparsely Annotated Object Detection with Pseudo-positive Mining,"Suri, Saksham*; Rambhatla, Sai Saketh ; Chellappa, Rama; Shrivastava, Abhinav",poster,2201.04620,https://arxiv.org/abs/2201.04620,,https://huggingface.co/papers/2201.04620,,,,4,2
579
  Unsupervised Anomaly Detection with Diffusion Probabilistic Model,"Zhang, Xinyi*; Li, Naiqi; Li, Jiawei; Dai, Tao; Jiang, Yong; Xia, Shu-Tao",poster,,,,,,,,,
580
  UniTR: A Unified and Efficient Multi-Modal Transformer for Bird's-Eye-View Representation,"Wang, Haiyang*; Tang, Hao; Shi, Shaoshuai; Li, Aoxue; Li, Zhenguo; Schiele, Bernt; Wang, Liwei",poster,,,,,,,,,
581
  Focus the Discrepancy: Intra- and Inter-Correlation Learning for Image Anomaly Detection,"Yao, Xincheng*; Li, Ruoqi; Qian, Zefeng; Luo, Yan; Zhang, Chongyang",poster,,,,,,,,,
@@ -592,7 +592,7 @@ Delving into Motion-Aware Matching for Monocular 3D Object Tracking,"Huang, Kuan
592
  FB-BEV: BEV Representation from Forward-Backward View Transformations,"Li, Zhiqi*; Yu, Zhiding; Wang, Wenhai; Anandkumar, Animashree; Lu, Tong; Alvarez, Jose M",poster,,,,,,,,,
593
  Learning from Noisy Data for Semi-Supervised 3D Object Detection,"Chen, Zehui; Li, Zhenyu; Wang, Shuo; Fu, Dengpan; Zhao, Feng*",poster,,,,,,,,,
594
  Boosting Long-tailed Object Detection via Step-wise Learning on Smooth-tail Data,"Dong, Na*; Zhang, Yongqiang; Ding, Mingli; Lee, Gim Hee",poster,2305.12833,https://arxiv.org/abs/2305.12833,,https://huggingface.co/papers/2305.12833,,,,4,0
595
- Objects do not disappear: Video object detection by single-frame object location anticipation,"Liu, Xin*; Karimi Nejadasl, Fatemeh; van Gemert, Jan C; Booij, Olaf; Pintea, Silvia L",poster,2308.04770,https://arxiv.org/abs/2308.04770,https://github.com/L-KID/Videoobject-detection-by-location-anticipation,https://huggingface.co/papers/2308.04770,,,,5,0
596
  Unified Visual Relationship Detection with Vision and Language Models,"Zhao, Long*; Yuan, Liangzhe; Gong, Boqing; Cui, Yin; Schroff, Florian; Yang, Ming-Hsuan; Adam, Hartwig; Liu, Ting",poster,2303.08998,https://arxiv.org/abs/2303.08998,,https://huggingface.co/papers/2303.08998,,,,8,1
597
  Universal Domain Adaptation via Compressive Attention Matching,"zhu, didi; Li, Yinchuan; Yuan, Junkun; Li, Zexi; Kuang, Kun; Wu, Chao*",poster,2304.11862,https://arxiv.org/abs/2304.11862,,https://huggingface.co/papers/2304.11862,,,,6,0
598
  Unsupervised Domain Adaptive Detection with Network Stability Analysis,"Zhou, Wenzhang; Fan, Heng; Luo, Tiejian; Zhang, Libo*",poster,2308.08182,https://arxiv.org/abs/2308.08182,https://github.com/tiankongzhang/NSA,https://huggingface.co/papers/2308.08182,,,,4,0
@@ -639,7 +639,7 @@ EverLight: Indoor-Outdoor Editable HDR Lighting Estimation,"Karimi Dastjerdi, Mo
639
  Prompt Tuning Inversion for Text-driven Image Editing Using Diffusion Models,"Dong, Wenkai*; Duan, Xiaoyue; Xue, Song; Han, Shumin",poster,2305.04441,https://arxiv.org/abs/2305.04441,,https://huggingface.co/papers/2305.04441,,,,4,0
640
  Efficient Diffusion Training via Min-SNR Weighting Strategy,"Hang, Tiankai; Gu, Shuyang*; Li, Chen; Bao, Jianmin; Chen, Dong; Hu, Han; Geng, Xin; Guo, Baining",poster,2303.09556,https://arxiv.org/abs/2303.09556,https://github.com/TiankaiHang/Min-SNR-Diffusion-Training,https://huggingface.co/papers/2303.09556,,,,8,0
641
  BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion,"Xie, Jinheng; Li, Yuexiang; Huang, Yawen; Liu, Haozhe; Zhang, Wentian; Zheng, Yefeng; Shou, Mike Zheng*",poster,2307.10816,https://arxiv.org/abs/2307.10816,https://github.com/showlab/BoxDiff,https://huggingface.co/papers/2307.10816,,,,7,0
642
- Improving Sample Quality of Diffusion Models Using Self-Attention Guidance,"Hong, Susung*; Lee, Gyuseong; Jang, Wooseok; Kim, Seungryong",poster,2210.00939,https://arxiv.org/abs/2210.00939,,https://huggingface.co/papers/2210.00939,https://github.com/KU-CVLAB/Self-Attention-Guidance,,,4,0
643
  Not All Steps are Created Equal: Selective Diffusion Distillation for Image Manipulation,"WANG, Luozhou*; Yang, Shuai; Liu, Shu; Chen, Yingcong",poster,2307.08448,https://arxiv.org/abs/2307.08448,https://github.com/AndysonYs/Selective-Diffusion-Distillation,https://huggingface.co/papers/2307.08448,,,,4,0
644
  Deep Image Harmonization with Learnable Augmentation,"Niu, Li*; Cao, Junyan; Cong, Wenyan; Zhang, Liqing",poster,2308.00376,https://arxiv.org/abs/2308.00376,https://github.com/bcmi/SycoNet-Adaptive-Image-Harmonization,https://huggingface.co/papers/2308.00376,,,,4,0
645
  Out-of-domain GAN inversion via Invertibility Decomposition for Photo-Realistic Human Face Manipulation,"YANG, Xin*; XU, Xiaogang; Chen, Yingcong",poster,2212.09262,https://arxiv.org/abs/2212.09262,,https://huggingface.co/papers/2212.09262,,,,3,0
@@ -666,7 +666,7 @@ Householder Projector for Unsupervised Latent Semantics Discovery,"Song, Yue*; Z
666
  Deep Image Harmonization with Globally Guided Feature Transformation and Relation Distillation,"Niu, Li*; Tan, Linfeng; Tao, Xinhao; Cao, Junyan; Guo, Fengjun; Long, Teng; Zhang, Liqing",poster,2308.00356,https://arxiv.org/abs/2308.00356,https://github.com/bcmi/Image-Harmonization-Dataset-ccHarmony,https://huggingface.co/papers/2308.00356,,,,7,0
667
  One-Shot Generative Domain Adaptation,"Yang, Ceyuan*; Shen, Yujun; Zhang, Zhiyi; Xu, Yinghao; Zhu, Jiapeng; Wu, Zhirong; Zhou, Bolei",poster,2111.09876,https://arxiv.org/abs/2111.09876,,https://huggingface.co/papers/2111.09876,,,,7,0
668
  Hashing Neural Video Decomposition with Multiplicative Residuals in Space-Time,"Chan, Cheng-Hung; Yuan, Cheng-Yang; Sun, Cheng; Chen, Hwann-Tzong*",poster,,,,,,,,,
669
- "Versatile Diffusion: Text, Images and Variations All in One Diffusion Model","Xu, Xingqian*; Wang, Zhangyang; Zhang, Gong; Wang, Kai; Shi, Humphrey",poster,2211.08332,https://arxiv.org/abs/2211.08332,https://github.com/SHI-Labs/Versatile-Diffusion,https://huggingface.co/papers/2211.08332,,,,5,0
670
  Sound Source Localization is All about Cross-Modal Alignment,"Senocak, Arda*; Ryu, Hyeonggon; Kim, Junsik; Oh, Tae-Hyun; Pfister, Hanspeter; Chung, Joon Son",poster,,,,,,,,,
671
  Class-Incremental Grouping Network for Continual Audio-Visual Learning,"Mo, Shentong; Pian, Weiguo; Tian, Yapeng*",poster,,,,,,,,,
672
  Audio-Visual Class-Incremental Learning,"Pian, Weiguo*; Mo, Shentong; Guo, Yunhui; Tian, Yapeng",poster,2308.11073,https://arxiv.org/abs/2308.11073,https://github.com/weiguoPian/AV-CIL_ICCV2023,https://huggingface.co/papers/2308.11073,,,,4,0
@@ -742,7 +742,7 @@ Sparse Point Guided 3D Lane Detection,"Yao, Chengtang*; Yu, Lidong; Jia, Yunde;
742
  A Simple Vision Transformer for Weakly Semi-supervised 3D Object Detection,"Zhang, Dingyuan*; Liang, Dingkang; Zou, Zhikang; Li, Jingyu; Ye, Xiaoqing; Tan, Xiao; Liu, Zhe; Bai, Xiang",poster,,,,,,,,,
743
  Learn TAROT with MENTOR: A Meta-Learned Self-supervised Approach for Trajectory Prediction,"Pourkeshavarz, Mozhgan MP*; Chen, Changhe; Rasouli, Amir",poster,,,,,,,,,
744
  FocalFormer3D : Focusing on Hard Instance for 3D Object Detection,"Chen, Yilun*; Yu, Zhiding; Chen, Yukang; Lan, Shiyi; Anandkumar, Animashree; Jia, Jiaya; Alvarez, Jose M",poster,2308.04556,https://arxiv.org/abs/2308.04556,https://github.com/NVlabs/FocalFormer3D,https://huggingface.co/papers/2308.04556,,,,7,1
745
- Scene as Occupancy,"Tong, Wenwen; Sima, Chonghao*; Wang, Tai; Chen, Li; wu, silei; Deng, Hanming; Gu, Yi; Lu, Lewei; Luo, Ping; Lin, Dahua; Li, Hongyang",poster,2306.02851,https://arxiv.org/abs/2306.02851,,https://huggingface.co/papers/2306.02851,,,,11,0
746
  Neural Scene Rasterization for Large Scene Rendering in Real-time,"Liu, Jeffrey Yunfan*; Chen, Yun; Yang, Ze; Wang, Jingkang; Manivasagam, Sivabalan; Urtasun, Raquel",poster,,,,,,,,,
747
  A Game of Bundle Adjustment - Learning Efficient Convergence,"Belder, Amir*; VIVANTI, REFAEL; Tal, Ayellet",poster,,,,,,,,,
748
  Efficient Transformer-based 3D Object Detection with Dynamic Token Halting,"Ye, Mao*; Meyer, Gregory P; Chai, Yuning; Liu, Qiang",poster,2303.05078,https://arxiv.org/abs/2303.05078,,https://huggingface.co/papers/2303.05078,,,,4,0
@@ -1060,7 +1060,7 @@ Text2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models,"Höllei
1060
  LivePose: Online 3D Reconstruction from Monocular Video with Dynamic Camera Poses,"Stier, Noah; Angles, Baptiste; Yang, Liang*; yan, yajie; Colburn, Alex; Chuang, Ming",oral,2304.00054,https://arxiv.org/abs/2304.00054,,https://huggingface.co/papers/2304.00054,,,,6,0
1061
  NDDepth: Normal-Distance Assisted Monocular Depth Estimation,"Shao, Shuwei*; pei, zhongcai; Chen, Weihai; Wu, Xingming; Li, Zhengguo",oral,,,,,,,,,
1062
  LATR: 3D Lane Detection from Monocular Images with Transformer,"Luo, Yueru; Zheng, Chaoda; Yan, Xu; Tang, Kun; zheng, chao; Cui, Shuguang; Li, Zhen*",oral,2308.04583,https://arxiv.org/abs/2308.04583,https://github.com/JMoonr/LATR,https://huggingface.co/papers/2308.04583,,,,7,0
1063
- DriveAdapter: Breaking the Coupling Barrier of Perception and Planning in End-to-End Autonomous Driving,"Jia, Xiaosong*; Gao, Yulu; Chen, Li; Yan, Junchi; Liu, Langechuan; Li, Hongyang",oral,2308.00398,https://arxiv.org/abs/2308.00398,,https://huggingface.co/papers/2308.00398,,,,6,0
1064
  Dynamic Point Fields,"Prokudin, Sergey*; Ma, Qianli; Raafat, Maxime; Valentin, Julien; Tang, Siyu",oral,2304.02626,https://arxiv.org/abs/2304.02626,,https://huggingface.co/papers/2304.02626,,,,5,0
1065
  Generalizing Neural Human Fitting to Unseen Pose With Articulated E(3) Equivariance,"Feng, Haiwen*; Kulits, Peter; Liu, Shichen; Black, Michael J.; Fernandez Abrevaya, Victoria",oral,,,,,,,,,
1066
  Probabilistic Human Mesh Recovery in 3D Scenes from Egocentric Views,"Zhang, Siwei*; Ma, Qianli; Zhang, Yan; Aliakbarian, Sadegh; Cosker, Darren P; Tang, Siyu",oral,2304.06024,https://arxiv.org/abs/2304.06024,,https://huggingface.co/papers/2304.06024,,,,6,0
@@ -1582,7 +1582,7 @@ Efficient Deep Space Filling Curve,"Chen, Wanli *; Yao, Xufeng; Zhang, Xinyun; Y
1582
  Q-Diffusion: Quantizing Diffusion Models,"Li, Xiuyu*; Liu, Yijiang; Lian, Long; Yang, Huanrui; Dong, Zhen; Kang, Daniel; Zhang, Shanghang; Keutzer, Kurt",poster,,,,,,,,,
1583
  Lossy and Lossless (L$^2$) Post-training Model Size Compression,"Shi, Yumeng*; bai, shihao; Wei, Xiuying; Gong, Ruihao; Yang, Jianlei",poster,2308.04269,https://arxiv.org/abs/2308.04269,https://github.com/ModelTC/L2_Compression,https://huggingface.co/papers/2308.04269,,,,5,0
1584
  Robustifying Token Attention for Vision Transformers,"Guo, Yong*; Stutz, David; Schiele, Bernt",poster,2303.11126,https://arxiv.org/abs/2303.11126,,https://huggingface.co/papers/2303.11126,,,,3,0
1585
- Strivec: Sparse Tri-Vector Radiance Fields,"Xu, Qiangeng; Gao, Quankai*; Su, Hao; Neumann, Ulrich; Xu, Zexiang",poster,2307.13226,https://arxiv.org/abs/2307.13226,,https://huggingface.co/papers/2307.13226,,,,5,2
1586
  Image Features with Formal Privacy Guarantees,"Pittaluga, Francesco*; Zhuang, Bingbing",poster,,,,,,,,,
1587
  SparseFusion: Fusing Multi-Modal Sparse Representations for Multi-Sensor 3D Object Detection,"Xie, Yichen*; Xu, Chenfeng; Rakotosaona, Marie-Julie; Rim, Patrick; Tombari, Federico; Keutzer, Kurt; TOMIZUKA, Masayoshi; Zhan, Wei",poster,2304.14340,https://arxiv.org/abs/2304.14340,https://github.com/yichen928/SparseFusion,https://huggingface.co/papers/2304.14340,,,,8,0
1588
  Strata-NeRF : Neural Radiance fields for Stratified Scenes,"Dhiman, Ankit*; R, Srinath; Rangwani, Harsh; Parihar, Rishubh; Boregowda, Lokesh; Sridhar, Srinath; RADHAKRISHNAN, Venkatesh Babu",poster,2308.10337,https://arxiv.org/abs/2308.10337,,https://huggingface.co/papers/2308.10337,,,,7,0
@@ -1761,7 +1761,7 @@ ICL-D3IE: In-Context Learning with Diverse Demonstrations Updating for Document
1761
  ESTextSpotter: Towards Better Scene Text Spotting with Explicit Synergy in Transformer,"Huang, Mingxin; Zhang, Jiaxin; Peng, Dezhi; Lu, Hao; Huang, Can; Liu, Yuliang; Bai, Xiang; Jin, Lianwen *",poster,2308.10147,https://arxiv.org/abs/2308.10147,https://github.com/mxin262/ESTextSpotter,https://huggingface.co/papers/2308.10147,,,,8,0
1762
  Few shot font generation via transferring similarity guided global style and quantization local style,"Pan, Wei; Zhu, Anna*; Zhou, Xinyu; Iwana, Brian K; Li, Shilin",poster,,,,,,,,,
1763
  Attention Where It Matters: Rethinking Visual Document Understanding with Selective Region Concentration,"Cao, Haoyu*; Bao, Changcun; Liu, Chaohu; Chen, Huang; Yin, Kun; Liu, Hao; Liu, Yinsong; Jiang, Deqiang; Sun, Xing",poster,,,,,,,,,
1764
- Document Understanding Dataset and Evaluation (DUDE),"Van Landeghem, Jordy*; Tito, Rubén; Borchmann, Łukasz; Pietruszka, Michał; Joziak, Pawel; Powalski, Rafal; Jurkiewicz, Dawid; Coustaty, Mickael; Anckaert, Bertrand; Valveny, Ernest; Blaschko, Matthew B.; Moens, Sien; Stanislawek, Tomasz",poster,2305.08455,https://arxiv.org/abs/2305.08455,,https://huggingface.co/papers/2305.08455,,,,13,1
1765
  LISTER: Neighbor Decoding for Length-Insensitive Scene Text Recognition,"Cheng, Changxu*; Wang, Peng; Da, Cheng; Zheng, Qi; Yao, Cong",poster,2308.12774,https://arxiv.org/abs/2308.12774,,https://huggingface.co/papers/2308.12774,,,,5,0
1766
  MolGrapher: Graph-based Visual Recognition of Chemical Structures,"Morin, Lucas*; Danelljan, Martin; Agea, M. Isabel; Nassar, Ahmed S; weber, valery; Meijer, Gerhard Ingmar; Staar, Peter W J; Yu, Fisher",poster,2308.12234,https://arxiv.org/abs/2308.12234,,https://huggingface.co/papers/2308.12234,,,,8,0
1767
  SCOB: Universal Text Understanding via Character-wise Supervised Contrastive Learning with Online Text Rendering for Bridging Domain Gap,"Kim, Daehee; Kim, Yoonsik*; Kim, DongHyun; Lim, Yumin; Kim, Geewook; Kil, Taeho",poster,,,,,,,,,
@@ -1993,7 +1993,7 @@ Generating Visual Scenes from Touch,"Yang, Fengyu*; Zhang, Jiacheng; Owens, Andr
1993
  Multimodal High-order Relation Transformer for Scene Boundary Detection,"Wei, Xi*; Shi, Zhangxiang; Zhang, Tianzhu; Yu, Xiaoyuan; Xiao, Lei",poster,,,,,,,,,
1994
  Muscles in Action,"Chiquier, Mia*; Vondrick, Carl",poster,2212.02978,https://arxiv.org/abs/2212.02978,,https://huggingface.co/papers/2212.02978,,,,2,0
1995
  Self-Evolved Dynamic Expansion Model for Task-Free Continual Learning,"Ye, Fei*; Bors, Adrian",poster,,,,,,,,,
1996
- Multi-event Video-Text Retrieval,"Zhang, Gengyuan*; Ren, Jisen; Gu, Jindong; Tresp, Volker",poster,2308.11551,https://arxiv.org/abs/2308.11551,https://github.com/gengyuanmax/MeVTR,https://huggingface.co/papers/2308.11551,,,,4,0
1997
  Referring Image Segmentation Using Text Supervision,"Liu, Fang*; Liu, Yuhao; Kong, Yuqiu; Xu, Ke; Zhang, Lihe; Yin, Baocai ; Hancke, Gerhard P.; Lau, Rynson W.H.",poster,2308.14575,https://arxiv.org/abs/2308.14575,https://github.com/fawnliu/TRIS,https://huggingface.co/papers/2308.14575,,,,8,0
1998
  Audio-Visual Deception Detection: DOLOS Dataset and Parameter-Efficient Crossmodal Learning,"Guo, Xiaobao*; Muthuchamy Selvaraj, Nithish; Yu, Zitong; Kong, Wai-Kin Adams; Shen, Bingquan; Kot, Alex",poster,2303.12745,https://arxiv.org/abs/2303.12745,https://github.com/NMS05/Audio-Visual-Deception-Detection-DOLOS-Dataset-and-Parameter-Efficient-Crossmodal-Learning,https://huggingface.co/papers/2303.12745,,,,6,0
1999
  EMMN: Emotional Motion Memory Network for Audio-driven Emotional Talking Face Generation,"Tan, Shuai; Ji, Bin; pan, ye*",poster,,,,,,,,,
@@ -2001,7 +2001,7 @@ CLIP2Point: Transfer CLIP to Point Cloud Classification with Image-Depth Pre-tra
2001
  Speech2Lip: High-fidelity Speech to Lip Generation by Learning from a Short Video,"Wu, Xiuzhe; Hu, Pengfei; Wu, Yang*; Lyu, Xiaoyang; Cao, Yan-Pei; Shan, Ying; Yang, Wenming; Sun, Zhongqian; Qi, Xiaojuan",poster,,,,,,,,,
2002
  GrowCLIP: Data-aware Automatic Model Growing for Large-scale Contrastive Language-Image Pre-training,"Deng, Xinchi*; Shi, Han; Huang, Runhui; Li, Changlin; Xu, Hang; Han, Jianhua; Kwok, James; Zhao, Shen; Zhang, Wei; Liang, Xiaodan",poster,2308.11331,https://arxiv.org/abs/2308.11331,,https://huggingface.co/papers/2308.11331,,,,10,0
2003
  A Retrospect to Multi-prompt Learning across Vision and Language,"Chen, Ziliang; Huang, Xin; Guan, Quanlong*; Lin, Liang; Luo, Weiqi",poster,,,,,,,,,
2004
- ChartReader: A Unified Framework for Chart Derendering and Comprehension without Heuristic Rules,"Cheng , Zhi-Qi; Dai, Qi*; Hauptmann, Alexander ",poster,2304.02173,https://arxiv.org/abs/2304.02173,https://github.com/zhiqic/ChartReader,https://huggingface.co/papers/2304.02173,,,,6,0
2005
  Boosting Multi-modal Model Performance with Adaptive Gradient Modualtion,"Li, Hong*; Li, Xingyu; Hu, Pengbo ; Lei, Yinuo; Li, Chunxiao; Zhou, Yi",poster,,,,,,,,,
2006
  ViLLA: Fine-Grained Vision-Language Representation Learning from Real-World Data,"Varma, Maya*; Delbrouck, Jean-Benoit; Hooper, Sarah; Chaudhari, Akshay S; Langlotz, Curtis",poster,2308.11194,https://arxiv.org/abs/2308.11194,,https://huggingface.co/papers/2308.11194,,,,5,0
2007
  Robust Referring Video Object Segmentation with Cyclic Structural Consensus,"Li, Xiang*; Wang, Jinglu; Xu, Xiaohao; Li, Xiao; Raj, Bhiksha; Lu, Yan",poster,,,,,,,,,
@@ -2065,7 +2065,7 @@ StyleLipSync: Style-based Personalized Lip-sync Video Generation,"Ki, Taekyung*;
2065
  StyleInV: A Temporal Style Modulated Inversion Network for Unconditional Video Generation,"Wang, Yuhan*; Jiang, Liming; Loy, Chen Change",poster,2308.16909,https://arxiv.org/abs/2308.16909,,https://huggingface.co/papers/2308.16909,,,,3,0
2066
  3D-Aware Generative Model for Improved Side-View Image Synthesis,"Jo, Kyungmin; Jin, Wonjoon*; Choo, Jaegul; Lee, Hyunjoon; Cho, Sunghyun",poster,,,,,,,,,
2067
  Zero-Shot Contrastive Loss for Text-Guided Diffusion Image Style Transfer,"Yang, Serin*; HWANG, HYUNMIN; Ye, Jong Chul",poster,2303.08622,https://arxiv.org/abs/2303.08622,,https://huggingface.co/papers/2303.08622,,,,3,0
2068
- FlipNeRF: Flipped Reflection Rays for Few-shot Novel View Synthesis,"Seo, Seunghyeon; Chang, Yeonjin; Kwak, Nojun*",poster,2306.17723,https://arxiv.org/abs/2306.17723,,https://huggingface.co/papers/2306.17723,,,,3,0
2069
  Inverse problem regularization with hierarchical variational autoencoders,"Prost, Jean*; Houdard, Antoine; Almansa, Andres; Papadakis, Nicolas",poster,2303.11217,https://arxiv.org/abs/2303.11217,,https://huggingface.co/papers/2303.11217,,,,4,0
2070
  3D-aware Blending with Generative NeRFs,"Kim, Hyunsu*; Lee, Gayoung; Choi, Yunjey; Kim, Jin-Hwa; Zhu, Jun-Yan",poster,2302.06608,https://arxiv.org/abs/2302.06608,,https://huggingface.co/papers/2302.06608,,,,5,0
2071
  NeMF: Inverse Volume Rendering with Neural Microflake Field,"Zhang, Youjia; Xu, Teng; Yu, Junqing; Ye, YuTeng; Wang, Junle; Jing , Yanqing; Yu, Jingyi; Yang, Wei*",poster,2304.00782,https://arxiv.org/abs/2304.00782,,https://huggingface.co/papers/2304.00782,,,,8,0
@@ -2100,7 +2100,7 @@ Multi-view Spectral Polarization Propagation for Video Glass Segmentation,"Qiao,
2100
  WALDO: Future Video Synthesis using Object Layer Decomposition and Parametric Flow Prediction,"Le Moing, Guillaume*; Ponce, Jean; Schmid, Cordelia",poster,2211.14308,https://arxiv.org/abs/2211.14308,,https://huggingface.co/papers/2211.14308,,,,3,1
2101
  Ray Conditioning: Trading Photo-consistency for Photo-realism in Multi-view Image Generation,"Chen, Eric M*; Holalkere, Sidhanth; Yan, Ruyu; Zhang, Kai; Davis, Abe",poster,2304.13681,https://arxiv.org/abs/2304.13681,,https://huggingface.co/papers/2304.13681,,,,5,0
2102
  Text-Conditioned Sampling Framework for Text-to-Image Generation with Masked Generative Models,"Lee, Jaewoong*; Jang, Sangwon; Jo, Jaehyeong; Yoon, Jaehong; Kim, Yunji; Kim, Jin-Hwa; Ha, Jung-Woo; Hwang, Sung Ju",poster,2304.01515,https://arxiv.org/abs/2304.01515,,https://huggingface.co/papers/2304.01515,,,,8,1
2103
- Efficient Video Prediction via Sparsely Conditioned Flow Matching,"Davtyan, Aram*; Sameni, Sepehr; Favaro, Paolo",poster,2211.14575,https://arxiv.org/abs/2211.14575,,https://huggingface.co/papers/2211.14575,,,,3,0
2104
  Democratising 2D Sketch to 3D Shape Retrieval Through Pivoting.,"Chowdhury, Pinaki Nath*; Bhunia , Ayan Kumar; Sain, Aneeshan; Koley, Subhadeep; Xiang, Tao; Song, Yi-Zhe",poster,,,,,,,,,
2105
  Towards Instance-adaptive Inference for Federated Learning,"Feng, Chun-Mei*; Yu, Kai; Liu, Nian; Xu, Xinxing; Khan, Salman; Zuo, Wangmeng",poster,2308.06051,https://arxiv.org/abs/2308.06051,,https://huggingface.co/papers/2308.06051,,,,6,0
2106
  TransTIC: Transferring Transformer-based Image Compression from Human Visualization to Machine Perception,"Chen, Yi-Hsin; Weng, Ying-Chieh; Kao, Chia Hao; CHIEN, CHENG; Chiu, Wei-Chen; Peng, Wen-Hsiao*",poster,,,,,,,,,
 
575
  DetZero: Rethinking Offboard 3D Object Detection with Long-term Sequential Point Clouds,"Ma, Tao*; Yang, Xuemeng; Zhou, Hongbin; Li, Xin; Shi, Botian; Liu, Junjie; Yang, Yuchen; Liu, Zhizheng; He, Liang; Li, Hongsheng; Li, Yikang; Qiao, Yu",poster,2306.06023,https://arxiv.org/abs/2306.06023,,https://huggingface.co/papers/2306.06023,,,,12,0
576
  DETRs with Collaborative Hybrid Assignments Training,"Zong, Zhuofan*; Song, Guanglu; Liu, Yu",poster,2211.12860,https://arxiv.org/abs/2211.12860,https://github.com/Sense-X/Co-DETR,https://huggingface.co/papers/2211.12860,,,,3,0
577
  Open Vocabulary Object Detection With an Open Corpus,"Wang, Jiong*; zhang, huiming; Hong, Haiwen; Jin, Xuan; He, Yuan; xue, hui; Zhao, Zhou",poster,,,,,,,,,
578
+ SparseDet: Improving Sparsely Annotated Object Detection with Pseudo-positive Mining,"Suri, Saksham*; Rambhatla, Sai Saketh ; Chellappa, Rama; Shrivastava, Abhinav",poster,2201.04620,https://arxiv.org/abs/2201.04620,,https://huggingface.co/papers/2201.04620,,,,4,3
579
  Unsupervised Anomaly Detection with Diffusion Probabilistic Model,"Zhang, Xinyi*; Li, Naiqi; Li, Jiawei; Dai, Tao; Jiang, Yong; Xia, Shu-Tao",poster,,,,,,,,,
580
  UniTR: A Unified and Efficient Multi-Modal Transformer for Bird's-Eye-View Representation,"Wang, Haiyang*; Tang, Hao; Shi, Shaoshuai; Li, Aoxue; Li, Zhenguo; Schiele, Bernt; Wang, Liwei",poster,,,,,,,,,
581
  Focus the Discrepancy: Intra- and Inter-Correlation Learning for Image Anomaly Detection,"Yao, Xincheng*; Li, Ruoqi; Qian, Zefeng; Luo, Yan; Zhang, Chongyang",poster,,,,,,,,,
 
592
  FB-BEV: BEV Representation from Forward-Backward View Transformations,"Li, Zhiqi*; Yu, Zhiding; Wang, Wenhai; Anandkumar, Animashree; Lu, Tong; Alvarez, Jose M",poster,,,,,,,,,
593
  Learning from Noisy Data for Semi-Supervised 3D Object Detection,"Chen, Zehui; Li, Zhenyu; Wang, Shuo; Fu, Dengpan; Zhao, Feng*",poster,,,,,,,,,
594
  Boosting Long-tailed Object Detection via Step-wise Learning on Smooth-tail Data,"Dong, Na*; Zhang, Yongqiang; Ding, Mingli; Lee, Gim Hee",poster,2305.12833,https://arxiv.org/abs/2305.12833,,https://huggingface.co/papers/2305.12833,,,,4,0
595
+ Objects do not disappear: Video object detection by single-frame object location anticipation,"Liu, Xin*; Karimi Nejadasl, Fatemeh; van Gemert, Jan C; Booij, Olaf; Pintea, Silvia L",poster,2308.04770,https://arxiv.org/abs/2308.04770,https://github.com/L-KID/Videoobject-detection-by-location-anticipation,https://huggingface.co/papers/2308.04770,,,,5,1
596
  Unified Visual Relationship Detection with Vision and Language Models,"Zhao, Long*; Yuan, Liangzhe; Gong, Boqing; Cui, Yin; Schroff, Florian; Yang, Ming-Hsuan; Adam, Hartwig; Liu, Ting",poster,2303.08998,https://arxiv.org/abs/2303.08998,,https://huggingface.co/papers/2303.08998,,,,8,1
597
  Universal Domain Adaptation via Compressive Attention Matching,"zhu, didi; Li, Yinchuan; Yuan, Junkun; Li, Zexi; Kuang, Kun; Wu, Chao*",poster,2304.11862,https://arxiv.org/abs/2304.11862,,https://huggingface.co/papers/2304.11862,,,,6,0
598
  Unsupervised Domain Adaptive Detection with Network Stability Analysis,"Zhou, Wenzhang; Fan, Heng; Luo, Tiejian; Zhang, Libo*",poster,2308.08182,https://arxiv.org/abs/2308.08182,https://github.com/tiankongzhang/NSA,https://huggingface.co/papers/2308.08182,,,,4,0
 
639
  Prompt Tuning Inversion for Text-driven Image Editing Using Diffusion Models,"Dong, Wenkai*; Duan, Xiaoyue; Xue, Song; Han, Shumin",poster,2305.04441,https://arxiv.org/abs/2305.04441,,https://huggingface.co/papers/2305.04441,,,,4,0
640
  Efficient Diffusion Training via Min-SNR Weighting Strategy,"Hang, Tiankai; Gu, Shuyang*; Li, Chen; Bao, Jianmin; Chen, Dong; Hu, Han; Geng, Xin; Guo, Baining",poster,2303.09556,https://arxiv.org/abs/2303.09556,https://github.com/TiankaiHang/Min-SNR-Diffusion-Training,https://huggingface.co/papers/2303.09556,,,,8,0
641
  BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion,"Xie, Jinheng; Li, Yuexiang; Huang, Yawen; Liu, Haozhe; Zhang, Wentian; Zheng, Yefeng; Shou, Mike Zheng*",poster,2307.10816,https://arxiv.org/abs/2307.10816,https://github.com/showlab/BoxDiff,https://huggingface.co/papers/2307.10816,,,,7,0
642
+ Improving Sample Quality of Diffusion Models Using Self-Attention Guidance,"Hong, Susung*; Lee, Gyuseong; Jang, Wooseok; Kim, Seungryong",poster,2210.00939,https://arxiv.org/abs/2210.00939,,https://huggingface.co/papers/2210.00939,https://github.com/KU-CVLAB/Self-Attention-Guidance,,,4,1
643
  Not All Steps are Created Equal: Selective Diffusion Distillation for Image Manipulation,"WANG, Luozhou*; Yang, Shuai; Liu, Shu; Chen, Yingcong",poster,2307.08448,https://arxiv.org/abs/2307.08448,https://github.com/AndysonYs/Selective-Diffusion-Distillation,https://huggingface.co/papers/2307.08448,,,,4,0
644
  Deep Image Harmonization with Learnable Augmentation,"Niu, Li*; Cao, Junyan; Cong, Wenyan; Zhang, Liqing",poster,2308.00376,https://arxiv.org/abs/2308.00376,https://github.com/bcmi/SycoNet-Adaptive-Image-Harmonization,https://huggingface.co/papers/2308.00376,,,,4,0
645
  Out-of-domain GAN inversion via Invertibility Decomposition for Photo-Realistic Human Face Manipulation,"YANG, Xin*; XU, Xiaogang; Chen, Yingcong",poster,2212.09262,https://arxiv.org/abs/2212.09262,,https://huggingface.co/papers/2212.09262,,,,3,0
 
666
  Deep Image Harmonization with Globally Guided Feature Transformation and Relation Distillation,"Niu, Li*; Tan, Linfeng; Tao, Xinhao; Cao, Junyan; Guo, Fengjun; Long, Teng; Zhang, Liqing",poster,2308.00356,https://arxiv.org/abs/2308.00356,https://github.com/bcmi/Image-Harmonization-Dataset-ccHarmony,https://huggingface.co/papers/2308.00356,,,,7,0
667
  One-Shot Generative Domain Adaptation,"Yang, Ceyuan*; Shen, Yujun; Zhang, Zhiyi; Xu, Yinghao; Zhu, Jiapeng; Wu, Zhirong; Zhou, Bolei",poster,2111.09876,https://arxiv.org/abs/2111.09876,,https://huggingface.co/papers/2111.09876,,,,7,0
668
  Hashing Neural Video Decomposition with Multiplicative Residuals in Space-Time,"Chan, Cheng-Hung; Yuan, Cheng-Yang; Sun, Cheng; Chen, Hwann-Tzong*",poster,,,,,,,,,
669
+ "Versatile Diffusion: Text, Images and Variations All in One Diffusion Model","Xu, Xingqian*; Wang, Zhangyang; Zhang, Gong; Wang, Kai; Shi, Humphrey",poster,2211.08332,https://arxiv.org/abs/2211.08332,https://github.com/SHI-Labs/Versatile-Diffusion,https://huggingface.co/papers/2211.08332,,,,5,1
670
  Sound Source Localization is All about Cross-Modal Alignment,"Senocak, Arda*; Ryu, Hyeonggon; Kim, Junsik; Oh, Tae-Hyun; Pfister, Hanspeter; Chung, Joon Son",poster,,,,,,,,,
671
  Class-Incremental Grouping Network for Continual Audio-Visual Learning,"Mo, Shentong; Pian, Weiguo; Tian, Yapeng*",poster,,,,,,,,,
672
  Audio-Visual Class-Incremental Learning,"Pian, Weiguo*; Mo, Shentong; Guo, Yunhui; Tian, Yapeng",poster,2308.11073,https://arxiv.org/abs/2308.11073,https://github.com/weiguoPian/AV-CIL_ICCV2023,https://huggingface.co/papers/2308.11073,,,,4,0
 
742
  A Simple Vision Transformer for Weakly Semi-supervised 3D Object Detection,"Zhang, Dingyuan*; Liang, Dingkang; Zou, Zhikang; Li, Jingyu; Ye, Xiaoqing; Tan, Xiao; Liu, Zhe; Bai, Xiang",poster,,,,,,,,,
743
  Learn TAROT with MENTOR: A Meta-Learned Self-supervised Approach for Trajectory Prediction,"Pourkeshavarz, Mozhgan MP*; Chen, Changhe; Rasouli, Amir",poster,,,,,,,,,
744
  FocalFormer3D : Focusing on Hard Instance for 3D Object Detection,"Chen, Yilun*; Yu, Zhiding; Chen, Yukang; Lan, Shiyi; Anandkumar, Animashree; Jia, Jiaya; Alvarez, Jose M",poster,2308.04556,https://arxiv.org/abs/2308.04556,https://github.com/NVlabs/FocalFormer3D,https://huggingface.co/papers/2308.04556,,,,7,1
745
+ Scene as Occupancy,"Tong, Wenwen; Sima, Chonghao*; Wang, Tai; Chen, Li; wu, silei; Deng, Hanming; Gu, Yi; Lu, Lewei; Luo, Ping; Lin, Dahua; Li, Hongyang",poster,2306.02851,https://arxiv.org/abs/2306.02851,,https://huggingface.co/papers/2306.02851,,,,11,1
746
  Neural Scene Rasterization for Large Scene Rendering in Real-time,"Liu, Jeffrey Yunfan*; Chen, Yun; Yang, Ze; Wang, Jingkang; Manivasagam, Sivabalan; Urtasun, Raquel",poster,,,,,,,,,
747
  A Game of Bundle Adjustment - Learning Efficient Convergence,"Belder, Amir*; VIVANTI, REFAEL; Tal, Ayellet",poster,,,,,,,,,
748
  Efficient Transformer-based 3D Object Detection with Dynamic Token Halting,"Ye, Mao*; Meyer, Gregory P; Chai, Yuning; Liu, Qiang",poster,2303.05078,https://arxiv.org/abs/2303.05078,,https://huggingface.co/papers/2303.05078,,,,4,0
 
1060
  LivePose: Online 3D Reconstruction from Monocular Video with Dynamic Camera Poses,"Stier, Noah; Angles, Baptiste; Yang, Liang*; yan, yajie; Colburn, Alex; Chuang, Ming",oral,2304.00054,https://arxiv.org/abs/2304.00054,,https://huggingface.co/papers/2304.00054,,,,6,0
1061
  NDDepth: Normal-Distance Assisted Monocular Depth Estimation,"Shao, Shuwei*; pei, zhongcai; Chen, Weihai; Wu, Xingming; Li, Zhengguo",oral,,,,,,,,,
1062
  LATR: 3D Lane Detection from Monocular Images with Transformer,"Luo, Yueru; Zheng, Chaoda; Yan, Xu; Tang, Kun; zheng, chao; Cui, Shuguang; Li, Zhen*",oral,2308.04583,https://arxiv.org/abs/2308.04583,https://github.com/JMoonr/LATR,https://huggingface.co/papers/2308.04583,,,,7,0
1063
+ DriveAdapter: Breaking the Coupling Barrier of Perception and Planning in End-to-End Autonomous Driving,"Jia, Xiaosong*; Gao, Yulu; Chen, Li; Yan, Junchi; Liu, Langechuan; Li, Hongyang",oral,2308.00398,https://arxiv.org/abs/2308.00398,,https://huggingface.co/papers/2308.00398,,,,6,1
1064
  Dynamic Point Fields,"Prokudin, Sergey*; Ma, Qianli; Raafat, Maxime; Valentin, Julien; Tang, Siyu",oral,2304.02626,https://arxiv.org/abs/2304.02626,,https://huggingface.co/papers/2304.02626,,,,5,0
1065
  Generalizing Neural Human Fitting to Unseen Pose With Articulated E(3) Equivariance,"Feng, Haiwen*; Kulits, Peter; Liu, Shichen; Black, Michael J.; Fernandez Abrevaya, Victoria",oral,,,,,,,,,
1066
  Probabilistic Human Mesh Recovery in 3D Scenes from Egocentric Views,"Zhang, Siwei*; Ma, Qianli; Zhang, Yan; Aliakbarian, Sadegh; Cosker, Darren P; Tang, Siyu",oral,2304.06024,https://arxiv.org/abs/2304.06024,,https://huggingface.co/papers/2304.06024,,,,6,0
 
1582
  Q-Diffusion: Quantizing Diffusion Models,"Li, Xiuyu*; Liu, Yijiang; Lian, Long; Yang, Huanrui; Dong, Zhen; Kang, Daniel; Zhang, Shanghang; Keutzer, Kurt",poster,,,,,,,,,
1583
  Lossy and Lossless (L$^2$) Post-training Model Size Compression,"Shi, Yumeng*; bai, shihao; Wei, Xiuying; Gong, Ruihao; Yang, Jianlei",poster,2308.04269,https://arxiv.org/abs/2308.04269,https://github.com/ModelTC/L2_Compression,https://huggingface.co/papers/2308.04269,,,,5,0
1584
  Robustifying Token Attention for Vision Transformers,"Guo, Yong*; Stutz, David; Schiele, Bernt",poster,2303.11126,https://arxiv.org/abs/2303.11126,,https://huggingface.co/papers/2303.11126,,,,3,0
1585
+ Strivec: Sparse Tri-Vector Radiance Fields,"Xu, Qiangeng; Gao, Quankai*; Su, Hao; Neumann, Ulrich; Xu, Zexiang",poster,2307.13226,https://arxiv.org/abs/2307.13226,,https://huggingface.co/papers/2307.13226,,,,5,3
1586
  Image Features with Formal Privacy Guarantees,"Pittaluga, Francesco*; Zhuang, Bingbing",poster,,,,,,,,,
1587
  SparseFusion: Fusing Multi-Modal Sparse Representations for Multi-Sensor 3D Object Detection,"Xie, Yichen*; Xu, Chenfeng; Rakotosaona, Marie-Julie; Rim, Patrick; Tombari, Federico; Keutzer, Kurt; TOMIZUKA, Masayoshi; Zhan, Wei",poster,2304.14340,https://arxiv.org/abs/2304.14340,https://github.com/yichen928/SparseFusion,https://huggingface.co/papers/2304.14340,,,,8,0
1588
  Strata-NeRF : Neural Radiance fields for Stratified Scenes,"Dhiman, Ankit*; R, Srinath; Rangwani, Harsh; Parihar, Rishubh; Boregowda, Lokesh; Sridhar, Srinath; RADHAKRISHNAN, Venkatesh Babu",poster,2308.10337,https://arxiv.org/abs/2308.10337,,https://huggingface.co/papers/2308.10337,,,,7,0
 
1761
  ESTextSpotter: Towards Better Scene Text Spotting with Explicit Synergy in Transformer,"Huang, Mingxin; Zhang, Jiaxin; Peng, Dezhi; Lu, Hao; Huang, Can; Liu, Yuliang; Bai, Xiang; Jin, Lianwen *",poster,2308.10147,https://arxiv.org/abs/2308.10147,https://github.com/mxin262/ESTextSpotter,https://huggingface.co/papers/2308.10147,,,,8,0
1762
  Few shot font generation via transferring similarity guided global style and quantization local style,"Pan, Wei; Zhu, Anna*; Zhou, Xinyu; Iwana, Brian K; Li, Shilin",poster,,,,,,,,,
1763
  Attention Where It Matters: Rethinking Visual Document Understanding with Selective Region Concentration,"Cao, Haoyu*; Bao, Changcun; Liu, Chaohu; Chen, Huang; Yin, Kun; Liu, Hao; Liu, Yinsong; Jiang, Deqiang; Sun, Xing",poster,,,,,,,,,
1764
+ Document Understanding Dataset and Evaluation (DUDE),"Van Landeghem, Jordy*; Tito, Rubén; Borchmann, Łukasz; Pietruszka, Michał; Joziak, Pawel; Powalski, Rafal; Jurkiewicz, Dawid; Coustaty, Mickael; Anckaert, Bertrand; Valveny, Ernest; Blaschko, Matthew B.; Moens, Sien; Stanislawek, Tomasz",poster,2305.08455,https://arxiv.org/abs/2305.08455,,https://huggingface.co/papers/2305.08455,,,,13,2
1765
  LISTER: Neighbor Decoding for Length-Insensitive Scene Text Recognition,"Cheng, Changxu*; Wang, Peng; Da, Cheng; Zheng, Qi; Yao, Cong",poster,2308.12774,https://arxiv.org/abs/2308.12774,,https://huggingface.co/papers/2308.12774,,,,5,0
1766
  MolGrapher: Graph-based Visual Recognition of Chemical Structures,"Morin, Lucas*; Danelljan, Martin; Agea, M. Isabel; Nassar, Ahmed S; weber, valery; Meijer, Gerhard Ingmar; Staar, Peter W J; Yu, Fisher",poster,2308.12234,https://arxiv.org/abs/2308.12234,,https://huggingface.co/papers/2308.12234,,,,8,0
1767
  SCOB: Universal Text Understanding via Character-wise Supervised Contrastive Learning with Online Text Rendering for Bridging Domain Gap,"Kim, Daehee; Kim, Yoonsik*; Kim, DongHyun; Lim, Yumin; Kim, Geewook; Kil, Taeho",poster,,,,,,,,,
 
1993
  Multimodal High-order Relation Transformer for Scene Boundary Detection,"Wei, Xi*; Shi, Zhangxiang; Zhang, Tianzhu; Yu, Xiaoyuan; Xiao, Lei",poster,,,,,,,,,
1994
  Muscles in Action,"Chiquier, Mia*; Vondrick, Carl",poster,2212.02978,https://arxiv.org/abs/2212.02978,,https://huggingface.co/papers/2212.02978,,,,2,0
1995
  Self-Evolved Dynamic Expansion Model for Task-Free Continual Learning,"Ye, Fei*; Bors, Adrian",poster,,,,,,,,,
1996
+ Multi-event Video-Text Retrieval,"Zhang, Gengyuan*; Ren, Jisen; Gu, Jindong; Tresp, Volker",poster,2308.11551,https://arxiv.org/abs/2308.11551,https://github.com/gengyuanmax/MeVTR,https://huggingface.co/papers/2308.11551,,,,4,1
1997
  Referring Image Segmentation Using Text Supervision,"Liu, Fang*; Liu, Yuhao; Kong, Yuqiu; Xu, Ke; Zhang, Lihe; Yin, Baocai ; Hancke, Gerhard P.; Lau, Rynson W.H.",poster,2308.14575,https://arxiv.org/abs/2308.14575,https://github.com/fawnliu/TRIS,https://huggingface.co/papers/2308.14575,,,,8,0
1998
  Audio-Visual Deception Detection: DOLOS Dataset and Parameter-Efficient Crossmodal Learning,"Guo, Xiaobao*; Muthuchamy Selvaraj, Nithish; Yu, Zitong; Kong, Wai-Kin Adams; Shen, Bingquan; Kot, Alex",poster,2303.12745,https://arxiv.org/abs/2303.12745,https://github.com/NMS05/Audio-Visual-Deception-Detection-DOLOS-Dataset-and-Parameter-Efficient-Crossmodal-Learning,https://huggingface.co/papers/2303.12745,,,,6,0
1999
  EMMN: Emotional Motion Memory Network for Audio-driven Emotional Talking Face Generation,"Tan, Shuai; Ji, Bin; pan, ye*",poster,,,,,,,,,
 
2001
  Speech2Lip: High-fidelity Speech to Lip Generation by Learning from a Short Video,"Wu, Xiuzhe; Hu, Pengfei; Wu, Yang*; Lyu, Xiaoyang; Cao, Yan-Pei; Shan, Ying; Yang, Wenming; Sun, Zhongqian; Qi, Xiaojuan",poster,,,,,,,,,
2002
  GrowCLIP: Data-aware Automatic Model Growing for Large-scale Contrastive Language-Image Pre-training,"Deng, Xinchi*; Shi, Han; Huang, Runhui; Li, Changlin; Xu, Hang; Han, Jianhua; Kwok, James; Zhao, Shen; Zhang, Wei; Liang, Xiaodan",poster,2308.11331,https://arxiv.org/abs/2308.11331,,https://huggingface.co/papers/2308.11331,,,,10,0
2003
  A Retrospect to Multi-prompt Learning across Vision and Language,"Chen, Ziliang; Huang, Xin; Guan, Quanlong*; Lin, Liang; Luo, Weiqi",poster,,,,,,,,,
2004
+ ChartReader: A Unified Framework for Chart Derendering and Comprehension without Heuristic Rules,"Cheng , Zhi-Qi; Dai, Qi*; Hauptmann, Alexander ",poster,2304.02173,https://arxiv.org/abs/2304.02173,https://github.com/zhiqic/ChartReader,https://huggingface.co/papers/2304.02173,,,,6,1
2005
  Boosting Multi-modal Model Performance with Adaptive Gradient Modulation,"Li, Hong*; Li, Xingyu; Hu, Pengbo ; Lei, Yinuo; Li, Chunxiao; Zhou, Yi",poster,,,,,,,,,
2006
  ViLLA: Fine-Grained Vision-Language Representation Learning from Real-World Data,"Varma, Maya*; Delbrouck, Jean-Benoit; Hooper, Sarah; Chaudhari, Akshay S; Langlotz, Curtis",poster,2308.11194,https://arxiv.org/abs/2308.11194,,https://huggingface.co/papers/2308.11194,,,,5,0
2007
  Robust Referring Video Object Segmentation with Cyclic Structural Consensus,"Li, Xiang*; Wang, Jinglu; Xu, Xiaohao; Li, Xiao; Raj, Bhiksha; Lu, Yan",poster,,,,,,,,,
 
2065
  StyleInV: A Temporal Style Modulated Inversion Network for Unconditional Video Generation,"Wang, Yuhan*; Jiang, Liming; Loy, Chen Change",poster,2308.16909,https://arxiv.org/abs/2308.16909,,https://huggingface.co/papers/2308.16909,,,,3,0
2066
  3D-Aware Generative Model for Improved Side-View Image Synthesis,"Jo, Kyungmin; Jin, Wonjoon*; Choo, Jaegul; Lee, Hyunjoon; Cho, Sunghyun",poster,,,,,,,,,
2067
  Zero-Shot Contrastive Loss for Text-Guided Diffusion Image Style Transfer,"Yang, Serin*; HWANG, HYUNMIN; Ye, Jong Chul",poster,2303.08622,https://arxiv.org/abs/2303.08622,,https://huggingface.co/papers/2303.08622,,,,3,0
2068
+ FlipNeRF: Flipped Reflection Rays for Few-shot Novel View Synthesis,"Seo, Seunghyeon; Chang, Yeonjin; Kwak, Nojun*",poster,2306.17723,https://arxiv.org/abs/2306.17723,,https://huggingface.co/papers/2306.17723,,,,3,1
2069
  Inverse problem regularization with hierarchical variational autoencoders,"Prost, Jean*; Houdard, Antoine; Almansa, Andres; Papadakis, Nicolas",poster,2303.11217,https://arxiv.org/abs/2303.11217,,https://huggingface.co/papers/2303.11217,,,,4,0
2070
  3D-aware Blending with Generative NeRFs,"Kim, Hyunsu*; Lee, Gayoung; Choi, Yunjey; Kim, Jin-Hwa; Zhu, Jun-Yan",poster,2302.06608,https://arxiv.org/abs/2302.06608,,https://huggingface.co/papers/2302.06608,,,,5,0
2071
  NeMF: Inverse Volume Rendering with Neural Microflake Field,"Zhang, Youjia; Xu, Teng; Yu, Junqing; Ye, YuTeng; Wang, Junle; Jing , Yanqing; Yu, Jingyi; Yang, Wei*",poster,2304.00782,https://arxiv.org/abs/2304.00782,,https://huggingface.co/papers/2304.00782,,,,8,0
 
2100
  WALDO: Future Video Synthesis using Object Layer Decomposition and Parametric Flow Prediction,"Le Moing, Guillaume*; Ponce, Jean; Schmid, Cordelia",poster,2211.14308,https://arxiv.org/abs/2211.14308,,https://huggingface.co/papers/2211.14308,,,,3,1
2101
  Ray Conditioning: Trading Photo-consistency for Photo-realism in Multi-view Image Generation,"Chen, Eric M*; Holalkere, Sidhanth; Yan, Ruyu; Zhang, Kai; Davis, Abe",poster,2304.13681,https://arxiv.org/abs/2304.13681,,https://huggingface.co/papers/2304.13681,,,,5,0
2102
  Text-Conditioned Sampling Framework for Text-to-Image Generation with Masked Generative Models,"Lee, Jaewoong*; Jang, Sangwon; Jo, Jaehyeong; Yoon, Jaehong; Kim, Yunji; Kim, Jin-Hwa; Ha, Jung-Woo; Hwang, Sung Ju",poster,2304.01515,https://arxiv.org/abs/2304.01515,,https://huggingface.co/papers/2304.01515,,,,8,1
2103
+ Efficient Video Prediction via Sparsely Conditioned Flow Matching,"Davtyan, Aram*; Sameni, Sepehr; Favaro, Paolo",poster,2211.14575,https://arxiv.org/abs/2211.14575,,https://huggingface.co/papers/2211.14575,,,,3,1
2104
  Democratising 2D Sketch to 3D Shape Retrieval Through Pivoting.,"Chowdhury, Pinaki Nath*; Bhunia , Ayan Kumar; Sain, Aneeshan; Koley, Subhadeep; Xiang, Tao; Song, Yi-Zhe",poster,,,,,,,,,
2105
  Towards Instance-adaptive Inference for Federated Learning,"Feng, Chun-Mei*; Yu, Kai; Liu, Nian; Xu, Xinxing; Khan, Salman; Zuo, Wangmeng",poster,2308.06051,https://arxiv.org/abs/2308.06051,,https://huggingface.co/papers/2308.06051,,,,6,0
2106
  TransTIC: Transferring Transformer-based Image Compression from Human Visualization to Machine Perception,"Chen, Yi-Hsin; Weng, Ying-Chieh; Kao, Chia Hao; CHIEN, CHENG; Chiu, Wei-Chen; Peng, Wen-Hsiao*",poster,,,,,,,,,