Add/improve model card for LVSM

#1
by nielsr (HF Staff) - opened
Files changed (1)
  1. README.md +61 -3
README.md CHANGED
@@ -1,3 +1,61 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ pipeline_tag: image-to-image
+ library_name: pytorch
+ ---
+
+ # LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias
+
+ This repository contains a re-implementation of the LVSM model described in [LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias](https://arxiv.org/abs/2410.17242). The provided checkpoints are from the original Adobe implementation.
+
+ Project Page: https://haian-jin.github.io/projects/LVSM/
+
+ ## Checkpoints
+
+ Scene-level evaluation is conducted on the [RealEstate10K](https://google.github.io/realestate10k/) dataset.
+
+ | Model | PSNR ↑ | SSIM ↑ | LPIPS ↓ |
+ |-------|--------|--------|---------|
+ | [LVSM Decoder-Only Scene-Level res256 (full)](https://huggingface.co/coast01/LVSM/resolve/main/scene_decoder_only_256.pt?download=true) | 29.67 | 0.906 | 0.098 |
+ | [LVSM Encoder-Decoder Scene-Level res256 (full)](https://huggingface.co/coast01/LVSM/resolve/main/scene_encoder_decoder_256.pt?download=true) | 28.60 | 0.893 | 0.114 |
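+
+ You can also fetch a checkpoint programmatically with `huggingface_hub` (a minimal sketch; the repo id and filenames below are taken from the download links above):
+
+ ```python
+ from huggingface_hub import hf_hub_download
+
+ # Download the decoder-only scene-level checkpoint from the Hub;
+ # returns the local path of the cached file.
+ checkpoint_path = hf_hub_download(
+     repo_id="coast01/LVSM",
+     filename="scene_decoder_only_256.pt",
+ )
+ print(checkpoint_path)
+ ```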
+
+
+ ## How to use
+
+ The example below sketches how to load a pre-trained LVSM Decoder-Only checkpoint with `torch`. Install the dependencies of the re-implementation you are using and download a checkpoint first; the exact import path and constructor arguments depend on that codebase.
+
+ ```python
+ import torch
+
+ # NOTE: the import path below is an assumption; adjust it to match the
+ # layout of the LVSM re-implementation you are using.
+ from lvsm.models.lvsm import LVSM
+
+ # Load the checkpoint weights (assumed to be a torch-saved state dict)
+ checkpoint_path = "path/to/your/checkpoint.pt"  # e.g. scene_decoder_only_256.pt
+ state_dict = torch.load(checkpoint_path, map_location="cpu")
+
+ # Instantiate the model and load the weights; pass whatever config your
+ # re-implementation expects.
+ model = LVSM()
+ model.load_state_dict(state_dict)
+ model.eval()
+
+ # Move the model to GPU if available
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+ model.to(device)
+
+ # Inference: LVSM maps posed input views plus target-view camera rays
+ # directly to the target image, without an explicit 3D representation.
+ # input_views = ...   # replace with your source images and camera parameters
+ # with torch.no_grad():
+ #     target_view = model(input_views)
+ ```
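+
+ To compare rendered views against ground truth with the metrics reported above (PSNR, SSIM, LPIPS), `torchmetrics` provides standard implementations. A minimal sketch; matching the paper's exact evaluation protocol (resolution, normalization, LPIPS backbone) is an assumption here:
+
+ ```python
+ import torch
+ from torchmetrics.image import PeakSignalNoiseRatio, StructuralSimilarityIndexMeasure
+ from torchmetrics.image.lpip import LearnedPerceptualImagePatchSimilarity
+
+ # Metrics over image batches of shape (B, 3, H, W) with values in [0, 1]
+ psnr = PeakSignalNoiseRatio(data_range=1.0)
+ ssim = StructuralSimilarityIndexMeasure(data_range=1.0)
+ lpips = LearnedPerceptualImagePatchSimilarity(net_type="vgg", normalize=True)
+
+ pred = torch.rand(1, 3, 256, 256)    # rendered target view (placeholder)
+ target = torch.rand(1, 3, 256, 256)  # ground-truth target view (placeholder)
+
+ print(f"PSNR:  {psnr(pred, target):.2f}")
+ print(f"SSIM:  {ssim(pred, target):.3f}")
+ print(f"LPIPS: {lpips(pred, target):.3f}")
+ ```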
+
+ ## Citation
+
+ ```bibtex
+ @inproceedings{jin2025lvsm,
+     title={LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias},
+     author={Haian Jin and Hanwen Jiang and Hao Tan and Kai Zhang and Sai Bi and Tianyuan Zhang and Fujun Luan and Noah Snavely and Zexiang Xu},
+     booktitle={The Thirteenth International Conference on Learning Representations},
+     year={2025},
+     url={https://openreview.net/forum?id=QQBPWtvtcn}
+ }
+ ```
+
+ ## Acknowledgement
+
+ We thank Kalyan Sunkavalli for helpful discussions and support. This work was done while Haian Jin, Hanwen Jiang, and Tianyuan Zhang were research interns at Adobe Research. It was also partly funded by the National Science Foundation (IIS-2211259, IIS-2212084).