seungminh/zero123-pro_v1.0

seungminh committed on Jan 26, 2024

Commit adcc4b9 • 1 Parent(s): 7f9bd4c

Update README.md

Files changed (1)

README.md +50 -0

@@ -1,3 +1,53 @@
 ---
 license: mit
 ---

 ---
+datasets:
+- allenai/objaverse-xl
+tags:
+- 3d
+extra_gated_fields:
+  Name: text
+  Email: text
+  Country: text
+  Organization or Affiliation: text
+  I ALLOW Stability AI to email me about new model releases: checkbox
 license: mit
+license_name: sai-nc-community
+pipeline_tag: text-to-3d
 ---
+# Zero123-pro_v1
+## Model Description
+Zero123-pro is a fine-tuned model for *high-resolution* view-conditioned image generation based on [Zero123](https://github.com/cvlab-columbia/zero123).
+The model currently targets 512x512 resolution. We are still looking for the best way to train at high resolution, because convergence is difficult.
+This model is currently fine-tuned only on a *chair* dataset; a foundation model suitable for e-commerce will be released later.
+## Usage
+Use a config file modified from the one in the original Zero123 code base.
+Our model has an output resolution of 512, and the corresponding latent size is 64. Therefore, set the `first_stage_config` resolution to 512 and `image_size` to 64.
+For good quality, use an input image with a 1:1 aspect ratio.
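+A minimal sketch of the two config overrides described above, assuming the config keeps the usual latent-diffusion YAML nesting (`model.params.image_size` and `model.params.first_stage_config.params.ddconfig.resolution`). The nesting is an assumption, so check it against your copy of the config and leave every other key unchanged:
+```yaml
+model:
+  params:
+    image_size: 64          # latent size for 512x512 output
+    first_stage_config:
+      params:
+        ddconfig:
+          resolution: 512   # output resolution in pixels
+```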
+## Model Details
+* **Developed by**: Seungmin Ha, Yeonju Kim
+* **Model type**: latent diffusion model.
+* **Finetuned from model**: [lambdalabs/sd-image-variations-diffusers](https://huggingface.co/lambdalabs/sd-image-variations-diffusers)
+* **License**: We have released the first version of Zero123-pro.
+    * Some of the data used in **Zero123-pro** cannot be used for commercial purposes, but it can be used for research purposes.
+According to our internal tests, both models perform similarly in terms of the visual quality of their predictions.
+### Training Infrastructure
+* **Hardware**: `Zero123-pro` was trained on a single cluster node with 8 A100 80GiB GPUs.
+* **Code Base**: We use our modified version of [the original zero123 repository](https://github.com/cvlab-columbia/zero123).
+### Misuse, Malicious Use, and Out-of-Scope Use
+The model should not be used to intentionally create or disseminate images that create hostile or alienating environments for people. This includes generating images that people would foreseeably find disturbing, distressing, or offensive; or content that propagates historical or current stereotypes.