---
license: apache-2.0
datasets:
- ILSVRC/imagenet-1k
model-index:
- name: MaskBit-Tokenizer-14bits
  results:
  - task:
      type: image-generation
    dataset:
      name: ILSVRC/imagenet-1k
      type: ILSVRC/imagenet-1k
    metrics:
    - name: rFID
      type: rFID
      value: 1.37
    - name: InceptionScore
      type: InceptionScore
      value: 190.3
    - name: LPIPS
      type: LPIPS
      value: 0.286
    - name: PSNR
      type: PSNR
      value: 21.5
    - name: SSIM
      type: SSIM
      value: 0.56
    - name: CodebookUsage
      type: CodebookUsage
      value: 1.0
---

This model is the MaskBit tokenizer with a 14-bit vocabulary (2^14 = 16,384 possible tokens). It uses a downsampling factor of 16 and is trained on ImageNet at a resolution of 256×256.
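
The numbers above determine the size of the token grid. The following is a minimal sketch of that arithmetic, not code from the MaskBit repository: the function names (`token_grid`, `bits_to_index`, `index_to_bits`) are hypothetical, and the ±1 bit encoding with most-significant-bit-first ordering is an assumption made for illustration.

```python
def token_grid(resolution: int = 256, downsample: int = 16, bits: int = 14) -> dict:
    """Derive token-grid sizes from the model card's stated settings."""
    side = resolution // downsample          # 256 / 16 = 16 tokens per side
    num_tokens = side * side                 # 16 * 16 = 256 tokens per image
    return {
        "grid_side": side,
        "num_tokens": num_tokens,
        "codebook_size": 2 ** bits,          # 2^14 = 16384 possible tokens
        "bits_per_image": num_tokens * bits, # total bits used to describe one image
    }


def bits_to_index(bits: list) -> int:
    """Pack a sequence of +1/-1 (or 1/0) bits into an integer token index.

    Assumption: the first element is the most significant bit.
    """
    index = 0
    for b in bits:
        index = (index << 1) | (1 if b > 0 else 0)
    return index


def index_to_bits(index: int, num_bits: int = 14) -> list:
    """Unpack an integer token index into +1/-1 bits (inverse of bits_to_index)."""
    return [1 if (index >> shift) & 1 else -1 for shift in range(num_bits - 1, -1, -1)]


if __name__ == "__main__":
    print(token_grid())
    print(bits_to_index(index_to_bits(12345)))
```

Under these assumptions, a 256×256 image maps to a 16×16 grid of 256 tokens, each carrying 14 bits.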

You can find more details on the [project page](https://weber-mark.github.io/projects/maskbit.html) and in the [paper](https://arxiv.org/abs/2409.16211).