metadata
license: apache-2.0
datasets:
  - ILSVRC/imagenet-1k
model-index:
  - name: MaskBit-Tokenizer-14bits
    results:
      - task:
          type: image-generation
        dataset:
          name: ILSVRC/imagenet-1k
          type: ILSVRC/imagenet-1k
        metrics:
          - name: rFID
            type: rFID
            value: 1.37
          - name: InceptionScore
            type: InceptionScore
            value: 190.3
          - name: LPIPS
            type: LPIPS
            value: 0.286
          - name: PSNR
            type: PSNR
            value: 21.5
          - name: SSIM
            type: SSIM
            value: 0.56
          - name: CodebookUsage
            type: CodebookUsage
            value: 1

This model is the MaskBit tokenizer with a vocabulary size of 14 bits, i.e. 2^14 = 16,384 possible tokens. It uses a downsampling factor of 16 and is trained on ImageNet at a resolution of 256×256, so each image is encoded as a 16×16 grid of 256 tokens.
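The numbers above can be checked with a few lines of arithmetic. The following is a minimal sketch in plain Python, not the MaskBit code: the helper names `token_grid`, `vocab_size`, and `bits_to_index` are hypothetical, and the most-significant-bit-first ordering in `bits_to_index` is an assumption chosen only for illustration.

```python
def token_grid(resolution: int, downsample: int):
    """Return (side length, total tokens) of the latent grid for a square image."""
    side = resolution // downsample
    return side, side * side


def vocab_size(bits: int) -> int:
    """Number of distinct tokens representable with the given number of bits."""
    return 2 ** bits


def bits_to_index(bits) -> int:
    """Pack a sequence of 0/1 values into one integer index.

    Assumption: the first element is the most significant bit. The actual
    bit ordering used by the tokenizer may differ.
    """
    index = 0
    for b in bits:
        index = (index << 1) | int(b)
    return index


side, num_tokens = token_grid(resolution=256, downsample=16)
print(side, num_tokens)   # 16 256
print(vocab_size(14))     # 16384
print(num_tokens * 14)    # 3584 bits per encoded image
```

Under these assumptions, a 256×256 image is represented by 256 tokens of 14 bits each, or 3,584 bits in total.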

You can find more details on the project page and in the paper.