metadata
license: cc-by-nc-4.0
datasets:
  - grammarly/coedit
language:
  - en
tags:
  - text-generation-inference
  - candle
widget:
  - text: >-
      Fix the grammar: When I grow up, I start to understand what he said is
      quite right.
    example_title: Fluency
  - text: >-
      Make this text coherent: Their flight is weak. They run quickly through
      the tree canopy.
    example_title: Coherence
  - text: >-
      Rewrite to make this easier to understand: A storm surge is what
      forecasters consider a hurricane's most treacherous aspect.
    example_title: Simplification
  - text: 'Paraphrase this: Do you know where I was born?'
    example_title: Paraphrase
  - text: >-
      Write this more formally: omg i love that song im listening to it right
      now
    example_title: Formalize
  - text: 'Write in a more neutral way: The authors'' exposé on nutrition studies.'
    example_title: Neutralize

Quantized candle weights for the CoEdIT model

Quantized weights of CoEdIT for inference with candle.

Usage

You can run the smaller models directly from the browser using this space.

Clone candle, and run the quantized-t5 example:

$ cargo run --example quantized-t5 --release  -- \
  --model-id "jbochi/candle-coedit-quantized" \
  --prompt "Make this text coherent: Their flight is weak. They run quickly through the tree canopy." \
  --temperature 0
...
 Although their flight is weak, they run quickly through the tree canopy.

By default, the example uses CoEdIT-large with q6k quantization (770M params, 643 MB).

To use CoEdIT-xl (3B params, 2.34 GB), or any other provided model, pass both --weight-file and --config-file:

$ cargo run --example quantized-t5 --release  -- \
  --model-id "jbochi/candle-coedit-quantized" \
  --weight-file "model-xl.gguf" \
  --config-file "config-xl.json" \
  --prompt "Rewrite to make this easier to understand: Note that a storm surge is what forecasters consider a hurricane's most treacherous aspect." \
  --temperature 0
...
 Note that a storm surge is what forecasters consider a hurricane's most dangerous part.

Models available

These are all the available formats. Each weight file is named {model}.gguf, and the matching config file is named config-{base_model}.json.

| Model | Base model | Quantization | # Params | Size |
| --- | --- | --- | --- | --- |
| - | small (unofficial) | None | 77M | 308 MB |
| model-small | small | 6k | 77M | 78.2 MB |
| model-small-q4k | small | 4k | 77M | 59.6 MB |
| model-small-q4_0 | small | 4_0 | 77M | 43.4 MB |
| - | base (unofficial) | None | 248M | 990 MB |
| model-base | base | 6k | 248M | 194 MB |
| model-base-q4k | base | 4k | 248M | 133 MB |
| model-base-q4_0 | base | 4_0 | 248M | 133 MB |
| - | large | None | 770M | 3.13 GB |
| model | large | 6k | 770M | 643 MB |
| model-q4k | large | 4k | 770M | 441 MB |
| model-q4_0 | large | 4_0 | 770M | 441 MB |
| - | xl | None | 3B | 11.4 GB |
| model-xl | xl | 6k | 3B | 2.34 GB |
| model-xl-q4k | xl | 4k | 3B | 1.6 GB |
| model-xl-q4_0 | xl | 4_0 | 3B | 1.6 GB |
| - | xxl | None | 11B | 44.5 GB |
| model-xxl | xxl | 6k | 11B | 9.14 GB |
| model-xxl-q4k | xxl | 4k | 11B | 6.27 GB |
| model-xxl-q4_0 | xxl | 4_0 | 11B | 6.27 GB |
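
As a rough illustration of the naming rule above, this shell sketch derives the two file names from a row of the table. The function name `coedit_files` is hypothetical and not part of candle. It applies the {model}.gguf / config-{base_model}.json pattern literally, so check the repository's file list before relying on a generated name.

```shell
# Hypothetical helper (not part of candle): apply the stated naming rule
# to get the values for --weight-file and --config-file.
coedit_files() {
  model="$1"       # a name from the Model column, e.g. model-xl-q4k
  base_model="$2"  # the matching Base model column, e.g. xl
  printf '%s.gguf config-%s.json\n' "$model" "$base_model"
}

# Prints the weight file and config file names for the 4k xl variant.
coedit_files model-xl-q4k xl
```

The two printed names can then be passed to the quantized-t5 example as --weight-file and --config-file, in that order. For the default large q6k model, neither flag is needed.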

Model generation

The weights were quantized using candle:

cargo run --example tensor-tools --release -- quantize \
  --quantization q6k \
  /path/to/coedit-<version>/model.safetensors \
  --out-file model<version>.gguf
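
To produce every variant listed in the table for one model size, the command above could be repeated per quantization. The following is a dry-run sketch that only prints the commands. The function name `quantize_cmds` is hypothetical, the `q4k` and `q4_0` flag values are assumed by analogy with `q6k`, and the output names follow the pattern in the Model column (no suffix for q6k).

```shell
# Dry-run sketch: print one tensor-tools command per quantization in the table.
# Assumption: q4k and q4_0 are accepted by --quantization the same way q6k is.
quantize_cmds() {
  src="$1"      # path to the unquantized model.safetensors (placeholder)
  version="$2"  # file-name infix such as "-xl"; empty string for large
  for q in q6k q4k q4_0; do
    # In the table, q6k files carry no quantization suffix; the others do.
    if [ "$q" = "q6k" ]; then suffix=""; else suffix="-$q"; fi
    echo "cargo run --example tensor-tools --release -- quantize --quantization $q $src --out-file model${version}${suffix}.gguf"
  done
}

# Example with a placeholder input path.
quantize_cmds /path/to/model.safetensors -xl
```

Printing the commands first makes it easy to review the output file names before running anything inside a candle checkout.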