JRosenkranz committed on
Commit
a24b598
1 Parent(s): 2532bba

added minimal description

Files changed (1)
  1. README.md +9 -0
README.md CHANGED
@@ -2,6 +2,15 @@
2
  license: llama2
3
  ---
4
5
  To try this out running in a production-like environment, please use the pre-built docker image:
6
 
7
  ```bash
 
2
  license: llama2
3
  ---
4
 
5
+ ## Description
6
+
7
+ This model is intended to be used as an accelerator for Llama 13B (chat).
8
+
9
+
10
+ The underlying implementation of the Paged Attention KV-cache and speculator can be found in https://github.com/foundation-model-stack/fms-extras
11
+ A production implementation that uses `fms-extras` can be found in https://github.com/tdoublep/text-generation-inference/tree/speculative-decoding
12
+
13
+
14
  To try this out running in a production-like environment, please use the pre-built docker image:
15
 
16
  ```bash