spellfix
README.md
CHANGED
@@ -18,7 +18,7 @@ tags:

 This repo contains llamafile format model files for [Maykeye/TinyLLama-v0](https://huggingface.co/Maykeye/TinyLLama-v0) that is a recreation of [roneneldan/TinyStories-1M](https://huggingface.co/roneneldan/TinyStories-1M) which was part of this very interesting research paper called [TinyStories: How Small Can Language Models Be and Still Speak Coherent English?](https://arxiv.org/abs/2305.07759) by Ronen Eldan and Yuanzhi Li.

-In the paper this is
+In the paper this is their abstract

 > Language models (LMs) are powerful tools for natural language processing, but they often struggle to produce coherent and fluent text when they are small. Models with around 125M parameters such as GPT-Neo (small) or GPT-2 (small) can rarely generate coherent and consistent English text beyond a few words even after extensive training. This raises the question of whether the emergence of the ability to produce coherent English text only occurs at larger scales (with hundreds of millions of parameters or more) and complex architectures (with many layers of global attention).

@@ -28,9 +28,9 @@ In the paper this is there abstract

 > We hope that TinyStories can facilitate the development, analysis and research of LMs, especially for low-resource or specialized domains, and shed light on the emergence of language capabilities in LMs.

-Maykeye's replication effort while didn't get down to 1M parameters, Maykeye did get
+Maykeye's replication effort while didn't get down to 1M parameters, Maykeye did get down to 5M parameters which is still quite an achievement (in so far as known replication effort has shown so far).

-Anyway, this
+Anyway, this conversion to llamafile should give you an easy way to give this model a shot and also of the whole llamafile ecosystem in general (as it's quite quite small compared to other larger chat capable models). As a tradeoff however, this is more of a text generation model, so while it will open up a webserver as part of llamafile, it would not chat with you as expected. Instead you would give it a story prompt and it will generate a story for you. Don't expect any great stories for this size however, but it's an interesting demo on how small you can squeeze AI models and still have it generate recognisable english.

 ## Usage In Linux
