Created basic README.md
README.md
ADDED
@@ -0,0 +1,2 @@
+This is the Dolly v2 model, sharded to reduce the CPU memory required to load it onto the GPU and to shorten the load time.
+With sharding, you don't need to load the full model into CPU RAM before sending it to the GPU; you load it shard by shard.
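To illustrate the shard-by-shard idea the README describes, here is a minimal, library-free sketch. It is not the code used to produce this repository: `save_sharded`, `load_sharded`, the `shard-*.bin` file names, and `index.json` are all hypothetical, and `to_device` is a stand-in for a real host-to-GPU copy. The point it shows is that peak staging memory on the host is bounded by the largest shard rather than by the whole model.

```python
import json
import os
import pickle
import tempfile


def save_sharded(weights, out_dir, max_shard_bytes):
    """Split `weights` (name -> bytes) into shard files.

    Shards stay at or under `max_shard_bytes` where possible; a single
    tensor larger than the limit gets a shard of its own.
    """
    shards, current, current_size = [], {}, 0
    for name, blob in weights.items():
        # Start a new shard when adding this tensor would exceed the limit.
        if current and current_size + len(blob) > max_shard_bytes:
            shards.append(current)
            current, current_size = {}, 0
        current[name] = blob
        current_size += len(blob)
    if current:
        shards.append(current)

    index = {}  # tensor name -> shard file that holds it
    for i, shard in enumerate(shards):
        fname = f"shard-{i:05d}.bin"
        with open(os.path.join(out_dir, fname), "wb") as f:
            pickle.dump(shard, f)
        for name in shard:
            index[name] = fname
    with open(os.path.join(out_dir, "index.json"), "w") as f:
        json.dump(index, f)
    return len(shards)


def load_sharded(in_dir, to_device):
    """Load shards one at a time; return (device_weights, peak_staging_bytes)."""
    with open(os.path.join(in_dir, "index.json")) as f:
        index = json.load(f)

    device_weights, peak = {}, 0
    for fname in sorted(set(index.values())):
        with open(os.path.join(in_dir, fname), "rb") as f:
            shard = pickle.load(f)  # only this one shard is staged on the host
        peak = max(peak, sum(len(blob) for blob in shard.values()))
        for name, blob in shard.items():
            # Hypothetical stand-in for copying a tensor to the GPU.
            device_weights[name] = to_device(blob)
        del shard  # drop the staged copy before reading the next shard
    return device_weights, peak


if __name__ == "__main__":
    # Toy "model": 8 tensors of 1000 bytes each.
    weights = {f"layer{i}.weight": bytes(1000) for i in range(8)}
    with tempfile.TemporaryDirectory() as tmp:
        n_shards = save_sharded(weights, tmp, max_shard_bytes=2000)
        loaded, peak = load_sharded(tmp, to_device=bytearray)
    total = sum(len(b) for b in weights.values())
    print(f"shards={n_shards} total_bytes={total} peak_staging_bytes={peak}")
```

In this simulation the "device" copies still live in host memory, so only the staging figure is meaningful. With Hugging Face Transformers, the comparable knobs would likely be `max_shard_size` when saving and `low_cpu_mem_usage=True` when loading, though the README does not say which settings were used here.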