Michael Benayoun
michaelbenayoun
AI & ML interests
None yet
Articles
Organizations
Collections
1
Papers and resources related to distributed training.
- PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel
  Paper • 2304.11277
- Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
  Paper • 1909.08053
- Reducing Activation Recomputation in Large Transformer Models
  Paper • 2205.05198
- GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism
  Paper • 1811.06965
Models
9
michaelbenayoun/llama-2-tiny-4kv-heads-2layers-random • Feature Extraction • 187
michaelbenayoun/llama-2-tiny-4kv-heads-8layers-random • Feature Extraction • 459
michaelbenayoun/llama-2-tiny-4kv-heads-4layers-random • Feature Extraction • 3.51k
michaelbenayoun/llama-2-tiny-4kv-heads-16layers-random • Feature Extraction • 3.12k
michaelbenayoun/llama-2-tiny-16layers-random • Feature Extraction • 4.34k
michaelbenayoun/llama-2-tiny-16layers-32kv-heads-random • Feature Extraction • 56
michaelbenayoun/gpt-neox-tiny-4layers-random • Feature Extraction • 381
michaelbenayoun/mistral-tiny-4layers-8kv-heads-random • Text Generation • 165
michaelbenayoun/llama-2-tiny-4layers-random • Text Generation • 14
Datasets
None public yet