eliebak HF staff committed on
Commit 31d2d76 · verified · 1 Parent(s): bc52030

add resources 1


A bunch of resources that I like. I will create a discussion tab to encourage people to suggest the best resources when we release the blog.

Files changed (1)
  1. src/index.html +51 -1
src/index.html CHANGED
@@ -2380,6 +2380,11 @@
   <a href="https://arxiv.org/abs/2312.11805"><strong>Gemini</strong></a>
   <p>Presents Google's multimodal model architecture capable of processing text, images, audio, and video inputs.</p>
   </div>
+
+ <div>
+ <a href="https://arxiv.org/abs/2407.21783"><strong>Llama 3</strong></a>
+ <p>The Llama 3 Herd of Models</p>
+ </div>
 
   <div>
   <a href="https://arxiv.org/abs/2412.19437v1"><strong>DeepSeek-V3</strong></a>
@@ -2388,7 +2393,6 @@
 
 
   <h3>Training Frameworks</h3>
-
   <div>
   <a href="https://github.com/facebookresearch/fairscale/tree/main"><strong>FairScale</strong></a>
   <p>PyTorch extension library for large-scale training, offering various parallelism and optimization techniques.</p>
@@ -2441,6 +2445,11 @@
   <p>Comprehensive guide to understanding and optimizing GPU memory usage in PyTorch.</p>
   </div>
 
+ <div>
+ <a href="https://huggingface.co/blog/train_memory"><strong>Memory profiling walkthrough on a simple example</strong></a>
+ <p>Visualize and understand GPU memory in PyTorch.</p>
+ </div>
+
   <div>
   <a href="https://pytorch.org/tutorials/intermediate/tensorboard_profiler_tutorial.html"><strong>TensorBoard Profiler Tutorial</strong></a>
   <p>Guide to using TensorBoard's profiling tools for PyTorch models.</p>
@@ -2502,6 +2511,11 @@
   <a href="https://arxiv.org/abs/1710.03740"><strong>Mixed precision training</strong></a>
   <p>Introduces mixed precision training techniques for deep learning models.</p>
   </div>
+
+ <div>
+ <a href="https://main-horse.github.io/posts/visualizing-6d/"><strong>@main_horse blog</strong></a>
+ <p>Visualizing 6D Mesh Parallelism</p>
+ </div>
 
   <h3>Hardware</h3>
 
@@ -2519,6 +2533,11 @@
   <a href="https://www.semianalysis.com/p/100000-h100-clusters-power-network"><strong>Semianalysis - 100k H100 cluster</strong></a>
   <p>Analysis of large-scale H100 GPU clusters and their implications for AI infrastructure.</p>
   </div>
+
+ <div>
+ <a href="https://modal.com/gpu-glossary/readme"><strong>Modal GPU Glossary</strong></a>
+ <p>CUDA docs for humans</p>
+ </div>
 
   <h3>Others</h3>
 
@@ -2546,6 +2565,37 @@
   <a href="https://www.harmdevries.com/post/context-length/"><strong>Harm's blog for long context</strong></a>
   <p>Investigation into long context training in terms of data and training cost.</p>
   </div>
+
+ <div>
+ <a href="https://www.youtube.com/@GPUMODE/videos"><strong>GPU Mode</strong></a>
+ <p>A GPU reading group and community.</p>
+ </div>
+
+ <div>
+ <a href="https://youtube.com/playlist?list=PLvtrkEledFjqOLuDB_9FWL3dgivYqc6-3&si=fKWPotx8BflLAUkf"><strong>EleutherAI YouTube channel</strong></a>
+ <p>ML Scalability & Performance Reading Group</p>
+ </div>
+
+ <div>
+ <a href="https://jax-ml.github.io/scaling-book/"><strong>Google JAX Scaling Book</strong></a>
+ <p>How to Scale Your Model</p>
+ </div>
+
+ <div>
+ <a href="https://github.com/facebookresearch/capi/blob/main/fsdp.py"><strong>@fvsmassa & @TimDarcet FSDP</strong></a>
+ <p>Standalone ~500-LoC FSDP implementation</p>
+ </div>
+
+ <div>
+ <a href="https://www.thonking.ai/"><strong>thonking.ai</strong></a>
+ <p>Some of Horace He's blog posts</p>
+ </div>
+
+ <div>
+ <a href="https://gordicaleksa.medium.com/eli5-flash-attention-5c44017022ad"><strong>Aleksa's ELI5 Flash Attention</strong></a>
+ <p>Easy explanation of Flash Attention</p>
+ </div>
+
 
   <h2>Appendix</h2>