justheuristic committed
Commit: 06a624f
Parent(s): 4ee0173

plot tweaks

Files changed (3):
  1. app.py +11 -7
  2. charts.py +2 -1
  3. st_helpers.py +1 -0
app.py CHANGED
@@ -34,7 +34,7 @@ All it takes is for a bunch of us to come together. In fact, we're doing it righ
 draw_current_progress()
 
 content_text(f"""
-We're training a model similar to {cite("OpenAI DALL-E", "https://openai.com/blog/dall-e/")},
+For this demo we train a model similar to {cite("OpenAI DALL-E", "https://openai.com/blog/dall-e/")},
 that is, a transformer "language model" that generates images from text description.
 It is trained on {cite("LAION-400M", "https://laion.ai/laion-400-open-dataset/")},
 the world's largest openly available image-text-pair dataset with 400 million samples. Our model is based on
@@ -47,12 +47,12 @@ with st.expander("How to train efficiently over the internet?"):
 content_text(f"""
 Modern distributed training algorithms are designed for HPC networks with 10-100 gigabit per second bandwidth.
 In turn, a typical Internet connection runs at 10-100 megabits per second: that’s three orders of magnitude slower.
-To make distributed training over the Internet efficient, you need to win back these three orders of magnitude.
+To make distributed training efficient, you need to win back these three orders of magnitude.
+This may seem daunting at first, but in reality, DL researchers have already made all the necessary pieces for solving this puzzle:
 """)
 content_text(f"""
-This may seem daunting at first, but in reality, DL researchers have already made all the necessary pieces for solving this puzzle:
 <table style="border: 0px;"><tbody style="border: 0px;">
-<tr><td> Speed-up (AllReduce)<br> </td> <td>Existing technique</td></tr>
+<tr><td> Speed&#8209;up <br> </td> <td>How to achieve</td></tr>
 <tr><td class=centered><strong>4-16x</strong></td><td>
 <strong>Large-batch training:</strong> {cite("You et al. (2019)", "https://arxiv.org/abs/1904.00962")} proposed a way for training neural networks efficiently with larger batches, and hence, fewer communication rounds.
 </td></tr>
@@ -77,12 +77,16 @@ This may seem daunting at first, but in reality, DL researchers have already mad
 </td></tr>
 </tbody></table>
 """)
-
+content_text("""
+These techniques are already more than enough to cover 1000x slower communication (totalling to 655…),
+so you can pick and choose which techniques to use. In this demo, we use parameter sharing to reduce the number of parameters by
+roughly 12x. If you don’t want parameter sharing, you can instead use more advanced gradient compression or larger batches.
+""")
 
 content_title("How do I join?")
 
-content_text("""
-That's easy. First, make sure you're logged in at Hugging Face. If you don't have an account, create one <b>TODO</b>.<br>
+content_text(f"""
+That's easy. First, make sure you're logged in at Hugging Face. If you don't have an account, create one {cite("here", "https://huggingface.co/join")}.<br>
 
 <ul style="text-align: left; list-style-position: inside; margin-top: 12px; margin-left: -24px;">
 <li style="margin-top: 4px;">
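For readers unfamiliar with the parameter sharing the added paragraph mentions, here is a minimal PyTorch sketch of ALBERT-style weight tying (an illustration only, not the demo's actual model code): one transformer layer is reused at every depth step, so a 12-step model stores only a single layer's worth of unique weights.

```python
import torch
import torch.nn as nn

class SharedLayerTransformer(nn.Module):
    """Toy encoder that applies one shared layer `depth` times (ALBERT-style)."""

    def __init__(self, d_model: int = 256, n_heads: int = 4, depth: int = 12):
        super().__init__()
        # One set of layer weights, reused `depth` times: roughly depth-fold
        # fewer unique parameters than a stack of independent layers.
        self.layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.depth = depth

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for _ in range(self.depth):
            x = self.layer(x)  # same weights at every step
        return x

model = SharedLayerTransformer()
print(model(torch.randn(2, 16, 256)).shape)  # torch.Size([2, 16, 256])
```

Fewer parameters means proportionally less gradient traffic per synchronization round, which is why sharing is one way to "win back" the bandwidth gap described above.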
charts.py CHANGED
@@ -11,6 +11,7 @@ def draw_current_progress():
     st.vega_lite_chart(
         source, {
             "height": 200,
+            "width": 600,
             "title": {
                 "text": "Training DALL-E with volunteers (updated every few minutes during NeurIPS 2021)",
                 "dy": 6,
@@ -36,7 +37,7 @@ def draw_current_progress():
             },
         ],
         },
-        use_container_width=True,
+        use_container_width=False,  # breaks on <600px screens
     )
 
 
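For context, here is a self-contained sketch of the chart call after this change. The sample DataFrame and the simplified line-mark spec are illustrative stand-ins (the demo's real spec and live training metrics are not shown in the diff); only the `height`/`width` values and the `use_container_width` flag come from the commit.

```python
import pandas as pd
import streamlit as st

# Illustrative stand-in for the demo's live training metrics.
source = pd.DataFrame({"wall time": [0, 1, 2, 3], "training loss": [6.2, 5.1, 4.4, 4.0]})

st.vega_lite_chart(
    source,
    {
        "height": 200,
        "width": 600,  # fixed width instead of stretching to the container
        "title": {"text": "Training DALL-E with volunteers", "dy": 6},
        "mark": "line",
        "encoding": {
            "x": {"field": "wall time", "type": "quantitative"},
            "y": {"field": "training loss", "type": "quantitative"},
        },
    },
    use_container_width=False,  # a fixed 600px chart overflows on narrower screens
)
```

The trade-off is explicit in the inline comment: a fixed 600px width keeps the axis labels legible, but the chart no longer shrinks to fit screens narrower than 600px.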
st_helpers.py CHANGED
@@ -50,5 +50,6 @@ def content_text(text: str, vspace_before: int = 0, vspace_after: int = 0):
         f'{text}</div><center>',
         unsafe_allow_html=True)
 
+
 def cite(tag, link):
     return f"""<a target="_blank" rel="noopener noreferrer" href="{link}">{tag}</a>"""
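For reference, this is how the two helpers compose in app.py: `cite()` builds an anchor tag that opens in a new tab, and `content_text()` renders it as raw HTML inside a centered block.

```python
from st_helpers import content_text, cite

# Renders a centered paragraph whose link opens in a new tab,
# matching the usage pattern seen throughout app.py above.
content_text(f"""
It is trained on {cite("LAION-400M", "https://laion.ai/laion-400-open-dataset/")}.
""")
```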