Commit 8039890 (parent: a5e29e6): Update app.py

app.py CHANGED
@@ -123,15 +123,14 @@ with gr.Blocks(theme='NoCrypt/miku') as demo:
             """)
         with gr.Tab("Memory Calculation"):
         #with gr.TabItem("Memory Calculation"):
-
-
-
-
-
-
-
-
-            """)
+            gr.Markdown("""
+            ## Memory Calculation
+
+            Memory Calculation calculates the amount of device memory required to train or infer a model. See [Transformers Math 101](https://blog.eleuther.ai/transformer-math/) for more details on how memory overhead is calculated.
+            Take this estimation with a grain of salt, because every implementation is different and these calculations were written to match the GPT-NeoX library as closely as possible.
+            Even for other training and inference libraries, however, we expect our script to give approximate memory estimations within acceptable error.
+            (Please see [LLM finetuning memory requirements](https://blog.scottlogic.com/2023/11/24/llm-mem.html) for a treatment of how specific memory costs may vary framework-to-framework.) Other good resources that we consulted are the [ZeRO paper](https://arxiv.org/abs/1910.02054) and [Reducing Activation Recomputation in Large Transformer Models](https://arxiv.org/pdf/2205.05198.pdf).
+            """)
         with gr.Accordion("How to use it?", open=False):
             gr.Markdown("""
             ## To Use
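The kind of estimate the added Markdown describes can be sketched in a few lines. This is an illustrative approximation only, not this app's actual implementation: the constants (fp16 weights, Adam optimizer states, a ~20% inference overhead factor) follow the rules of thumb in Transformers Math 101, and the function names are hypothetical.

```python
def inference_memory_gb(n_params: float,
                        bytes_per_param: int = 2,
                        overhead: float = 1.2) -> float:
    """Rough inference memory: fp16 weights plus ~20% overhead
    (rule of thumb from Transformers Math 101; not this app's code)."""
    return n_params * bytes_per_param * overhead / 1e9


def training_memory_gb(n_params: float) -> float:
    """Rough mixed-precision Adam training memory per parameter:
    2 B fp16 weights + 2 B fp16 gradients + 12 B fp32 optimizer
    states (master copy, momentum, variance), excluding activations."""
    return n_params * (2 + 2 + 12) / 1e9


# Example: a 7B-parameter model
print(round(inference_memory_gb(7e9), 1))  # ~16.8 GB
print(round(training_memory_gb(7e9), 1))   # ~112 GB
```

Activation memory is deliberately left out here; as the ZeRO and activation-recomputation papers discuss, it depends on batch size, sequence length, and recomputation strategy rather than on parameter count alone.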