"""
This specific file was bodged together by ham-handed hedgehogs. If something looks wrong, it's because it is.
If you're not a hedgehog, you shouldn't reuse this code. Use this instead: https://docs.streamlit.io/library/get-started
"""
import streamlit as st
from st_helpers import make_header, content_text, content_title, cite, make_footer, make_tabs
from charts import draw_current_progress
st.set_page_config(page_title="Training Transformers Together", layout="centered")
st.markdown("## Full demo content will be posted here on December 7th!")
make_header()
content_text(f"""
There was a time when you could comfortably train state-of-the-art vision and language models at home on your workstation.
The first convolutional neural net to beat ImageNet
(AlexNet)
was trained for 5-6 days on two gamer-grade GPUs. In contrast, today's TOP-1 ImageNet model
(CoAtNet)
takes 20,000 TPU-v3 days. And things are even worse in the NLP world: training
GPT‑3 on a top-tier server
with 8x A100 would take decades.""")
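# Hedged back-of-envelope behind the "decades" claim above. Assumptions: the ~3.14e23 FLOPs
# of training compute reported in the GPT-3 paper, the 312 TFLOPS peak FP16 tensor-core
# throughput of a single A100, and a low effective utilization, since a 175B-parameter model
# does not fit in 8 GPUs' memory and must be offloaded, which slows training further.
gpt3_training_flops = 3.14e23               # total training compute reported for GPT-3
server_peak_flops = 8 * 312e12              # 8x A100 server, ideal scaling, FP16 tensor cores
utilization = 0.15                          # optimistic once CPU/NVMe offloading is required
training_years = gpt3_training_flops / (server_peak_flops * utilization) / (365 * 24 * 3600)
# => roughly 25+ years under these assumptions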
content_text(f"""
So, can individual researchers and small labs still train state-of-the-art? Yes we can!
All it takes is for a bunch of us to come together. In fact, we're doing it right now and you're invited to join!
""", vspace_before=12)
draw_current_progress()
content_text(f"""
We're training a model similar to OpenAI DALL-E,
that is, a transformer "language model" that generates images from text description.
It is trained on LAION-400M,
the world's largest openly available image-text-pair dataset with 400 million samples. Our model is based on
the dalle‑pytorch implementation
by Phil Wang with several tweaks for memory-efficient training.""")
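# Illustration for readers: a minimal sketch of how a dalle-pytorch-style model is built and
# trained, based on the public dalle-pytorch README. This is an assumption shown for context
# only, not our exact training setup; our hyperparameters and memory-saving tweaks differ.
st.code('''
import torch
from dalle_pytorch import DiscreteVAE, DALLE

# Discrete VAE that compresses each image into a grid of codebook tokens
vae = DiscreteVAE(
    image_size=256,
    num_layers=3,
    num_tokens=8192,
    codebook_dim=512,
)

# Transformer "language model" over concatenated [text tokens; image tokens]
dalle = DALLE(
    dim=512,
    vae=vae,
    num_text_tokens=10000,
    text_seq_len=256,
    depth=12,
    heads=16,
)

# One training step on a dummy batch: predict image tokens conditioned on text tokens
text = torch.randint(0, 10000, (1, 256))
images = torch.randn(1, 3, 256, 256)
loss = dalle(text, images, return_loss=True)
loss.backward()
''', language="python")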
content_title("How do I join?")
content_text("""
That's easy. First, make sure you're logged in at Hugging Face. If you don't have an account, create one TODO.