I cannot seem to understand how the compute-optimal line is computed

#6
by vhug - opened

How is the compute-optimal line plotted? Which search strategy is used to plot the green line in the final optimal-scaling graph for Llama 3.1 8B?
