Submitted by Richard ZHou. Full Attention Strikes Back: Transferring Full Attention into Sparse within Hundred Training Steps. RTP-LLM