Hi guys! Thanks for this awesome work!
Could you please help me understand what those huge files are? They take up approximately 70 GB, yet the model has 1B params at 16-bit precision. If my calculations are correct, the plain weight file should be about 2 billion bytes (≈2 GB).
StarCoder is a 15B parameter model. You're probably thinking of SantaCoder, which is 1B. :)
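For reference, here is a rough back-of-the-envelope sketch of the raw weight sizes. It counts parameter storage only and ignores any file-format overhead; the parameter counts are rounded, and `weight_size_gb` is just an illustrative helper name, not something from the repo.

```python
def weight_size_gb(num_params: float, bytes_per_param: int) -> float:
    """Approximate raw weight size in GB (10^9 bytes), ignoring file overhead."""
    return num_params * bytes_per_param / 1e9


# Rounded parameter counts (assumption: 1B for SantaCoder, 15B for StarCoder).
models = [("1B model", 1e9), ("15B model", 15e9)]
# Bytes per parameter for the two common storage precisions.
precisions = [("16-bit", 2), ("32-bit", 4)]

for model_name, num_params in models:
    for precision_name, bytes_per_param in precisions:
        size = weight_size_gb(num_params, bytes_per_param)
        print(f"{model_name}, {precision_name}: {size:.0f} GB")
```

So a 15B model needs about 30 GB at 16-bit and about 60 GB at 32-bit. If the repo stores 32-bit weights, or ships the same weights in more than one file format, that could explain a total near 70 GB, but that part is a guess worth checking against the actual file list.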