This repository contains two branches: sharded and combined.
sharded
combined
The sharded branch was saved via FSDP on two nodes; the combined branch holds the non-sharded version of the weights.
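
To fetch only one of the two branches, a minimal sketch along the following lines may help. It assumes the repository is reachable over git; the URL shown is a hypothetical placeholder, and `clone_command` is an illustrative helper, not part of this repository.

```python
import subprocess

# The two branch names come from this README.
BRANCHES = ("sharded", "combined")


def clone_command(repo_url, branch, dest=None):
    """Build a git command that clones a single branch of the repository."""
    if branch not in BRANCHES:
        raise ValueError(f"unknown branch {branch!r}, expected one of {BRANCHES}")
    cmd = ["git", "clone", "--branch", branch, "--single-branch", repo_url]
    if dest is not None:
        cmd.append(dest)
    return cmd


if __name__ == "__main__":
    # Hypothetical URL: replace it with the real location of this repository.
    cmd = clone_command("https://example.com/your-org/your-model.git", "combined")
    print(" ".join(cmd))
    # Uncomment to actually run the clone:
    # subprocess.run(cmd, check=True)
```

Cloning with `--single-branch` avoids downloading the branch you do not need. If the weights are stored with Git LFS, `git lfs` may also need to be installed before cloning.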