llm-course-hw1 / config.json
Added small model with RoPE and MHLA
{
"dropout": 0.1,
"hidden_dim": 768,
"intermediate_dim": 3072,
"max_seq_len": 128,
"n_head": 12,
"n_kv_head": 12,
"n_layer": 12,
"vocab_size": 1024
}
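
For reference, a minimal sketch of how these fields might map onto a decoder-only transformer with RoPE, assuming standard interpretations of the names. The `ModelConfig` class, the `head_dim` derivation, and the RoPE helpers below are illustrative assumptions, not code from this repo. Note that `n_kv_head == n_head` here, so attention reduces to plain multi-head attention; the separate field would only matter under grouped-query or latent-attention variants.

```python
import json
from dataclasses import dataclass

import torch


@dataclass
class ModelConfig:
    # Field names mirror config.json; the comments are assumed meanings.
    dropout: float          # dropout probability inside the blocks
    hidden_dim: int         # model (embedding) width
    intermediate_dim: int   # feed-forward inner width (4 * hidden_dim here)
    max_seq_len: int        # longest position RoPE must cover
    n_head: int             # number of query heads
    n_kv_head: int          # key/value heads (== n_head here, i.e. plain MHA)
    n_layer: int            # number of transformer blocks
    vocab_size: int         # tokenizer vocabulary size

    @property
    def head_dim(self) -> int:
        return self.hidden_dim // self.n_head  # 768 / 12 = 64


def rope_cache(cfg: ModelConfig, base: float = 10000.0):
    """Precompute RoPE cos/sin tables, shape (max_seq_len, head_dim // 2)."""
    inv_freq = 1.0 / (base ** (torch.arange(0, cfg.head_dim, 2).float() / cfg.head_dim))
    pos = torch.arange(cfg.max_seq_len).float()
    angles = torch.outer(pos, inv_freq)
    return angles.cos(), angles.sin()


def apply_rope(x: torch.Tensor, cos: torch.Tensor, sin: torch.Tensor) -> torch.Tensor:
    """Rotate channel pairs of queries/keys; x is (..., seq_len, head_dim)."""
    x1, x2 = x[..., 0::2], x[..., 1::2]
    seq_len = x.shape[-2]
    c, s = cos[:seq_len], sin[:seq_len]
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * c - x2 * s
    out[..., 1::2] = x1 * s + x2 * c
    return out


# Usage sketch: load this file and rotate a dummy query tensor.
with open("config.json") as f:
    cfg = ModelConfig(**json.load(f))

cos, sin = rope_cache(cfg)
q = torch.randn(1, cfg.n_head, cfg.max_seq_len, cfg.head_dim)
q_rot = apply_rope(q, cos, sin)  # same shape, positions now encoded
```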