---
license: mit
tags:
- mamba
- pytorch
- Test Generation
- research abstract
datasets: pt-sk/research_papers_short
metrics: CrossEntropyLoss
---

This model uses the Mamba architecture and was trained on a dataset of research abstracts.

* Optimizer: AdamW
* Learning rate: 0.001

Import the scripts from the code folder:

```
from model import Mamba, ModelArgs
```

Loading the model:

```
mamba_model = Mamba.from_pretrained("pt-sk/mamba").to("cuda")
```

Loading the tokenizer:

```
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained('pt-sk/mamba')
```

The `mamba_reserach` file contains the state dicts of the optimizer and the model.
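
To resume training, a checkpoint that holds both state dicts is usually restored into a freshly built model and optimizer. The following is a minimal sketch of that pattern, not the repository's actual loading code: the key names `model_state_dict` and `optimizer_state_dict` are assumptions (inspect the real checkpoint's keys first), and a small `nn.Linear` stands in for the `Mamba` class so the snippet runs on its own. Only the optimizer (AdamW) and learning rate (0.001) are taken from this card.

```python
import io

import torch
from torch import nn

# Stand-in for the Mamba model so the sketch is self-contained.
model = nn.Linear(4, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=0.001)

# Hypothetical checkpoint layout: one dict holding both state dicts.
checkpoint = {
    "model_state_dict": model.state_dict(),
    "optimizer_state_dict": optimizer.state_dict(),
}

# Save to an in-memory buffer here; a real checkpoint would be a file path.
buffer = io.BytesIO()
torch.save(checkpoint, buffer)
buffer.seek(0)

# Restore into a freshly constructed model and optimizer.
restored = torch.load(buffer, map_location="cpu")
new_model = nn.Linear(4, 2)
new_optimizer = torch.optim.AdamW(new_model.parameters(), lr=0.001)
new_model.load_state_dict(restored["model_state_dict"])
new_optimizer.load_state_dict(restored["optimizer_state_dict"])
```

With the real checkpoint, the buffer would be replaced by the path to the downloaded file, and `nn.Linear` by a `Mamba` instance built with the matching `ModelArgs`.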