---
license: apache-2.0
base_model: jan-hq/LlamaCorn-1.1B-Chat
tags:
- alignment-handbook
- trl
- sft
- generated_from_trainer
datasets:
- jan-hq/systemchat_binarized
- jan-hq/youtube_transcripts_qa
- jan-hq/youtube_transcripts_qa_ext
model-index:
- name: TinyJensen-1.1B-Chat
  results: []
pipeline_tag: text-generation
widget:
- messages:
  - role: user
    content: Tell me about NVIDIA in 20 words
---

# Model description

- Finetuned [LlamaCorn-1.1B-Chat](https://huggingface.co/jan-hq/LlamaCorn-1.1B-Chat) further to act like Jensen Huang, CEO of NVIDIA.
- Use this model with caution because it can make you laugh.

# Prompt template

ChatML

```
<|im_start|>system
You are Jensen Huang, CEO of NVIDIA<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```

# Run this model

You can run this model using [Jan Desktop](https://jan.ai/) on Mac, Windows, or Linux. Jan is an open-source ChatGPT alternative that is:

- 💻 **100% offline on your machine**: Your conversations remain confidential and visible only to you.
- 🗂️ **An Open File Format**: Conversations and model settings stay on your computer and can be exported or deleted at any time.
- 🌐 **OpenAI Compatible**: Local server on port `1337` with OpenAI-compatible endpoints (see the usage sketch at the end of this card).
- 🌍 **Open Source & Free**: We build in public; check out our [GitHub](https://github.com/janhq).

![image/png](https://cdn-uploads.huggingface.co/production/uploads/65713d70f56f9538679e5a56/r7VmEBLGXpPLTu2MImM7S.png)

# About Jan

Jan believes in the need for an open-source AI ecosystem and is building the infrastructure and tooling to let open-source AIs compete on a level playing field with proprietary ones.

Jan's long-term vision is to build a cognitive framework for future robots that are practical, useful assistants for humans and businesses in everyday life.

# Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 5

# Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.8226        | 1.0   | 207  | 0.8232          |
| 0.6608        | 2.0   | 414  | 0.7941          |
| 0.526         | 3.0   | 621  | 0.8186          |
| 0.4388        | 4.0   | 829  | 0.8643          |
| 0.3888        | 5.0   | 1035 | 0.8771          |

# Framework versions

- Transformers 4.37.2
- Pytorch 2.1.2+cu121
- Datasets 2.14.6
- Tokenizers 0.15.0
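# Example: ChatML prompt with Transformers

A minimal sketch of how the ChatML prompt template above could be used with Hugging Face Transformers. The repository id `jan-hq/TinyJensen-1.1B-Chat` and the generation settings are assumptions for illustration; adjust them to your setup.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jan-hq/TinyJensen-1.1B-Chat"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

messages = [
    {"role": "system", "content": "You are Jensen Huang, CEO of NVIDIA"},
    {"role": "user", "content": "Tell me about NVIDIA in 20 words"},
]

# apply_chat_template renders the ChatML format (<|im_start|> ... <|im_end|>)
# if the tokenizer ships a chat template; otherwise format the prompt manually
# following the template shown above.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```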
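# Example: calling the local OpenAI-compatible server

A minimal sketch of how the Jan local server mentioned above could be called through the OpenAI Python client. The base URL path and the local model name are assumptions; check Jan's local API settings for the exact values on your machine.

```python
from openai import OpenAI

# Jan exposes OpenAI-compatible endpoints on port 1337; no real API key is needed locally.
client = OpenAI(base_url="http://localhost:1337/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="tinyjensen-1.1b-chat",  # assumed local model id as it appears in Jan
    messages=[
        {"role": "system", "content": "You are Jensen Huang, CEO of NVIDIA"},
        {"role": "user", "content": "Tell me about NVIDIA in 20 words"},
    ],
)
print(response.choices[0].message.content)
```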