---
license: cc-by-sa-3.0
language:
- en
pipeline_tag: text-generation
tags:
- csharp
- mpt
- instruct
- 1b
- llm
- .net
---

Upsides:
- Similar quality (slightly worse) to the 7b [Nethermind/Mpt-Instruct-DotNet-S](https://huggingface.co/Nethermind/Mpt-Instruct-DotNet-S) for C# code generation and explanation
- 1b parameters (2.6gb, finetuned in bfloat16)
- 6x smaller
- 4x+ faster

Downsides:
- Sometimes produces repetitive responses that do not terminate when answering general discussion questions
- Slightly worse at code generation than the 7b model
- No GGML/llama.cpp support for running on CPU yet

Based on [mosaicml/mpt-1b-redpajama-200b-dolly](https://huggingface.co/mosaicml/mpt-1b-redpajama-200b-dolly)

Same data sources as in [Nethermind/Mpt-Instruct-DotNet-S](https://huggingface.co/Nethermind/Mpt-Instruct-DotNet-S)

Usage example:

```python
import torch
import transformers
from transformers import AutoTokenizer

model_name = "Nethermind/Mpt-Instruct-DotNet-XS"
model = transformers.AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
model.to('cuda:0')
model.eval()

# The tokenizer is loaded from EleutherAI/gpt-neox-20b; it has no pad token, so reuse EOS.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
tokenizer.pad_token = tokenizer.eos_token

INSTRUCTION_KEY = "### Instruction:"
RESPONSE_KEY = "### Response:"
# Fill in the fixed keys now; {system} and {instruction} stay as placeholders.
PROMPT_FOR_GENERATION_FORMAT = """{system}
{instruction_key}
{instruction}
{response_key}
""".format(
    system="{system}",
    instruction_key=INSTRUCTION_KEY,
    instruction="{instruction}",
    response_key=RESPONSE_KEY,
)


def output_loop(input_tokens, max_context=1024, max_new_tokens=512):
    # Cap generation so that prompt + output stays within max_context tokens.
    prompt_length = input_tokens.shape[1]
    new_tokens = min(max_new_tokens, max_context - prompt_length)
    print(prompt_length, new_tokens)
    with torch.no_grad():
        # Greedy decoding: top_k/top_p have no effect when do_sample=False, so they are omitted.
        return model.generate(
            input_tokens.to('cuda:0'),
            max_new_tokens=new_tokens,
            do_sample=False,
        )


def give_answer(
    instruction="Create a loop over [0, 6, 7, 77] that prints its contents",
    system="Below is an instruction that describes a task. Write a response that appropriately completes the request.",
):
    question = PROMPT_FOR_GENERATION_FORMAT.format(system=system, instruction=instruction)
    tokenized_question = tokenizer.encode(question, return_tensors='pt')
    outputs = output_loop(tokenized_question)
    # The decoded sequences contain the prompt followed by the generated response.
    answer = tokenizer.batch_decode(outputs, skip_special_tokens=True)
    print(answer)
    return answer


give_answer("What is the main difference between a struct and a class in C#?")
```

outputs:

```
A struct is a value type, which means it can only hold a few values. It is often used as a placeholder for other data types. A class, on the other hand, is a reference type, which means it can hold references to other data types.
```

Generation time on an RTX 4090 by number of new tokens:
- 2 sec for 128 tokens
- 5 sec for 256 tokens
- 11 sec for 512 tokens

Code generation:

prompt:

> Generate code to answer the question.
>
> How would you retrieve and analyse the fee history for the last 100 blocks and determine the average gas price?

Example of code output:

```csharp
public async Task GetFeeHistoryGasPriceAverage()
{
    // Get the fee history
    ResultWrapper result = await _ethRpc.eth_feeHistory(100, BlockParameter.Latest, new double[] { 50, 75, 90 });

    // Check if the API call succeeded
    if (result.Result!= Result.Success)
    {
        throw new Exception("Failed to retrieve fee history");
    }

    // Get the gas price average
    decimal averageGasPrice = result.Data.BaseFeePerGas.Average();

    return averageGasPrice;
}
```
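
The Python usage example decodes the full generated sequence, so the returned text contains the prompt followed by the answer. The sketch below shows one possible way to keep only the answer by splitting on the `### Response:` marker from the prompt template. The helper name `extract_response` and the sample string are hypothetical and not part of the model's code; the sketch also assumes the instruction text itself does not contain the marker.

```python
RESPONSE_KEY = "### Response:"  # same marker as in the prompt template


def extract_response(decoded: str, response_key: str = RESPONSE_KEY) -> str:
    """Return the text after the first response marker.

    Falls back to the whole (stripped) text if the marker is absent.
    """
    _, marker, tail = decoded.partition(response_key)
    return tail.strip() if marker else decoded.strip()


# Hypothetical decoded sequence, shaped like the prompt template plus an answer.
decoded = (
    "Below is an instruction that describes a task.\n"
    "### Instruction:\n"
    "What is a struct?\n"
    "### Response:\n"
    "A struct is a value type.\n"
)
print(extract_response(decoded))
```

Since `give_answer` returns a list of decoded strings, the helper could be applied as `[extract_response(a) for a in give_answer(...)]`.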