---
license: apache-2.0
datasets:
- HuggingFaceFW/fineweb
language:
- en
library_name: transformers
tags:
- IoT
- sensor
- embedded
---

# TinyLLM

## Overview

This repository hosts a small language model developed as part of the TinyLLM framework ([arxiv link]). The models are designed for and trained with sensor data to support embedded sensing applications, and they are small enough to be hosted locally on low-power devices such as single-board computers. They are based on the GPT-2 architecture and were trained on NVIDIA H100 GPUs. This repository provides base models that can be fine-tuned further for specific downstream tasks in embedded sensing.

## Model Information

- **Parameters:** 101M (Hidden Size = 704)
- **Architecture:** Decoder-only transformer
- **Training Data:** Up to 10B tokens from the [SHL](http://www.shl-dataset.org/) and [FineWeb](https://huggingface.co/datasets/HuggingFaceFW/fineweb) datasets, combined in a 4:6 ratio
- **Input and Output Modality:** Text
- **Context Length:** 1024
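
To make the 4:6 mixing ratio concrete, the sketch below splits a 10B-token budget between the two corpora. It assumes the ratio is given in the order the datasets are listed (SHL first, FineWeb second), which the card does not state explicitly; `split_token_budget` is a hypothetical helper written for illustration, not part of the TinyLLM code.

```python
from fractions import Fraction


def split_token_budget(total_tokens: int, ratio: tuple[int, int] = (4, 6)) -> tuple[int, int]:
    """Split a token budget between two corpora according to an integer ratio."""
    first, second = ratio
    # Exact rational arithmetic avoids floating-point rounding on large counts.
    share = Fraction(first, first + second)
    first_tokens = int(total_tokens * share)
    return first_tokens, total_tokens - first_tokens


# Assumption: SHL is the "4" side and FineWeb the "6" side of the 4:6 ratio.
shl_tokens, fineweb_tokens = split_token_budget(10_000_000_000)
print(f"SHL: {shl_tokens:,} tokens, FineWeb: {fineweb_tokens:,} tokens")
```

Because the card says "up to 10B tokens", these figures are upper bounds under that assumption rather than exact counts.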

## Acknowledgements

We acknowledge the open-source frameworks [llm.c](https://github.com/karpathy/llm.c) and [llama.cpp](https://github.com/ggerganov/llama.cpp), as well as the sensor dataset provided by SHL, all of which were instrumental in training and testing these models.

## Usage

The model can be used in two primary ways:

1. **With Hugging Face's Transformers Library**
   ```python
   from transformers import pipeline
   import torch

   path = "tinyllm/101M-0.4"
   prompt = "The sea is blue but it's his red sea"

   # Build a text-generation pipeline; device_map="auto" needs the `accelerate` package.
   generator = pipeline(
       "text-generation",
       model=path,
       max_new_tokens=30,
       repetition_penalty=1.3,
       model_kwargs={"torch_dtype": torch.bfloat16},
       device_map="auto",
   )
   print(generator(prompt)[0]["generated_text"])
   ```

2. **With llama.cpp**

   Generate a GGUF model file with this [conversion script](https://github.com/ggerganov/llama.cpp/blob/master/convert_hf_to_gguf.py), then use the generated GGUF file for inference.
   ```bash
   python3 convert_hf_to_gguf.py models/mymodel/
   ```
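
With either route, the prompt and the newly generated tokens share the 1024-token context listed under Model Information. The sketch below shows one way to trim an over-long prompt before generation. It is a minimal illustration, not the authors' method: `fit_prompt` is a hypothetical helper, and keeping the most recent tokens (rather than the earliest) is an assumed design choice.

```python
CONTEXT_LENGTH = 1024  # context length stated in the Model Information section


def fit_prompt(
    token_ids: list[int],
    max_new_tokens: int,
    context_length: int = CONTEXT_LENGTH,
) -> list[int]:
    """Keep the most recent prompt tokens so prompt + generation fits the context."""
    if not 0 < max_new_tokens < context_length:
        raise ValueError("max_new_tokens must be between 1 and context_length - 1")
    # Tokens left for the prompt once room for generation is reserved.
    budget = context_length - max_new_tokens
    return token_ids[-budget:]


# Hypothetical token ids standing in for a tokenized sensor log.
long_prompt = list(range(2000))
trimmed = fit_prompt(long_prompt, max_new_tokens=30)
print(len(trimmed))
```

Short prompts pass through unchanged, so the helper can be applied unconditionally before calling the model.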

## Disclaimer

This model is intended solely for research purposes.