
WikiChat-v0.2

A work-in-progress model trained to have conversations.

The uploaded GGUFs are full FP32 precision.

Trained on OpenOrca GPT-4 data, plus Cosmopedia for additional data and Dolly-15k for instruction following.

Model Details:

  • 83.59M parameters (83,591,800)
  • 8 attention heads
  • 40 layers
  • 384 embedding size
  • 4096/8192/16384 context (use 2x/4x RoPE scaling for the longer contexts; a 16k fine-tuned version may be trained later; see the loading sketch after this list)
  • Batch size 16
  • llama.cpp (train-text-from-scratch)
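The extended contexts rely on linear RoPE scaling applied at load time. A minimal loading sketch using llama-cpp-python (an assumption; any llama.cpp frontend with RoPE options works), with a hypothetical GGUF filename:

from llama_cpp import Llama

# Hypothetical filename; substitute the GGUF you downloaded from this repo.
llm = Llama(
    model_path="wikichat-v0.2-f32.gguf",
    n_ctx=8192,            # 2x the native 4096 context
    rope_freq_scale=0.5,   # linear RoPE scaling: 0.5 gives 2x context (0.25 for 16384)
    n_gpu_layers=-1,       # full GPU offload, as in the setup under Training Details
)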

Prompt Format (Alpaca):

Instruction: {system}
Input: {prompt}
Response: {response}

Please structure your prompts in an instruct format for maximum performance.
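A sketch of building the prompt exactly as shown above and generating a response, reusing the llm object from the loading sketch; the stop string and sampling settings are assumptions, not part of this card:

def build_prompt(system: str, prompt: str) -> str:
    # Mirrors the template shown above; the response is left empty
    # so the model completes it.
    return f"Instruction: {system}\nInput: {prompt}\nResponse:"

text = build_prompt("You are a helpful assistant.", "What is the square root of 4?")
out = llm(text, max_tokens=64, temperature=0.7, stop=["Instruction:"])
print(out["choices"][0]["text"].strip())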

Training Details:

  • 1x RTX 3070 8GB (inference speed: 80 tok/s with full GPU offload)
  • 1x Ryzen 3 3700X
  • 96 GB RAM
  • 10 iterations
  • Loss Target = 2.5 to 3.0
  • Approx. 480 samples / 1M training tokens (>0.0001 epochs)
  • Training data = refer to the OpenOrca dataset page

Notes:

The model isn't ready yet; this release is meant to test tokenization of OpenOrca and to find a balance between training speed and model size.

Example output:

User: What is the square root of 4?
Assistant: The square root of 4 is 2.