---
license: creativeml-openrail-m
datasets:
  - GAIR/o1-journey
language:
  - en
base_model:
  - Qwen/Qwen2.5-0.5B-Instruct
library_name: transformers
pipeline_tag: text-generation
tags:
  - Qwen2.5
  - Llama-Cpp
  - CoT
  - o1-journey
  - text-generation-inference
  - safetensors
  - Ollama
---

# Acrux-500M-o1-Journey Model Files

The Acrux-500M-o1-Journey is a lightweight, instruction-tuned language model fine-tuned from the Qwen2.5-0.5B-Instruct base model. With 500 million parameters, it is designed for cost-effective deployment and fast text generation while maintaining solid quality on instruction-following tasks.
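
As a quick sanity check, the parameter count can be verified after loading the checkpoint. The snippet below is a minimal sketch and assumes the model loads with the standard Transformers auto classes.

```python
from transformers import AutoModelForCausalLM

# Load the checkpoint and report its parameter count in millions.
model = AutoModelForCausalLM.from_pretrained("prithivMLmods/Acrux-500M-o1-Journey")
num_params = sum(p.numel() for p in model.parameters())
print(f"{num_params / 1e6:.0f}M parameters")
```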

| File Name | Size | Description | Upload Status |
| --- | --- | --- | --- |
| .gitattributes | 1.57 kB | Git attributes for managing LFS files. | Uploaded |
| README.md | 195 Bytes | Model overview or documentation. | Updated |
| added_tokens.json | 657 Bytes | Custom tokens for the tokenizer. | Uploaded |
| config.json | 859 Bytes | Model configuration file. | Uploaded |
| generation_config.json | 280 Bytes | Configuration for text generation. | Uploaded |
| merges.txt | 1.82 MB | Merge rules for byte-pair encoding (BPE). | Uploaded |
| pytorch_model.bin | 988 MB | Model weights (PyTorch format). | Uploaded (LFS) |
| special_tokens_map.json | 644 Bytes | Mapping for special tokens. | Uploaded |
| tokenizer.json | 11.4 MB | Full tokenizer configuration. | Uploaded (LFS) |
| tokenizer_config.json | 7.73 kB | Additional tokenizer settings. | Uploaded |
| vocab.json | 2.78 MB | Vocabulary for the tokenizer. | Uploaded |
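
To fetch all of the files listed above in one step, the `huggingface_hub` helper below can be used. This is a hedged sketch; the local directory name is only an illustrative choice.

```python
from huggingface_hub import snapshot_download

# Download every file in the repository (large weights come via LFS).
# "acrux-500m" is an arbitrary local folder name.
local_dir = snapshot_download(
    repo_id="prithivMLmods/Acrux-500M-o1-Journey",
    local_dir="acrux-500m",
)
print("Files downloaded to:", local_dir)
```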

## Key Features

  1. Compact Size with Efficient Performance:
    The smaller parameter count (500M) ensures faster inference and reduced hardware requirements.

  2. Instruction Optimization:
    Fine-tuned to follow prompts effectively, making it well suited for interactive applications and prompt-based tasks (see the chat-prompt sketch after this list).

  3. Domain-Specific Training:
    Trained on the GAIR/o1-journey dataset, providing tailored capabilities for specific use cases.
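
A minimal sketch of the instruction-following behaviour, assuming the tokenizer keeps the chat template inherited from Qwen2.5-Instruct (the prompt text is only an example):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/Acrux-500M-o1-Journey"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Wrap a user instruction in the chat template inherited from the Qwen2.5 base.
messages = [{"role": "user", "content": "List three uses of a small language model."}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```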


## Training Details

The model was fine-tuned from the Qwen/Qwen2.5-0.5B-Instruct base model on the GAIR/o1-journey dataset using the Transformers library.

## Capabilities

  1. Instruction Following:
    • Generates accurate and coherent responses to user instructions.
    • Handles summarization, question-answering, and conversational tasks.

  2. Fast Inference:
    • Ideal for real-time applications thanks to the reduced latency of its smaller size (see the timing sketch after this list).

  3. Interactive AI Development:
    • Suitable for chatbots, virtual assistants, and instructional interfaces.
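
As a rough illustration of the latency point above, the timing sketch below measures a single short generation. The prompt and token budget are arbitrary, and actual numbers depend on the hardware.

```python
import time
from transformers import pipeline

# Build a text-generation pipeline around the model (runs on CPU by default).
generator = pipeline("text-generation", model="prithivMLmods/Acrux-500M-o1-Journey")

start = time.perf_counter()
result = generator("Summarize: small models trade capacity for speed.", max_new_tokens=64)
elapsed = time.perf_counter() - start

print(result[0]["generated_text"])
print(f"Generated in {elapsed:.2f}s")
```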

## Usage Instructions

  1. Setup:
    Install a recent version of the Hugging Face Transformers library and download the model files listed above, or load the model directly from the Hub by name as shown in the next step.

  2. Loading the Model:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Load the tokenizer and model weights from the Hugging Face Hub.
    model_name = "prithivMLmods/Acrux-500M-o1-Journey"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    
  3. Sample Text Generation:

    input_text = "Explain the concept of machine learning in simple terms."
    inputs = tokenizer(input_text, return_tensors="pt")
    # Enable sampling so the temperature setting takes effect.
    outputs = model.generate(**inputs, max_length=100, do_sample=True, temperature=0.7)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
    
  4. Optimize Generation:
    Adjust parameters in generation_config.json, or override them at generation time (see the sketch after this list), for better control of the output, such as:

    • temperature for randomness.
    • top_p for sampling diversity.
    • max_length for output size.
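
As referenced in step 4, generation settings can also be overridden at call time instead of editing generation_config.json. The values below are illustrative, and the snippet reuses `model`, `tokenizer`, and `inputs` from the earlier steps.

```python
from transformers import GenerationConfig

# Illustrative overrides; tune these for your use case.
gen_config = GenerationConfig(
    do_sample=True,
    temperature=0.7,  # randomness
    top_p=0.9,        # sampling diversity
    max_length=150,   # output size (prompt + generated tokens)
)

outputs = model.generate(**inputs, generation_config=gen_config)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```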