---
license: creativeml-openrail-m
datasets:
  - GAIR/o1-journey
language:
  - en
base_model:
  - Qwen/Qwen2.5-0.5B-Instruct
library_name: transformers
pipeline_tag: text-generation
tags:
  - Qwen2.5
  - Llama-Cpp
  - CoT
  - o1-journey
  - text-generation-inference
  - safetensors
  - Ollama
---

# Acrux-500M-o1-Journey Model Files

The Acrux-500M-o1-Journey is a lightweight, instruction-tuned language model fine-tuned from the Qwen2.5-0.5B-Instruct base model. With 500 million parameters, it is designed for cost-effective deployment and fast text generation while maintaining solid quality on instruction-following tasks.
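
As a quick sanity check, the parameter count can be verified after loading the checkpoint. The snippet below is a minimal sketch and assumes the model loads with the standard Transformers auto classes.

```python
from transformers import AutoModelForCausalLM

# Load the checkpoint and report its parameter count in millions.
model = AutoModelForCausalLM.from_pretrained("prithivMLmods/Acrux-500M-o1-Journey")
num_params = sum(p.numel() for p in model.parameters())
print(f"{num_params / 1e6:.0f}M parameters")
```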

| File Name | Size | Description | Upload Status |
| --- | --- | --- | --- |
| .gitattributes | 1.57 kB | Git attributes for managing LFS files. | Uploaded |
| README.md | 195 Bytes | Model overview or documentation. | Updated |
| added_tokens.json | 657 Bytes | Custom tokens for the tokenizer. | Uploaded |
| config.json | 859 Bytes | Model configuration file. | Uploaded |
| generation_config.json | 280 Bytes | Configuration for text generation. | Uploaded |
| merges.txt | 1.82 MB | Merge rules for byte-pair encoding (BPE). | Uploaded |
| pytorch_model.bin | 988 MB | Model weights (PyTorch format). | Uploaded (LFS) |
| special_tokens_map.json | 644 Bytes | Mapping for special tokens. | Uploaded |
| tokenizer.json | 11.4 MB | Full tokenizer configuration. | Uploaded (LFS) |
| tokenizer_config.json | 7.73 kB | Additional tokenizer settings. | Uploaded |
| vocab.json | 2.78 MB | Vocabulary for the tokenizer. | Uploaded |
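
To fetch all of the files listed above in one step, the `huggingface_hub` helper below can be used. This is a hedged sketch; the local directory name is only an illustrative choice.

```python
from huggingface_hub import snapshot_download

# Download every file in the repository (large weights come via LFS).
# "acrux-500m" is an arbitrary local folder name.
local_dir = snapshot_download(
    repo_id="prithivMLmods/Acrux-500M-o1-Journey",
    local_dir="acrux-500m",
)
print("Files downloaded to:", local_dir)
```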

## Key Features

  1. Compact Size with Efficient Performance:
    The smaller parameter count (500M) ensures faster inference and reduced hardware requirements.

  2. Instruction Optimization:
    Fine-tuned to follow prompts effectively, making it well suited for interactive applications and prompt-based tasks (see the chat-prompt sketch after this list).

  3. Domain-Specific Training:
    Trained on the GAIR/o1-journey dataset, providing tailored capabilities for specific use cases.
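
A minimal sketch of the instruction-following behaviour, assuming the tokenizer keeps the chat template inherited from Qwen2.5-Instruct (the prompt text is only an example):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/Acrux-500M-o1-Journey"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Wrap a user instruction in the chat template inherited from the Qwen2.5 base.
messages = [{"role": "user", "content": "List three uses of a small language model."}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```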


## Training Details

The model was fine-tuned from the Qwen/Qwen2.5-0.5B-Instruct base model on the GAIR/o1-journey dataset using the Transformers library.

## Capabilities

  1. Instruction Following:
    • Generates accurate and coherent responses to user instructions.
    • Handles summarization, question-answering, and conversational tasks.

  2. Fast Inference:
    • Ideal for real-time applications thanks to the reduced latency of its smaller size (see the timing sketch after this list).

  3. Interactive AI Development:
    • Suitable for chatbots, virtual assistants, and instructional interfaces.
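
As a rough illustration of the latency point above, the timing sketch below measures a single short generation. The prompt and token budget are arbitrary, and actual numbers depend on the hardware.

```python
import time
from transformers import pipeline

# Build a text-generation pipeline around the model (runs on CPU by default).
generator = pipeline("text-generation", model="prithivMLmods/Acrux-500M-o1-Journey")

start = time.perf_counter()
result = generator("Summarize: small models trade capacity for speed.", max_new_tokens=64)
elapsed = time.perf_counter() - start

print(result[0]["generated_text"])
print(f"Generated in {elapsed:.2f}s")
```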

## Usage Instructions

  1. Setup:
    Install a recent version of the Hugging Face Transformers library and download the model files listed above, or load the model directly from the Hub by name as shown in the next step.

  2. Loading the Model:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Load the tokenizer and model weights from the Hugging Face Hub.
    model_name = "prithivMLmods/Acrux-500M-o1-Journey"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    
  3. Sample Text Generation:

    input_text = "Explain the concept of machine learning in simple terms."
    inputs = tokenizer(input_text, return_tensors="pt")
    # Enable sampling so the temperature setting takes effect.
    outputs = model.generate(**inputs, max_length=100, do_sample=True, temperature=0.7)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
    
  4. Optimize Generation:
    Adjust parameters in generation_config.json, or override them at generation time (see the sketch after this list), for better control of the output, such as:

    • temperature for randomness.
    • top_p for sampling diversity.
    • max_length for output size.
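
As referenced in step 4, generation settings can also be overridden at call time instead of editing generation_config.json. The values below are illustrative, and the snippet reuses `model`, `tokenizer`, and `inputs` from the earlier steps.

```python
from transformers import GenerationConfig

# Illustrative overrides; tune these for your use case.
gen_config = GenerationConfig(
    do_sample=True,
    temperature=0.7,  # randomness
    top_p=0.9,        # sampling diversity
    max_length=150,   # output size (prompt + generated tokens)
)

outputs = model.generate(**inputs, generation_config=gen_config)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```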