CPath - Canadian Academic Pathfinder

CPath is a specialized language model fine-tuned on Canadian university program information, built to provide accurate guidance about academic programs, admission requirements, and educational pathways at Canadian institutions.

Model Description

CPath is based on TinyLlama and has been specifically fine-tuned on a curated dataset of 23,970 question-answer pairs about Canadian university programs. The model specializes in:

  • Providing detailed program information
  • Explaining admission requirements and processes
  • Describing course structures and academic pathways
  • Offering guidance on university selection
  • Answering specific questions about the covered universities (currently McGill and UBC)

Training Details

  • Base Model: TinyLlama-1.1B
  • Training Data: 23,970 QA pairs from official university sources
  • Universities Covered: McGill University, University of British Columbia
  • Training Approach: Instruction fine-tuning with careful attention to academic accuracy
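
As a rough illustration of what instruction fine-tuning on question-answer pairs involves, the sketch below renders one pair as a single training string. The `[INST] ... [/INST]` template is borrowed from the Usage section; whether the training data used exactly this layout is an assumption, and `format_training_example` is a hypothetical helper, not part of the released code.

```python
# Hypothetical helper: the [INST] ... [/INST] template is taken from the
# Usage section; that training used exactly this layout is an assumption.
def format_training_example(question: str, answer: str) -> str:
    """Render one question-answer pair as a single instruction-tuning string."""
    return f"[INST] {question.strip()} [/INST] {answer.strip()}"


example = format_training_example(
    "Which universities does CPath cover?",
    "McGill University and the University of British Columbia.",
)
print(example)
```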

Intended Uses

This model is designed to assist:

  • Prospective students researching university programs
  • Academic advisors and counselors
  • Educational institutions
  • Anyone seeking accurate information about Canadian university programs

Limitations & Biases

  • Coverage currently limited to McGill and UBC
  • Information cutoff date: 2024
  • Should not be used as the sole source for admission decisions
  • May not cover all specialized programs or requirements
  • Responses should be verified against official university sources
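
Because coverage is limited to two universities, an application may want to check a question before sending it to the model. The following is a minimal sketch of such a guard; the alias list and the names `COVERED_UNIVERSITIES`, `covered_universities_in`, and `is_in_scope` are illustrative assumptions, not part of CPath.

```python
import re

# Assumption: these aliases are illustrative, not an official or complete list.
COVERED_UNIVERSITIES = {
    "McGill University": ("mcgill",),
    "University of British Columbia": ("ubc", "university of british columbia"),
}


def covered_universities_in(question: str) -> list[str]:
    """Return the covered universities that the question mentions by name."""
    text = question.lower()
    found = []
    for name, aliases in COVERED_UNIVERSITIES.items():
        # Word boundaries avoid matching an alias inside a longer word.
        if any(re.search(rf"\b{re.escape(alias)}\b", text) for alias in aliases):
            found.append(name)
    return found


def is_in_scope(question: str) -> bool:
    """True when the question names at least one covered university."""
    return bool(covered_universities_in(question))


print(is_in_scope("What are the admission requirements for Computer Science at McGill?"))
print(is_in_scope("Does the University of Toronto offer a co-op program?"))
```

A keyword check like this is deliberately coarse: a question that names no university at all is treated as out of scope, so a caller might prefer to ask the user which institution they mean rather than reject the question outright.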

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained("houcine-bdk/cpath-academic-search-model", torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained("houcine-bdk/cpath-academic-search-model")

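# Wrap the question in the [INST] ... [/INST] prompt format and generate an answer
def get_response(question):
    prompt = f"[INST] {question} [/INST]"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # Sampling must be enabled for temperature and top_p to take effect
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_length=512,
            do_sample=True,
            temperature=0.7,
            top_p=0.95,
            repetition_penalty=1.15,
        )

    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    # Keep only the text that follows the instruction marker
    return response.split("[/INST]")[-1].strip()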

# Example usage
question = "What are the admission requirements for Computer Science at McGill?"
response = get_response(question)
print(response)
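
For factual lookups, reproducible answers may matter more than varied ones. In `transformers`, `temperature` and `top_p` only take effect when `do_sample=True`, while `do_sample=False` selects greedy decoding. The sketch below builds either set of arguments for `model.generate`; `generation_kwargs` is a hypothetical helper and the default of 256 new tokens is an assumed value, not one taken from the example above.

```python
def generation_kwargs(deterministic: bool = False, max_new_tokens: int = 256) -> dict:
    """Build keyword arguments for model.generate()."""
    # max_new_tokens limits only the generated answer, not the prompt.
    kwargs = {"max_new_tokens": max_new_tokens, "repetition_penalty": 1.15}
    if deterministic:
        # Greedy decoding: the same prompt yields the same answer.
        kwargs["do_sample"] = False
    else:
        # Sampling values copied from the usage example above.
        kwargs.update(do_sample=True, temperature=0.7, top_p=0.95)
    return kwargs


print(generation_kwargs(deterministic=True))
```

Inside `get_response`, the call would then read `model.generate(**inputs, **generation_kwargs(deterministic=True))`.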

Ethical Considerations

  • The model should be used as an informational tool, not as a replacement for official university guidance
  • All information should be verified against official university sources
  • The model may occasionally generate incorrect information and should not be used for critical decisions

Training Data

The model was trained on the Canadian Universities Q&A Dataset, which contains carefully curated information from official university websites. The dataset is available at: houcine-bdk/cpath-mcgill-ubc

License

This model is released under the Apache 2.0 License.

Citation

If you use this model in your research, please cite:

@software{cpath_2025,
    title={CPath: Canadian Academic Pathfinder},
    author={houcine-bdk},
    year={2025},
    publisher={Hugging Face},
    url={https://huggingface.co/houcine-bdk/cpath-academic-search-model}
}

Contact

For questions or issues:

  • Hugging Face: houcine-bdk