🧠 Model Card: LLaMA 3 8B Instruct – Conversational Roleplay Enhanced

Model Name: lex-au/vocalis-q4_k_m.gguf
Base Model: Meta LLaMA 3 8B Instruct
Fine-Tuned By: Lex (lex-au)
Model Type: Causal Decoder Transformer
License: Non-commercial (see Meta LLaMA 3 license)


✨ Overview

vocalis-q4_k_m.gguf is a fine-tuned and quantised variant of Meta’s LLaMA 3 8B Instruct model, purpose-built for immersive roleplaying and low-latency conversational AI in the Vocalis web app. This model tracks spatial and temporal context across turns, enabling natural, persistent narrative flows and highly dynamic character interactions.

Designed for low-resource environments, it balances responsiveness with rich generative quality, making it ideal for real-time applications.


🛠 Fine-Tuning Objective

  • Purpose: Enable vivid storytelling and contextually grounded interactions for roleplay-based use cases.
  • Focus Areas:
    • Spatial awareness (tracking objects, locations, movement)
    • Temporal sequencing (event continuity, time-of-day, memory within thread)
    • Expressive language generation with adaptable tone
  • Fine-Tuning Strategy: LoRA-based adaptation on multi-turn dialogues, fantasy/sci-fi roleplay transcripts, and narrative world-state tracking prompts.
  • Deployment Goal: Minimal latency and a low memory footprint, optimised for running inside Vocalis.
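
The LoRA strategy above can be sketched numerically: the base weights stay frozen while a low-rank update is learned. The dimensions, rank, and scaling below are illustrative values, not the actual training configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

d, r = 64, 8    # hidden size and LoRA rank (illustrative, not the real config)
alpha = 16      # LoRA scaling factor (illustrative)

W = rng.normal(size=(d, d))          # frozen base weight
A = rng.normal(size=(r, d)) * 0.01   # trainable down-projection
B = rng.normal(size=(d, r)) * 0.01   # trainable up-projection
                                     # (zero-initialised in practice; random
                                     #  here so the rank property is visible)

# The adapted layer computes x @ (W + (alpha / r) * B @ A).T; only A and B
# are trained, so the learned update always has rank <= r.
delta = (alpha / r) * B @ A
W_adapted = W + delta

assert np.linalg.matrix_rank(delta) <= r
```

Because only the small `A` and `B` matrices receive gradients, this keeps fine-tuning memory far below full-parameter training, which fits the low-resource deployment goal stated above.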

🗣 Use Cases

  • Text-based RPG narration
  • Interactive fiction engines
  • AI-driven NPCs for browser games
  • Persistent-world simulation bots
  • Web-deployed immersive chat agents

✅ Strengths

  • Maintains spatial & narrative continuity across turns
  • Generates richly descriptive, tone-sensitive prose
  • Adapts easily to character tone and dynamic environments
  • Ideal for 1st-person, 2nd-person, or multi-speaker scenarios

⚠️ Known Limitations

  • Context window constraints may cause memory loss over extended turns
  • Can over-describe or hallucinate detail in ambiguous settings
  • Not suited for task-based or factual Q&A applications
  • No long-term memory or external knowledge integration
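
The context-window limitation above is typically handled client-side by trimming older turns before each request. A minimal sketch; the word-based token count is a rough proxy, and the budget is an illustrative value, not this model's actual context limit:

```python
def trim_history(messages, max_tokens=6000,
                 count=lambda m: len(m["content"].split())):
    """Keep the system prompt plus the most recent turns that fit the budget."""
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]
    budget = max_tokens - sum(count(m) for m in system)
    kept = []
    for msg in reversed(turns):           # walk newest -> oldest
        cost = count(msg)
        if cost > budget:
            break
        kept.append(msg)
        budget -= cost
    return system + list(reversed(kept))  # restore chronological order
```

Pinning the system prompt while dropping the oldest turns preserves the character persona at the cost of older scene details, which matches the memory-loss behaviour described above.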

💬 Example Interaction

User:
You enter the crumbling temple. Where’s the light coming from?

Assistant:
As you step into the ancient temple, a soft, ethereal glow emanates from within its walls. The light seems to seep through the weathered stone, casting an otherworldly illumination on the dusty artifacts and worn carvings that line the passageway.

Upon closer inspection, you notice that the light is coming from a series of intricate, glowing symbols etched into the ceiling above. These mystical runes pulse with a gentle, blue-green luminescence, their soft radiance casting an enchanting atmosphere throughout the temple's interior.

The air inside the temple feels heavy with age and mystery, yet the soothing light from the celestial symbols imbues the space with a sense of peace and ancient wisdom. As you venture deeper into the labyrinthine structure, the glow seems to grow stronger, guiding you towards secrets that have lain dormant for centuries within the sacred halls.
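
When calling the model through a runtime that does not apply the chat template for you, turns like the one above must be wrapped in the LLaMA 3 Instruct prompt format. A minimal formatter; the system prompt is an illustrative example, not the one used in Vocalis:

```python
def format_llama3_prompt(messages):
    """Render a message list into the LLaMA 3 Instruct prompt format."""
    prompt = "<|begin_of_text|>"
    for msg in messages:
        prompt += (
            f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n"
            f"{msg['content']}<|eot_id|>"
        )
    # Open the assistant header to cue the model to generate its reply.
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt

prompt = format_llama3_prompt([
    {"role": "system", "content": "You are an immersive fantasy narrator."},
    {"role": "user",
     "content": "You enter the crumbling temple. Where's the light coming from?"},
])
```

Frontends such as those listed below usually apply this template automatically from the GGUF metadata, so manual formatting is only needed for custom clients.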


📎 Metadata

  • Format: GGUF
  • Quantisation: Q4_K_M (4-bit)
  • Parameters: 8.03B
  • Architecture: llama
  • Optimised For: Real-time inference (Vocalis-compatible)
  • Recommended Frontends: OpenWebUI, KoboldCPP, LM Studio, or custom web clients (e.g., Flask/Vite apps)
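
The quantisation above implies a rough weights-only memory footprint. As a back-of-the-envelope estimate, assuming Q4_K_M averages roughly 4.8 bits per weight (the exact figure varies by tensor) and the 8.03B parameter count reported for this model:

```python
params = 8.03e9         # parameter count reported for this model
bits_per_weight = 4.8   # assumed average for Q4_K_M quantisation

size_gib = params * bits_per_weight / 8 / 1024**3
print(f"~{size_gib:.1f} GiB for weights alone")  # KV cache and runtime overhead extra
```

This keeps the weights within reach of consumer GPUs and laptops, which is what makes the model practical for the real-time inference target above.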

🔗 Project Link

🧪 This model was fine-tuned specifically for the Vocalis webapp project (https://github.com/Lex-au/Vocalis), a low-latency, voice-enabled AI assistant platform.


📖 Citation & Attribution

The base model used for fine-tuning is Meta’s LLaMA 3 8B Instruct. For licensing, refer to Meta's license terms.

This repository contains a quantised version optimised for low-latency use in the Vocalis real-time webapp.

If you use this model in your research or application, please cite:

@misc{llama3-8b-2024,
  author = {Meta AI},
  title = {LLaMA 3 8B Instruct},
  year = {2024},
  publisher = {Meta},
  howpublished = {\url{https://ai.meta.com/llama/}}
}

@misc{vocalis-roleplay-2025,
  author = {Lex-au},
  title = {Vocalis-Q4_K_M: Fine-Tuned LLaMA 3 for Conversational Roleplay},
  note = {Low-latency roleplay model optimised for use in the Vocalis webapp},
  year = {2025},
  howpublished = {\url{https://github.com/Lex-au/Vocalis}}
}