Use from the llama-cpp-python library
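# !pip install llama-cpp-python huggingface-hub

from llama_cpp import Llama

# Download the GGUF file from the Hugging Face Hub (needs huggingface-hub) and load it
llm = Llama.from_pretrained(
	repo_id="Rohanify/HinTexta-34M-GGUF",
	filename="Hintexta-Hf-34M-F16.gguf",
)

# create_chat_completion returns an OpenAI-style response dict
response = llm.create_chat_completion(
	messages = [
		{
			"role": "user",
			"content": "What is the capital of France?"
		}
	]
)

# Print only the generated reply text
print(response["choices"][0]["message"]["content"])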
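
The model's context window is 512 tokens, so a long chat history has to be trimmed before it is passed to create_chat_completion. The following is a minimal sketch of one way to do that, not part of the model's own tooling: the helper name trim_history, the 128-token reply reserve, and the word-based stand-in counter are all assumptions made for illustration.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def trim_history(
    messages: List[Message],
    count_tokens: Callable[[str], int],
    max_tokens: int = 512,          # context window stated in the model card
    reserve_for_reply: int = 128,   # assumption: room left for the generated reply
) -> List[Message]:
    """Keep the most recent messages whose combined token count fits the budget."""
    budget = max_tokens - reserve_for_reply
    kept: List[Message] = []
    used = 0
    # Walk backwards so the newest turns are kept first.
    for message in reversed(messages):
        cost = count_tokens(message["content"])
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


# Stand-in counter for illustration only: one "token" per whitespace-separated word.
def word_count(text: str) -> int:
    return len(text.split())


history = [
    {"role": "user", "content": "Bhai kya haal hai?"},
    {"role": "assistant", "content": "Bahut accha!"},
    {"role": "user", "content": "Movie dekhne chalein?"},
]
# Tiny budget so the trimming is visible: only the last two messages fit.
trimmed = trim_history(history, word_count, max_tokens=10, reserve_for_reply=5)
print(trimmed)
```

With a loaded llama-cpp-python model, a real counter could be `lambda s: len(llm.tokenize(s.encode("utf-8")))`. Tokens added by the chat template are not counted here, so the reserve should be treated as approximate.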

HinTexta-34M: Hinglish Chat Model

A 34M-parameter GPT-2 model trained from scratch on 1M Hinglish conversations. It speaks naturally in Hinglish (a mix of Hindi and English).

Quick Start (Ollama)

ollama run hf.co/Rohanify/HinTexta-34M-GGUF:F16

No further setup is needed once Ollama is installed.
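
Once the model has been pulled, it can also be queried from a script through Ollama's local HTTP API. The sketch below assumes an Ollama server on its default address (http://localhost:11434) and uses the /api/chat endpoint with streaming turned off. The helper names build_chat_payload and chat are illustrative, not part of any library.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"
MODEL = "hf.co/Rohanify/HinTexta-34M-GGUF:F16"


def build_chat_payload(user_text: str, model: str = MODEL) -> dict:
    """Build a non-streaming chat request body for a single user message."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        "stream": False,
    }


def chat(user_text: str, url: str = OLLAMA_CHAT_URL) -> str:
    """Send one message to the local Ollama server and return the reply text."""
    body = json.dumps(build_chat_payload(user_text)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        reply = json.load(response)
    # A non-streaming /api/chat response carries the reply under "message".
    return reply["message"]["content"]
```

With the server running, `print(chat("Bhai kya haal hai?"))` should print the model's reply. Nothing is sent until chat is called.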

Details

  • Architecture: GPT-2 (8 layers, 512 dim, 8 heads)
  • Parameters: 34M
  • Training data: 1M synthetic Hinglish conversations (Abhishekcr448/Hinglish-Everyday-Conversations-1M)
  • Tokenizer: Custom 16K vocab ByteLevel BPE trained on Hinglish
  • Context window: 512 tokens
  • Trained from scratch on a single RTX 5080
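
As a sanity check, the parameter count can be roughly reproduced from the figures above. This sketch assumes the standard GPT-2 block layout (fused QKV projection, 4x MLP expansion, biases, two LayerNorms per block), input and output embeddings that share weights, and that "16K vocab" means 16 × 1024 = 16,384 entries. None of these are confirmed by the card. The number of heads does not affect the count.

```python
def gpt2_param_count(
    n_layer: int,
    d_model: int,
    vocab_size: int,
    n_ctx: int,
    tied_embeddings: bool = True,
) -> int:
    """Count the parameters of a standard GPT-2 style decoder."""
    token_emb = vocab_size * d_model
    pos_emb = n_ctx * d_model
    # Per block: two LayerNorms (2d each), fused QKV projection (3d^2 + 3d),
    # attention output projection (d^2 + d), and a 4x MLP (8d^2 + 5d).
    per_block = 12 * d_model**2 + 13 * d_model
    final_norm = 2 * d_model
    total = token_emb + pos_emb + n_layer * per_block + final_norm
    if not tied_embeddings:
        total += vocab_size * d_model  # separate output projection
    return total


# Assumption: "16K vocab" is taken as 16 * 1024 = 16384 entries.
total = gpt2_param_count(n_layer=8, d_model=512, vocab_size=16 * 1024, n_ctx=512)
print(f"{total:,} parameters (~{total / 1e6:.1f}M)")
```

Under these assumptions the count comes to 33,870,848, which is consistent with the stated 34M. About 8.4M of that sits in the token embedding table and about 25.2M in the eight transformer blocks.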

Example

You: Bhai kya haal hai?
Bot: Bahut accha! Lekin mujhe toh ek adventurous trip ki yaad hai, dost!

You: Movie dekhne chalein?
Bot: Movie night toh awesome hoga, par ek adventurous trip bhi zaroori hai!