---
license: mit
tags:
- pytorch
- gpt2
- text-generation
- fin-ai
- experimental
- in-training
- from-scratch
- automated-training
language:
- en
datasets:
- wikitext
- roneneldan/TinyStories
- openai/gsm8k
- squad
- imdb
- ag_news
- yelp_review_full
- cnn_dailymail
- billsum
- commonsense_qa
- hellaswag
- winogrande
- boolq
- race
- stanfordnlp/coqa
- allenai/c4
- Skylion007/openwebtext
- trivia_qa
- hotpot_qa
- microsoft/ms_marco
- duorc
- amazon_polarity
- zeroshot/twitter-financial-news-sentiment
- sciq
- quail
- wiki_qa
- paws
- medical_questions_pairs
- app_reviews
- rotten_tomatoes
metrics:
- perplexity
library_name: pytorch
pipeline_tag: text-generation
---

# EXPERIMENTAL MODEL - Training from scratch

[GitHub](https://github.com/MeridianAlgo/FinAI) • [Training Logs](https://wandb.ai/meridianalgo-meridianalgo/fin-ai) • [Report Issue](https://github.com/MeridianAlgo/FinAI/issues)

## Important Notice

This model is being trained from scratch, and its outputs will be gibberish initially.

- Brand new model: starting from random weights
- Training time needed: 2-4 weeks for basic coherence
- Automated training: every 1 hour 10 minutes via GitHub Actions
- Current quality: expect complete nonsense initially
- Purpose: research/experimental continuous learning

## Model Overview

| Specification | Value |
|---|---|
| Architecture | GPT-2 style Transformer |
| Parameters | 30,142,848 (~30M) |
| Layers | 6 |
| Attention Heads | 6 |
| Embedding Dimension | 384 |
| Feed-Forward Dimension | 1,536 |
| Max Sequence Length | 512 tokens |
| Vocabulary Size | 50,257 (GPT-2 tokenizer) |
| Position Encoding | Rotary (RoPE) |
| Activation | GELU |
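
The specification table maps onto a small set of hyperparameters. The sketch below shows how such a GPT-2-style decoder configuration could be expressed in Python; `FinAIConfig` and its field names are illustrative stand-ins, not the actual classes in the FinAI codebase.

```python
# Hypothetical config mirroring the specification table; not the project's actual class.
from dataclasses import dataclass

@dataclass
class FinAIConfig:
    vocab_size: int = 50_257   # GPT-2 tokenizer vocabulary
    max_seq_len: int = 512     # maximum context length in tokens
    n_layers: int = 6          # transformer decoder blocks
    n_heads: int = 6           # attention heads per block
    embed_dim: int = 384       # model / embedding width
    ff_dim: int = 1_536        # feed-forward hidden size (4 x embed_dim)
    rope: bool = True          # rotary position embeddings (RoPE) instead of learned ones
    activation: str = "gelu"

config = FinAIConfig()
assert config.embed_dim % config.n_heads == 0  # head_dim = 384 / 6 = 64
```
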

## Training Details

### Training Schedule

- Frequency: Every 1 hour 10 minutes (~20 cycles/day)
- Steps per cycle: 800 steps
- Daily steps: ~16,500 steps
- Weekly steps: ~115,200 steps
- Batch size: 8 (effective: 32 with gradient accumulation)
- Learning rate: 3e-4 with cosine decay
- Warmup steps: 100
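
These hyperparameters correspond to a fairly standard PyTorch training step. The sketch below is illustrative rather than the project's actual loop: it assumes a `model` whose forward pass returns logits of shape `(batch, seq, vocab)` and a `dataloader` yielding `(8, 512)` token batches (both assumptions), and uses `get_cosine_schedule_with_warmup` from `transformers` for the warmup plus cosine decay.

```python
# Illustrative training step matching the listed hyperparameters (not the project's code).
import torch
import torch.nn.functional as F
from transformers import get_cosine_schedule_with_warmup

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=0.01)
scheduler = get_cosine_schedule_with_warmup(
    optimizer, num_warmup_steps=100, num_training_steps=800
)

accum_steps = 4  # micro-batch 8 x 4 accumulation steps = effective batch size 32
model.train()
for step, input_ids in enumerate(dataloader):   # input_ids: (8, 512) token IDs (assumed)
    logits = model(input_ids)                   # assumed forward signature
    loss = F.cross_entropy(                     # next-token prediction loss
        logits[:, :-1].reshape(-1, logits.size(-1)),
        input_ids[:, 1:].reshape(-1),
    )
    (loss / accum_steps).backward()
    if (step + 1) % accum_steps == 0:
        # Example clipping threshold; the card only mentions gradient-norm monitoring.
        torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
        optimizer.step()
        scheduler.step()
        optimizer.zero_grad()
```
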

### Training Infrastructure

- Platform: GitHub Actions (free tier)
- Hardware: CPU only
- Training time: ~15-20 minutes per cycle
- Automatic upload: To Hugging Face after each cycle
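
The per-cycle upload can be done with the `huggingface_hub` client. A minimal sketch, assuming the checkpoint files are written to `./fin_ai_model/` and that a write token is available to the workflow (for example via an `HF_TOKEN` secret); neither detail is confirmed by this card.

```python
# Hypothetical upload step run at the end of each training cycle.
from huggingface_hub import HfApi

api = HfApi()  # picks up the token from the environment or a cached login
for filename in ("model.pt", "config.json"):
    api.upload_file(
        path_or_fileobj=f"./fin_ai_model/{filename}",
        path_in_repo=filename,
        repo_id="MeridianAlgo/Fin.AI",
        commit_message="Automated training cycle checkpoint",
    )
```
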

### Datasets (30 total, rotating per cycle)

The model trains on a diverse set of 30 datasets, cycling through one dataset per training cycle; a minimal rotation sketch follows the category list below.

**Knowledge & Reference**
- WikiText-2, OpenWebText, C4

**Creative Writing**
- TinyStories

**News & Articles**
- CNN/DailyMail, AG News, BillSum

**Question Answering**
- SQuAD, CoQA, TriviaQA, HotpotQA, MS MARCO, WikiQA, QuAIL

**Reasoning & Logic**
- GSM8K (math), CommonsenseQA, HellaSwag, WinoGrande, BoolQ

**Reading Comprehension**
- RACE, DuoRC

**Reviews & Sentiment**
- IMDB, Yelp, Amazon Polarity, Rotten Tomatoes, App Reviews

**Scientific & Medical**
- SciQ, Medical Question Pairs

**Financial**
- Twitter Financial News Sentiment

**Paraphrase & Similarity**
- PAWS
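
One simple way to implement such a rotation, shown purely as a sketch rather than the FinAI workflow's actual selection logic, is to index the dataset list by the current cycle number (the list here is truncated for brevity):

```python
# Hypothetical dataset rotation: pick one dataset per training cycle.
import time
from datasets import load_dataset

DATASETS = [
    ("wikitext", "wikitext-2-raw-v1"),
    ("roneneldan/TinyStories", None),
    ("imdb", None),
    # ... remaining entries from the 30-dataset list above
]

cycle_index = int(time.time() // (70 * 60))           # one cycle every 70 minutes
name, config = DATASETS[cycle_index % len(DATASETS)]  # rotate through the full list
dataset = load_dataset(name, config, split="train")
```
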

## Training Progress

### Current Status

- Version: v2.0.0
- Training started: December 28, 2024
- Model type: fresh_init
- Total parameters: 30,142,848

### Expected Timeline

| Week | Expected Quality | Description |
|---|---|---|
| 1 | Gibberish | Random weights, no coherence |
| 2 | Patterns | Some token patterns emerging |
| 3-4 | Basic | Simple word sequences |
| 5-8 | Improving | Short coherent phrases |
| 9-12 | Decent | Usable for simple tasks |

### Monitoring

- GitHub Actions: [View Training Runs](https://github.com/MeridianAlgo/FinAI/actions)
- Wandb Dashboard: [View Metrics](https://wandb.ai/meridianalgo-meridianalgo/fin-ai)
- Model Updates: This page updates automatically

## Usage

### Installation

```bash
pip install torch transformers huggingface-hub
```

### Download Model

```python
from huggingface_hub import hf_hub_download
import os

# Create a local directory for the checkpoint
os.makedirs("./fin_ai_model", exist_ok=True)

# Download model files from the Hub
hf_hub_download("MeridianAlgo/Fin.AI", "model.pt", local_dir="./fin_ai_model")
hf_hub_download("MeridianAlgo/Fin.AI", "config.json", local_dir="./fin_ai_model")
```

### Generate Text (Experimental)

```python
import torch
from transformers import AutoTokenizer

from fin_ai.model import FinAIModel

# Load model
model = FinAIModel.from_pretrained("./fin_ai_model")
model.eval()

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Generate text (expect poor quality initially)
input_text = "Once upon a time"
input_ids = tokenizer.encode(input_text, return_tensors="pt")

with torch.no_grad():
    output = model.generate(
        input_ids,
        max_length=50,
        temperature=0.8,
        top_p=0.9,
        do_sample=True,
    )

generated_text = tokenizer.decode(output[0])
print(generated_text)

# Note: output quality is poor initially and improves over weeks of training
```

## Technical Details

### Architecture Improvements (v2.0)

Compared to v1.x:
- 3x more parameters (10M → 30M)
- Deeper architecture (4 → 6 layers)
- Larger embeddings (256 → 384 dimensions)
- More attention heads (4 → 6)
- More training steps per cycle (600 → 800)

### Training Configuration

```yaml
model:
  size_preset: "small"
  n_layers: 6
  n_heads: 6
  embed_dim: 384
  ff_dim: 1536
  max_seq_len: 512

training:
  batch_size: 8
  gradient_accumulation_steps: 4
  learning_rate: 3.0e-4
  weight_decay: 0.01
  warmup_steps: 100
  max_steps: 800
```
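
If this configuration is stored as a YAML file, it can be loaded with PyYAML as sketched below; the filename `config.yaml` and the PyYAML dependency are assumptions for illustration.

```python
# Illustrative: load the training configuration from a YAML file (filename assumed).
import yaml

with open("config.yaml") as f:
    cfg = yaml.safe_load(f)

print(cfg["model"]["n_layers"])      # 6
print(cfg["training"]["max_steps"])  # 800
```
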

## Evaluation

### Metrics Tracked

- Training Loss: Cross-entropy loss
- Perplexity: exp(loss)
- Tokens/Second: Training throughput
- Learning Rate: Cosine schedule with warmup
- Gradient Norm: For stability monitoring
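
Perplexity follows directly from the mean cross-entropy loss. A minimal sketch of the computation (the function below is illustrative, not the project's logging code):

```python
# Illustrative metric computation: perplexity = exp(mean cross-entropy loss).
import torch
import torch.nn.functional as F

def perplexity(logits: torch.Tensor, targets: torch.Tensor) -> float:
    # logits: (batch, seq_len, vocab_size), targets: (batch, seq_len) token IDs
    loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        targets.reshape(-1),
    )
    return torch.exp(loss).item()
```
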

### Benchmarks (Coming Soon)

Once the model reaches basic coherence, we'll evaluate on:
- HellaSwag (common sense)
- LAMBADA (reading comprehension)
- WikiText perplexity
- Custom generation quality tests

## Limitations

- Early Training: Model is in very early training stages
- Output Quality: Expect gibberish for several weeks
- CPU Training: Slower than GPU training
- Small Model: 30M parameters is relatively small
- Limited Context: 512 token context window
- No Fine-tuning: Base model only, not instruction-tuned
- English Only: Trained primarily on English text

## Contributing

This is an open research project! Contributions welcome:

- Code: [GitHub Repository](https://github.com/MeridianAlgo/FinAI)
- Issues: [Report Problems](https://github.com/MeridianAlgo/FinAI/issues)
- Discussions: Join Discussion

## License

MIT License - See LICENSE

## Links

- Repository: https://github.com/MeridianAlgo/FinAI
- Training Logs: https://wandb.ai/meridianalgo-meridianalgo/fin-ai
- GitHub Actions: https://github.com/MeridianAlgo/FinAI/actions
- Issues: https://github.com/MeridianAlgo/FinAI/issues
Last Updated: 2025-12-28 17:54 UTC
Status: Training from Scratch
Quality: Expect gibberish (2-4 weeks needed)