ForesightLM Core DistilGPT-2
This repository contains the Core ForesightLM checkpoint based on distilgpt2.
ForesightLM studies whether a token-level autoregressive language model can acquire sentence-level foresight through an auxiliary sentence-boundary future semantic objective. The model preserves standard next-token generation while adding a learned projection head for future sentence embedding prediction.
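Because standard next-token generation is preserved, the checkpoint can presumably be loaded like any other causal language model. The following is a minimal sketch with Transformers, not an official usage recipe. It assumes the repository stores weights in the usual distilgpt2 layout; whether `AutoModelForCausalLM` also loads the future-embedding projection head is not stated in this card, and it is likely ignored. `generate_text` is a hypothetical helper name, and the sampling temperature of 0.5 is only an illustrative choice.

```python
MODEL_ID = "Mandotosh/foresightlm-core-distilgpt2"


def generate_text(prompt: str, max_new_tokens: int = 40) -> str:
    """Sample a continuation using only the standard next-token head."""
    # Imported lazily so this module can be imported without the heavy dependency.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=0.5,  # illustrative value, not a recommended setting
        pad_token_id=tokenizer.eos_token_id,  # GPT-2 tokenizers define no pad token
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("Once upon a time,")` downloads the checkpoint on first use and returns the prompt followed by a sampled continuation.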
Model components
- Base language model: distilgpt2
- Core checkpoint: ForesightLM seed 42
- Sentence encoder used during training/evaluation: sentence-transformers/all-MiniLM-L6-v2
- Future objective: sentence-boundary contrastive future embedding prediction
- Future-loss weight: lambda_future = 0.08
- Contrastive temperature: tau = 0.07
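To make the roles of lambda_future and tau concrete, here is a small pure-Python sketch of how such an objective could be combined with the language-modeling loss. This card only states that the future objective is contrastive, weighted by 0.08, with temperature 0.07. The InfoNCE form with in-batch negatives, the use of cosine similarity, and the additive combination with the next-token loss are assumptions made for illustration, and all function names are hypothetical. The actual training code lives in the project repository.

```python
import math

LAMBDA_FUTURE = 0.08  # future-loss weight listed in this card
TAU = 0.07            # contrastive temperature listed in this card


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def info_nce(predicted, targets, tau=TAU):
    """Assumed contrastive loss: predicted[i] should match targets[i],
    with every other target in the batch acting as a negative."""
    total = 0.0
    for i, p in enumerate(predicted):
        logits = [cosine(p, t) / tau for t in targets]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_z - logits[i]  # cross-entropy with the positive at index i
    return total / len(predicted)


def total_loss(lm_loss, predicted, targets):
    """Assumed combination: next-token loss plus the weighted future loss."""
    return lm_loss + LAMBDA_FUTURE * info_nce(predicted, targets)
```

With a temperature as low as 0.07, similarity differences are amplified by a factor of about 14 before the softmax, so a mismatched positive is penalized heavily, while the small weight of 0.08 keeps the auxiliary term from dominating the next-token loss.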
Intended use
This checkpoint is intended for research on:
- autoregressive language modeling
- sentence-level semantic planning
- discourse coherence diagnostics
- semantic reranking
- future-representation calibration
Important limitations
This model is a small research prototype. It should not be treated as a production-quality text generator.
Automatic metrics show that semantic reranking is a strong component by itself. Foresight training improves several diagnostics but does not uniformly dominate a reranked baseline. Direct future-head reranking exposes a calibration gap.
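As a rough illustration of what embedding-based reranking involves, the sketch below orders candidate continuations by cosine similarity to a target embedding. It is a simplified stand-in, not the project's evaluation code. In direct future-head reranking the target would presumably be the embedding predicted by the projection head, and candidate embeddings would come from the sentence encoder; here both are hand-written toy vectors, and `rerank` is a hypothetical helper name.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def rerank(candidates, target_embedding):
    """Return (text, score) pairs sorted by similarity to the target, best first.

    `candidates` is a list of (text, embedding) pairs.
    """
    scored = [(text, cosine(emb, target_embedding)) for text, emb in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data: two-dimensional vectors stand in for real sentence embeddings.
toy_candidates = [
    ("candidate A", [1.0, 0.0]),
    ("candidate B", [0.6, 0.8]),
    ("candidate C", [0.0, 1.0]),
]
toy_target = [0.0, 1.0]
ranking = rerank(toy_candidates, toy_target)
```

In this toy case the ranking places candidate C first, then B, then A. If the target embedding is poorly calibrated, the same procedure can rank a weaker candidate first, which is one way a calibration gap could show up in practice.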
Human evaluation protocol files are released in the GitHub repository, but human judgments are still being collected and will be added in a later revision.
Reproducibility
Code, SLURM scripts, evaluation summaries, compute-cost accounting, bootstrap confidence intervals, qualitative examples, and reproducibility manifests are available at:
https://github.com/Ahmet2001/foresightLM
Large generation JSONL files and training data are not included in this model repository.
Citation
If you use this checkpoint, please cite the ForesightLM project repository until a paper DOI/arXiv identifier is available.