Infactory Pulse 1 (4B)
Pulse 1 is a narrative intelligence model for extracting structured storylines from news and editorial content.
What It Does
Given an article, Pulse 1 extracts storylines — not just topics or keywords, but structured narratives that answer:
- WHO is acting (actor)
- WHAT they did (action)
- TO WHOM (target)
- WITH WHAT CONSEQUENCE (outcome)
Each storyline includes a concise name, a one-sentence why explanation, and a full
Semantic Role Labeling (SRL) decomposition in machine-readable JSON.
Why Storylines?
A storyline is more than a topic. Topics are static categories ("economy", "politics"). Storylines are dynamic narratives that capture cause-and-effect, developing situations, and active debates.
Example:
- Topic: "Federal Reserve"
- Storyline: "Federal Reserve interest rate increases slow U.S. housing market activity"
The storyline captures the actor (Fed), action (rate increases), target (housing market), and outcome (slowdown) — information you can act on.
Model Details
| Property | Value |
|---|---|
| Base model | Gemma 3 4B (text-only) |
| Architecture | Gemma 3 (34 transformer layers) |
| Format | MLX safetensors |
| Precision | bfloat16 |
| Context window | 4,096 tokens |
| Size | ~8.5 GB |
Output Format
The model returns a JSON object with a topics array. Each storyline has six fields:
{
  "topics": [
    {
      "name": "Federal Reserve interest rate increases slow U.S. housing market activity",
      "why": "The Fed's tightening cycle is making mortgages unaffordable for first-time buyers.",
      "actor": "Federal Reserve",
      "action": "increases interest rates",
      "target": "U.S. housing market",
      "outcome": "mortgage applications fall to decade lows"
    }
  ]
}
| Field | Description |
|---|---|
| name | A specific narrative phrase (6+ words) with concrete actor, action, and outcome |
| why | One-sentence summary of why this storyline matters |
| actor | The primary entity taking action (use full names, not abbreviations) |
| action | What they did or are doing |
| target | Who or what is affected |
| outcome | The consequence or result |
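Downstream code usually wants to check that each storyline carries all six fields before using it. The following is a minimal sketch of such a check, not part of the model's tooling: the `Storyline` class and `parse_storylines` function are hypothetical names, and the step that trims text around the JSON object is an assumption about how a model reply might look, not documented behavior.

```python
import json
from dataclasses import dataclass

# The six field names come from the table above; everything else is illustrative.
FIELDS = ("name", "why", "actor", "action", "target", "outcome")


@dataclass(frozen=True)
class Storyline:
    name: str
    why: str
    actor: str
    action: str
    target: str
    outcome: str


def parse_storylines(raw: str) -> list[Storyline]:
    """Parse a model reply into Storyline objects, skipping malformed entries."""
    # Assumption: the reply may contain text around the JSON object, so keep
    # only the span between the first "{" and the last "}".
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return []
    try:
        data = json.loads(raw[start : end + 1])
    except json.JSONDecodeError:
        return []
    topics = data.get("topics")
    if not isinstance(topics, list):
        return []
    storylines = []
    for item in topics:
        # Keep an entry only if every field is a non-empty string.
        if isinstance(item, dict) and all(
            isinstance(item.get(f), str) and item[f].strip() for f in FIELDS
        ):
            storylines.append(Storyline(**{f: item[f].strip() for f in FIELDS}))
    return storylines
```

Entries with a missing or empty field are dropped silently here; a stricter pipeline might log or reject the whole reply instead.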
Quick Start
With Ollama (recommended on Apple Silicon)
# Import the model (from this directory after downloading weights)
ollama create infactory_pulse1_4b -f Modelfile
ollama run infactory_pulse1_4b
Or use the OpenAI-compatible API:
curl -s http://localhost:11434/v1/chat/completions -d '{
  "model": "infactory_pulse1_4b",
  "messages": [
    {"role": "user", "content": "Identify the key storylines discussed in this article.\n\nArticle:\nTitle: Fed Cuts Rates\nText: The Federal Reserve cut interest rates by 25 basis points today, citing slowing inflation and a cooling labor market."}
  ]
}'
Ollama runs Gemma 3 natively on Apple Silicon (MLX backend) and on CUDA for
NVIDIA hardware. The Modelfile sets num_ctx=4096, temperature=0.3,
top_p=0.9, and the appropriate Gemma 3 chat template.
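The same request can be issued from Python. This is a minimal sketch using only the standard library, assuming an Ollama server is listening at the address used in the curl example. The function names `build_payload` and `extract_storylines` and the timeout value are illustrative, not part of the model's tooling.

```python
import json
import urllib.request

# Endpoint, model name, and instruction text are taken from the curl example.
OLLAMA_URL = "http://localhost:11434/v1/chat/completions"
MODEL_NAME = "infactory_pulse1_4b"
INSTRUCTION = "Identify the key storylines discussed in this article."


def build_payload(title: str, text: str) -> dict:
    """Build a chat-completions request body in the prompt layout shown above."""
    content = f"{INSTRUCTION}\n\nArticle:\nTitle: {title}\nText: {text}"
    return {"model": MODEL_NAME, "messages": [{"role": "user", "content": content}]}


def extract_storylines(title: str, text: str, timeout: float = 120.0) -> str:
    """POST the article to the local server and return the raw reply text."""
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(title, text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # OpenAI-style responses carry the reply in choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

The returned string is the model's raw reply, which should then be parsed as the JSON object described under Output Format.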
With mlx_lm
For direct MLX inference:
pip install mlx-lm
python -m mlx_lm.generate \
  --model ./ \
  --prompt 'Identify the key storylines discussed in this article. ...'
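For use inside a Python program rather than from the shell, a rough equivalent can be sketched with the `load` and `generate` functions from mlx-lm. It assumes the weights sit in the current directory, as in the command above. The helper names and the `max_tokens` value are illustrative choices, and the prompt layout follows the request shown in the Ollama example.

```python
INSTRUCTION = "Identify the key storylines discussed in this article."


def build_messages(title: str, text: str) -> list[dict]:
    """Wrap an article in a single user turn for the chat template."""
    content = f"{INSTRUCTION}\n\nArticle:\nTitle: {title}\nText: {text}"
    return [{"role": "user", "content": content}]


def generate_storylines(title: str, text: str, model_path: str = "./",
                        max_tokens: int = 512) -> str:
    """Run one extraction with mlx-lm and return the raw reply text."""
    # Imported lazily so this module still loads on machines without MLX.
    from mlx_lm import load, generate

    model, tokenizer = load(model_path)
    # Apply the Gemma chat template and open the model's turn.
    prompt = tokenizer.apply_chat_template(
        build_messages(title, text), add_generation_prompt=True
    )
    # max_tokens=512 is an assumed budget, not a documented recommendation.
    return generate(model, tokenizer, prompt=prompt, max_tokens=max_tokens)
```

Loading the model on every call is slow; a long-running service would load once and reuse `model` and `tokenizer`.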
Intended Use
Pulse 1 is designed for narrative intelligence workflows where you need to understand not just what an article is about, but what is happening — the actors, actions, and consequences.
Content Intelligence Pipelines
- Storyline extraction — Convert unstructured articles into structured narrative data
- Salience scoring — Score sentences against extracted storylines to find the most relevant passages
- Entity resolution — Ground storylines in detected entities for richer metadata
- Semantic search — Index and retrieve content by narrative dimensions (actor, action, target, outcome)
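Retrieval by narrative dimension can be illustrated with a toy inverted index keyed on the four SRL fields. This is a sketch under simplifying assumptions, not the method Infactory uses: it matches lowercase whitespace tokens exactly, where a production system would more likely use embeddings or a search engine. `StorylineIndex` is a hypothetical name.

```python
from collections import defaultdict

DIMENSIONS = ("actor", "action", "target", "outcome")


class StorylineIndex:
    """Toy inverted index from (dimension, token) to article IDs."""

    def __init__(self) -> None:
        self._postings: dict[tuple[str, str], set[str]] = defaultdict(set)

    def add(self, article_id: str, storyline: dict) -> None:
        """Index one extracted storyline under each of its four dimensions."""
        for dim in DIMENSIONS:
            for token in storyline.get(dim, "").lower().split():
                self._postings[(dim, token)].add(article_id)

    def search(self, **query: str) -> set[str]:
        """Return IDs of articles matching every token in every queried dimension."""
        results = None
        for dim, phrase in query.items():
            for token in phrase.lower().split():
                hits = self._postings.get((dim, token), set())
                results = set(hits) if results is None else results & hits
        return results or set()
```

A call such as `index.search(actor="federal reserve", target="housing")` returns only articles whose storylines match both constraints.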
Media Monitoring & Analytics
- Narrative tracking — Monitor how storylines evolve across publications over time
- Trend detection — Identify emerging storylines by aggregating across article streams
- Brand-adjacent content discovery — Find articles whose narratives align with brand themes
- Competitive intelligence — Track storylines mentioning specific companies, products, or people
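As one way to picture trend detection, the sketch below compares how often each (actor, target) pair appears in two time windows of extracted storylines. The function name `emerging_pairs` and both thresholds are illustrative assumptions. It also assumes actor and target strings match exactly across articles, which real data would need normalization to guarantee.

```python
from collections import Counter


def emerging_pairs(previous, current, min_count=2, growth=2.0):
    """Return (actor, target) pairs whose mention count grew between two windows.

    `previous` and `current` are lists of storyline dicts with at least
    "actor" and "target" keys. A pair is reported when it appears at least
    `min_count` times in `current` and at least `growth` times as often as
    in `previous` (unseen pairs are treated as having one prior mention).
    """
    before = Counter((s["actor"], s["target"]) for s in previous)
    after = Counter((s["actor"], s["target"]) for s in current)
    emerging = []
    for pair, count in after.most_common():
        if count >= min_count and count >= growth * max(before[pair], 1):
            emerging.append((pair, count))
    return emerging
```

Results come back ordered by current-window frequency, so the most heavily covered emerging pair is first.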
Editorial & Publishing Workflows
- Automated tagging — Generate structured metadata for content management systems
- Pull quote extraction — Score article sentences to surface the most quotable passages
- Evergreen content discovery — Find archival articles newly relevant to today's storylines
- Newsletter curation — Cluster and summarize articles by shared narratives
Research & Analysis
- Narrative framing analysis — Study how different publications frame the same events
- Discourse mapping — Understand the actors and relationships in a topic area
- Information extraction — Build structured datasets from news corpora
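Building a structured dataset from a corpus can be as simple as flattening each article's extracted storylines into rows. This is a minimal sketch assuming each record pairs an article ID with the parsed JSON output. `storylines_to_csv` and the `article_id` column are hypothetical names introduced for illustration.

```python
import csv
import io

# Column order follows the six fields in the output format.
FIELDS = ["name", "why", "actor", "action", "target", "outcome"]


def storylines_to_csv(records) -> str:
    """Flatten (article_id, model_output) pairs into CSV text, one row per storyline."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["article_id", *FIELDS], lineterminator="\n"
    )
    writer.writeheader()
    for article_id, output in records:
        for storyline in output.get("topics", []):
            # Missing fields become empty cells instead of raising an error.
            row = {field: storyline.get(field, "") for field in FIELDS}
            writer.writerow({"article_id": article_id, **row})
    return buffer.getvalue()
```

Writing to a file instead only requires passing an open file handle to `csv.DictWriter` in place of the in-memory buffer.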
Out-of-Scope Use
- General-purpose chat or instruction following
- Languages other than English
- Domains outside news and editorial content
- Real-time or safety-critical applications
Files
| File | Description |
|---|---|
| model-00001-of-00002.safetensors | Model weights (shard 1/2) |
| model-00002-of-00002.safetensors | Model weights (shard 2/2) |
| model.safetensors.index.json | Weight shard index |
| config.json | Model architecture configuration |
| tokenizer.json | Fast tokenizer |
| tokenizer_config.json | Tokenizer settings |
| chat_template.jinja | Chat template (Gemma turn format) |
| Modelfile | Ollama-compatible model definition |
License
This model is a fine-tuned derivative of Google's Gemma 3 and inherits the Gemma Terms of Use.
- Permitted: Research, personal use, and commercial use in products and services.
- Required: You must agree to the Gemma Terms of Use before downloading or using the weights. Redistributions must include the license terms.
- Prohibited: Use for unlawful purposes, generating harmful content, or circumventing safety filters. The Gemma Prohibited Use Policy applies.
- Attribution: Derivative models must acknowledge Gemma as the base model.
The Infactory-specific fine-tuning is proprietary to Infactory AI.