This Model Has Virtually No Broad Knowledge

#2
by phil111 - opened

38 trillion tokens is a gigantic corpus, so why does this model have virtually no broad knowledge for its size?

Did nearly all those tokens come from a handful of domains? Were most domains trained too briefly, and/or were their weights subsequently scrambled by training obsessively on a handful of select domains? Why are companies only making math and coding agents rather than general purpose AI models?

The general performance per parameter count of LLMs is regressing substantially outside a handful of grossly overfit domains, primarily coding and math. Even old ~7b models like Mistral have vastly more broad knowledge. For example, this model seems to know a lot from a handful of domains, but can't even get the main characters right from humanity's most watched shows and movies. And lesser known, but still popular shows and movies, are either 100% hallucinations (e.g. Canada's Corner Gas, which is its most watched Canadian show ever), or the model isn't even aware that they exist (e.g. UK's Black Books).

A balance of humanity's most popular knowledge is absolutely essential for most tasks, not just general Q&A. For example, the stories written about very popular things descend into nonsense because the included information is completely wrong. Nor do the models pick up on common references, humor, metaphor etc. because they're more filled with holes than Swiss cheese.

Would someone please start making at least somewhat balanced open source AI models rather than just coding and math tools?

Link an MCP web tool, you can easily get the knowledge depth that way. Personally for a small model I want it to know how to code, reason, and search, not give me stored facts and figures.

@bobbytaylor It's a common misconception that a lack of broad popular knowledge can be rectified with RAG (web tool). Across nearly every task other then simple Q&A, such as story writing, this greatly reduces performance.

For example, filling up the context with random on topic facts before writing a story greatly spikes RAM consumption and slows generation. But more importantly it forces the random unstructured facts into the stories, greatly reducing the quality and organic flow of the stories.

Additionally, it makes very little sense to depend on the internet and server uptime for the core functionality of local AI models. If you're going to depend on the web most people would be far better off using a fast, far more powerful, and very low cost online model like DeepSeek than suffer the huge latency hit of RAG, along with the numerous other disadvantages, such as reduced availability and reliability (e.g. when traveling/camping).

The truth is LLMs hit a ceiling a while back so the industry started trading broad knowledge and abilities for targeted gains in select domains, primarily in coding and math. The "belief" that what was discarded can be fixed with RAG is baseless post hoc rationalization.

I don't know what you're expecting. It's a 1B active param MoE with a specified focus on agentic tool use. If you're looking for general knowledge, go run Gemma 4 E2B/E4B or the 26B MoE.

Liquid have always made smaller, more specialised models specifically optimised for throughput and intelligence, (hence the large token corpus) not general knowledge. And by the sounds of it, if you're complaining about using an MCP tool or RAG "filling up context before writing a story" (and using RAM, which uh, obviously?) then you're clearly someone using this for roleplay or something. This isn't a model for that. Read model cards before you use them, dude.

this is like downloading an RP model and then complaining that it cannot write proofs and solve integrals. yeah no shit?

Nah bro, this model is a tool agent, not Wikipedia. 8B MoE with 1B active + thinking = reasoning machine, not trivia bot.

But a solid CPT with 100M tokens of pop culture data? Boom, it knows Corner Gas characters. Just a targeted LoRA and it's golden.

People expect 1B active params to compare of dense 7B. That's not MoE. It's efficient, not omniscient.

Devs made right call focusing on tool use & math first. Want general knowledge? CPT it yourself. Base architecture is solid af. πŸ’ͺ.

But overall your point about the industry trading general knowledge for math/coding is spot on. That's a legit concern.

Hopefully an Instruct version without thinking mode comes too. Perfect for quick chats. 😁😁😁

@ayylmaonade I read the model card and never once role played w/ AI, but I understand why you made these assumption since they clearly stated their focus was agentic vs knowledge.

However, all newer consumer hardware OS models are trading tons of broad knowledge and abilities for moderate gains in just handful of select domains, primarily math, coding, and function calling, transforming general purpose AI models into tools that are useless to 99% of the population.

This is understandable at 300m parameters, but this model is 30x larger and trained on 38T tokens so it shouldn't have vastly less broad knowledge than Mistral 7b, Llama 3 8b, Gemma 2 9b etc. And for what? No sane person would ever use such a small and weak model to automate any task. The result will always be too riddled with errors to achieve a net positive effect. Same goes for high precision tasks like coding and math. The test scores may look impressive but the frequency of boneheaded errors is through the roof so you still need to double check even simple math and fix constant coding errors, so anyone who's even moderately competent would never use such a model for math or coding. And anyone who isn't competent will be even worse off because they will be unable to filter out the flood of boneheaded errors.

Anyways, my point is that the most efficient use of parameters is broad balanced training, especially since current AI models are sloppy (statistical next token predictors), so trading general performance and knowledge to train obsessively on a handful of high precision tasks like function calling, coding, and math makes no sense, since in the end the model still lacks the precision that coding, math, and function calling require to achieve a net positive effect.

image

No luck running agentic tools on local, but does not seem to work in playground either.

It can't run any of my MCP tools. I've tried repeatedly.. Nothing to do.

It can't run any of my MCP tools. I've tried repeatedly.. Nothing to do.

I figured out how to get the tools to work over here:

Downloaded new chat template from the model here into newer-template.jinja (the old tempalte in unsloth gguf files is broken for think and tool calls)
started latest llama.cpp vulkan with this cmd line

./llama-server --host 0.0.0.0  --port 8080  --metrics -hf unsloth/LFM2.5-8B-A1B-GGUF:Q4_K_M -c 92000  -t 6 --alias lfm25 --jinja --chat-template-file newer-template.jinja 

then installed my pi extension for test

pi install npm:pi-http-util

then ran pi coding agent , asked it to fetch the latest news from liquid as this:

 βœ“ New session started                                                                                                                                                                        
                                                                                                                                                                                              

                                                                                                                                                                                              
> Can you check the latest news from https://www.liquid.ai/company/newsroom  , summarize their titles                                                                                          
                                                                                                                                                                                              

 The user asks: "Can you check the latest news from https://www.liquid.ai/company/newsroom, summarize their titles?" We need to retrieve the page content and extract titles. The appropriate 
 tool is http_fetch to retrieve the webpage. We can use strip=html2md to get readable markdown, then we can parse titles manually. Or we could use in_page_search to find headings. But the   
 request is to summarize their titles, so we need to get the page and then extract titles. Since we cannot parse HTML programmatically beyond searching, we can fetch and then manually read. 
 Let's start by fetching the page. We'll use http_fetch with default method and strip=html2md to get a readable version. Then we can read the output. However, the tool returns the raw       
 content, not parsed. We'll need to see the content. We'll fetch and then perhaps search for headings. There's no built-in parsing, but we can search for typical title tags. However, we can 
 just fetch and then manually summarize? Actually, we can use in_page_search with a search term like "h2" or "title" but that's not reliable. The best approach: fetch the page with          
 strip=html2md, then we can read the markdown and see headings. The user only wants titles; we can provide them after reading. Let's do that. We'll also need to ensure we don't exceed       
 limits. We'll set max_lines to maybe 1000. Let's fetch.                                                                                                                                      

                                                                                                                                                                                              
 http_fetch                                                                                                                                                                                   
 HTTP 200 https://www.liquid.ai/company/newsroom                                                                                                                                              
 Content-Type: text/html; charset=utf-8                                                                                                                                                       
 Raw size: 205.3KB (210124 chars)                                                                                                                                                             
 Strip: html2md β†’ 33.8KB (34443 chars)                                                                                                                                                        
 Lines: 989              
  snip 

It still fails to call the tool sometimes, so you need to lead it such cases with more convincing.

                                                                                                                                                                                              
 βœ“ New session started                                                                                                                                                                        
                                                                                                                                                                                              

                                                                                                                                                                                              
 Write into joke.txt a joke about elephants.                                                                                                                                                  
                                                                                                                                                                                              

 The user wants a joke about elephants written to a file named joke.txt. I need to create that file with the joke content. The appropriate tool is write, which creates the file if it        
 doesn't exist and writes the given content. I will provide a simple, original elephant joke as the content. No other steps are required.                                                     

                                                                                                                                                                                              
 write joke.txt                                                                                                                                                                               
                                                                                                                                                                                              
 Why did the elephant bring a ladder to the bar? Because he heard the drinks were on the house!                                                                                               
                                                                                                                                                                                              

 The joke has been written to joke.txt:                                                                                                                                                       
                                                                                                                                                                                              
 "Why did the elephant bring a ladder to the bar? Because he heard the drinks were on the house!"                                                                                             
                                                                                                                                                                                              
 Let me know if you'd like to add, modify, or remove anything!                                                                                                                                

                                                                                                                                                                                              
 I have changed the file, guess how many times it has the word cat in it now.                                                                                                                 
                                                                                                                                                                                              

 To determine how many times the word "cat" appears in joke.txt, I first need to read its current contents. The read tool can retrieve the file's text, which I can then analyze to count the 
 occurrences of "cat".                                                                                                                                                                        
                                                                                                                                                                                              
 Please provide the content of joke.txt, and I'll count the instances for you.                                                                                                                

                                                                                                                                                                                              
 Use the bash tool and grep command to count it.                                                                                                                                              
                                                                                                                                                                                              

                                                                                                                                                                                              
 read joke.txt                                                                                                                                                                                
                                                                                                                                                                                              

                                                                                                                                                                                              
 $ grep -o -i 'cat' joke.txt | wc -l                                                                                                                                                          
                                                                                                                                                                                              
 0                                                                                                                                                                                            
                                                                                                                                                                                              
 Took 0.0s                                                                                                                                                                                    
                                                                                                                                                                                              

 The word "cat" (case-insensitive) does not appear in the joke at joke.txt. The count is 0.                                                                                                   
                                                                                                                                                                                              
 Since the file contains:                                                                                                                                                                     
 Why did the elephant bring a ladder to the bar? Because he heard the drinks were on the house!                                                                                               
                                                                                                                                                                                              
 There are no instances of "cat".            

Sign up or log in to comment