---
license: apache-2.0
language:
- en
- zh
base_model:
- Qwen/Qwen3-4B
- Qwen/Qwen3-1.7B
- Qwen/Qwen3-0.6B
library_name: transformers
---
## Model Overview

Memory Operator is a specialized language model developed for MemOS, designed to handle memory-related operations. Its core capabilities are **memory extraction, integration, and updating**. The primary objectives for developing the Memory Operator sub-model are:

1. **Support local-only deployment**, enabling the use of MemOS in restricted environments where internet connectivity is unavailable.
2. **Achieve memory operations at lower cost and higher speed**, while maintaining high system performance.

We are releasing the MemOperator model series in three sizes: **4B, 1.7B, and 0.6B parameters**. These models are fine-tuned from the **Qwen3 series** via supervised fine-tuning (SFT) on a combination of human-annotated and model-generated data, and they demonstrate excellent performance on tasks such as memory extraction and reorganization.

Currently, the model supports **memory extraction** and **clustering-based memory reorganization** within the MemOS system; conflict resolution and relational reasoning are under active development (WIP).

### Key Features
- **Type**: Causal Language Model (Decoder-only)
- **Training Stage**: Supervised Fine-tuning (SFT)
- **Supported Languages**: English (en), Chinese (zh)
- **Number of Parameters**: 4B, 1.7B, 0.6B
- **Context Length**: 32,768 tokens

---

## Highlights

### 🚀 Faster and More Efficient Memory Operations
Memory Operator is optimized for fast and accurate memory handling, enabling real-time processing in local environments.

### 🧠 Comprehensive Memory Management
- **Memory Extraction**: Supports extraction of high-quality memories from both conversations and documents, including summarization of document snippets.
- **Memory Reorganization**: Implements clustering-based reorganization to group and integrate related memories, enhancing long-term memory coherence.

### 💻 High System Performance with Low Resource Usage
- The **4B model** delivers performance that surpasses GPT-4o-mini while remaining deployable on most consumer-grade hardware.
- The smaller **1.7B and 0.6B variants** retain strong performance, making them ideal for edge devices and low-latency applications.

### 🌍 Multilingual Support
- Supports memory extraction in **both Chinese and English**.
- Effectively follows instructions in the input language, ensuring accurate and context-aware outputs.

## Performance

### Memory Extraction & Integration Evaluation (LoCoMo benchmark)

| Model | Overall | Temporal Reasoning | Multi-Hop | Single-Hop | Open-Domain |
|-------|---------|--------------------|-----------|------------|-------------|
| Qwen3-32B | 0.7675 | 0.7103 | 0.6702 | 0.8442 | 0.5729 |
| Qwen3-14B | 0.7370 | 0.6822 | 0.6631 | 0.8002 | 0.5833 |
| **MemOperator-4B** | **0.7714** | **0.8037** | **0.6737** | **0.8180** | **0.5416** |
| **MemOperator-1.7B** | **0.7571** | **0.8068** | **0.6560** | **0.7955** | **0.5521** |
| **MemOperator-0.6B** | **0.6753** | **0.6635** | **0.5780** | **0.7325** | **0.5000** |
| GPT-4o-mini | 0.7405 | 0.7217 | 0.6844 | 0.7864 | 0.5659 |
| Qwen3-8B | 0.6994 | 0.4984 | 0.7092 | 0.7943 | 0.5104 |

> ✅ **Key Advantage**:
> By replacing large open-source models (e.g., Qwen3-32B) with **MemOperator-4B**, you can achieve **comparable or better memory-processing performance** while reducing **model size, and with it resource consumption, by over 80%** (4B vs. 32B parameters). This enables efficient, scalable, and cost-effective deployment.

---

## Usage

### MemOS Integration Guide

You can easily configure MemOS to use the MemOperator model for memory extraction.

#### 1. Install MemOS via pip

```bash
pip install MemoryOS
```

#### 2. Initialize a MemOperator and Extract Memory

```python
from memos.configs.mem_reader import SimpleStructMemReaderConfig
from memos.mem_reader.simple_struct import SimpleStructMemReader

config = SimpleStructMemReaderConfig(
    **{
        "llm": {
            "backend": "huggingface",
            "config": {
                "model_name_or_path": "MemTensor/MemOperator-0.6B",
                "temperature": 0.6,
                "max_tokens": 6000,
                "top_p": 0.95,
                "top_k": 20,
            },
        },
        "embedder": {
            "backend": "ollama",
            "config": {"model_name_or_path": "nomic-embed-text:latest"},
        },
        "chunker": {
            "backend": "sentence",
            "config": {
                "tokenizer_or_token_counter": "gpt2",
                "chunk_size": 512,
                "chunk_overlap": 128,
                "min_sentences_per_chunk": 1,
            },
        },
        "remove_prompt_example": True,
    }
)

reader = SimpleStructMemReader(config)

# Example chat data
chat_data = [
    [
        {
            "role": "user",
            "chat_time": "June 26, 2025 at 3:00 PM",
            "content": "Hi Jerry! Yesterday at 3 PM I had a meeting with my team about the new project.",
        },
        {
            "role": "assistant",
            "chat_time": "June 26, 2025 at 3:00 PM",
            "content": "Oh Tom! Do you think the team can finish by December 15?",
        },
        {
            "role": "user",
            "chat_time": "June 26, 2025 at 3:00 PM",
            "content": "I’m worried. The backend won’t be done until December 10, so testing will be tight.",
        },
        {
            "role": "assistant",
            "chat_time": "June 26, 2025 at 3:00 PM",
            "content": "Maybe propose an extension?",
        },
        {
            "role": "user",
            "chat_time": "June 26, 2025 at 4:21 PM",
            "content": "Good idea. I’ll raise it in tomorrow’s 9:30 AM meeting—maybe shift the deadline to January 5.",
        },
    ]
]

# Save document for testing
with open("tmp.txt", "w") as f:
    f.write(
        "Lou Henry Hoover (March 29, 1874 – January 7, 1944) was an American philanthropist, geologist, and the first lady of the United States from 1929 to 1933 as the wife of President Herbert Hoover. She was active in community organizations and volunteer groups throughout her life, including the Girl Scouts of the USA, which she led from 1922 to 1925 and from 1935 to 1937. Throughout her life, Hoover supported women's rights and women's independence. She was a polyglot, fluent in Mandarin and well-versed in Latin, and was the primary translator from Latin to English of the complex 16th-century metallurgy text De re metallica."
    )

# Extract chat and document memories
chat_memory = reader.get_memory(
    chat_data, type="chat", info={"user_id": "Tom", "session_id": "session1"}
)
doc_memory = reader.get_memory(
    ["tmp.txt"], type="doc", info={"user_id": "Tom", "session_id": "session2"}
)

print(chat_memory)
print(doc_memory)
```
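
The printed output can be verbose. Below is a minimal sketch for inspecting the results; it assumes `get_memory` returns one list of memory items per input session or document, with each item exposing its text via a `memory` attribute (verify both assumptions against your installed MemOS version):

```python
# Assumptions: a nested list-of-lists return layout and a `memory` attribute
# on each item; check your MemOS version for the exact schema.
for session_memories in chat_memory:
    for item in session_memories:
        print(item.memory)
```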

#### 3. Use MemOperator to Organize Memory in MemOS

Configure your `mem_cube_config.json`:

```json
{
  "...": "...",
  "reorganize": true,
  "text_mem": {
    "backend": "tree_text",
    "config": {
      "extractor_llm": {
        "backend": "huggingface",
        "config": {
          "model_name_or_path": "MemTensor/MemOperator-0.6B",
          "temperature": 0.8,
          "max_tokens": 1024,
          "top_p": 0.9,
          "top_k": 50
        }
      },
      "dispatcher_llm": {
        "backend": "huggingface",
        "config": {
          "model_name_or_path": "Qwen/Qwen-1.8B-Chat",
          "temperature": 0.7,
          "max_tokens": 512
        }
      },
      "graph_db": {
        "backend": "neo4j",
        "config": {
          "uri": "bolt://localhost:7687",
          "username": "neo4j",
          "password": "your_password"
        }
      },
      "embedder": {
        "backend": "sentence_transformers",
        "config": {
          "model_name_or_path": "all-MiniLM-L6-v2"
        }
      }
    }
  },
  "act_mem": {},
  "para_mem": {}
}
```
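
In this configuration, `extractor_llm` points at MemOperator for memory extraction, and `reorganize: true` enables the clustering-based memory reorganization described above; adjust the `graph_db`, `embedder`, and `dispatcher_llm` entries to match your environment.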

#### 4. Initialize MemOS and Register Memory Cube

```python
import json

from memos import GeneralMemCubeConfig, GeneralMemCube, MOSConfig
from memos.mem_os.main import MOS

# Initialize MOS
user_id = "test"
mos_config_path = "configs/mos_memos_config.json"
mos_config_data = json.load(open(mos_config_path))
mos_config = MOSConfig(**mos_config_data)
mos = MOS(mos_config)
mos.create_user(user_id=user_id)

# Configure and initialize the memory cube
mem_cube_config_path = "configs/mem_cube_config.json"
mem_cube_config_data = json.load(open(mem_cube_config_path))
mem_cube_config = GeneralMemCubeConfig.model_validate(mem_cube_config_data)
mem_cube = GeneralMemCube(mem_cube_config)

# Register the memory cube with MOS
storage_path = f"./{user_id}_cube"
try:
    mem_cube.dump(storage_path)
except Exception:
    print(f"Memory cube already exists at {storage_path}, will reuse it.")

mos.register_mem_cube(
    mem_cube_name_or_path=storage_path,
    mem_cube_id=user_id,
    user_id=user_id,
)
```
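
With the cube registered, you can exercise the full loop. Below is a minimal sketch that assumes the `MOS.add` and `MOS.chat` methods from the MemOS quick start; method signatures may differ across versions:

```python
# Add a conversation; MemOperator extracts memories from it into the cube
# (assumed MOS.add signature -- check your MemOS version)
mos.add(
    messages=[
        {"role": "user", "content": "I prefer vegetarian food."},
        {"role": "assistant", "content": "Got it, I will remember that."},
    ],
    user_id=user_id,
)

# Ask a question that should be answered from the stored memories
# (assumed MOS.chat signature)
print(mos.chat("What do you know about my diet?", user_id=user_id))
```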

## Hugging Face Usage

You can also load the model directly with Hugging Face Transformers, vLLM, or SGLang and perform memory extraction using the preset templates we have configured.
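
For example, here is a minimal sketch using Hugging Face `transformers`; the extraction prompt below is an illustrative placeholder rather than the exact preset template shipped with the model, and the sampling parameters mirror the MemOS config above:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "MemTensor/MemOperator-0.6B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

# Illustrative placeholder prompt; in practice, use the preset extraction template
messages = [
    {
        "role": "user",
        "content": (
            "Extract the key memories from this conversation:\n"
            "user: Yesterday I met with my team about the new project.\n"
            "assistant: Do you think you can finish by December 15?"
        ),
    }
]
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Sampling settings mirror the MemOS reader config shown earlier
outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
    top_k=20,
)

# Decode only the newly generated tokens
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```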