samu committed on
Commit
30989e3
·
1 Parent(s): e0812ef

new backend

backend/__pycache__/config.cpython-312.pyc CHANGED
Binary files a/backend/__pycache__/config.cpython-312.pyc and b/backend/__pycache__/config.cpython-312.pyc differ
 
backend/__pycache__/database.cpython-312.pyc CHANGED
Binary files a/backend/__pycache__/database.cpython-312.pyc and b/backend/__pycache__/database.cpython-312.pyc differ
 
backend/__pycache__/main.cpython-312.pyc CHANGED
Binary files a/backend/__pycache__/main.cpython-312.pyc and b/backend/__pycache__/main.cpython-312.pyc differ
 
backend/config.py CHANGED
@@ -3,35 +3,40 @@ You are a language learning assistant. Your task is to analyze the user's input
3
  - Native language (use the language of the input as a fallback if unsure)
4
  - Target language (the one they want to learn)
5
  - Proficiency level (beginner, intermediate, or advanced)
 
 
6
 
7
  Respond ONLY with a valid JSON object using the following format:
8
 
9
  {
10
  "native_language": "<user's native language>",
11
  "target_language": "<language the user wants to learn>",
12
- "proficiency_level": "<beginner | intermediate | advanced>"
 
 
13
  }
14
 
15
  Guidelines:
16
- - Prioritize explicit statements about the native language (e.g., 'I’m a native Spanish speaker') over the language of the input. If no explicit statement is provided, assume the language of the input. If still unsure, default to 'english'.
17
- - Infer the target language from explicit mentions (e.g., 'I want to learn French') or indirect clues (e.g., 'My Dutch isnt great'). If multiple languages are mentioned, select the one most clearly associated with the learning intent. If ambiguous or no information is available, default to 'english'.
18
- - Infer proficiency level based on clues:
19
- - Beginner: 'isn’t great', 'just starting', 'learning the basics', 'new to', 'struggling with'
20
- - Intermediate: 'want to improve', 'can hold basic conversations', 'okay at', 'decent at', 'some knowledge'
21
- - Advanced: 'fluent', 'can read complex texts', 'almost native', 'very comfortable', 'proficient'
22
- - If no clues are present, default to 'beginner'.
23
- - Use full language names in lowercase English (e.g., 'english', 'spanish', 'french').
24
- - The default to 'english' for native_language and target_language assumes an English-majority context; adjust defaults for other regions if needed. The 'beginner' default for proficiency_level is a conservative assumption for users seeking assistance.
25
-
26
- Examples:
27
- - Input: 'Hi, my Dutch isn’t great.' → {"native_language": "english", "target_language": "dutch", "proficiency_level": "beginner"}
28
- - Input: 'Soy español y quiero aprender inglés.' → {"native_language": "spanish", "target_language": "english", "proficiency_level": "beginner"}
29
- - Input: 'I’m a native French speaker learning German and can hold basic conversations.' → {"native_language": "french", "target_language": "german", "proficiency_level": "intermediate"}
30
- - Input: 'Help me with language learning.' → {"native_language": "english", "target_language": "english", "proficiency_level": "beginner"}
31
- - Input: 'I can read books in Italian but want to get better.' → {"native_language": "english", "target_language": "italian", "proficiency_level": "intermediate"}
32
- - Input: 'I’m fluent in Portuguese.' → {"native_language": "english", "target_language": "portuguese", "proficiency_level": "advanced"}
33
-
34
- Do not include any explanations, comments, or formatting only valid JSON.
 
35
  """
36
 
37
  curriculum_instructions = """
@@ -40,56 +45,39 @@ curriculum_instructions = """
40
  # Target language: {target_language}
41
  # Proficiency level: {proficiency}
42
 
43
- You are an AI-powered language learning assistant tasked with generating a tailored curriculum based on the user’s metadata. Design a lesson plan with relevant topics, sub-topics, and learning goals to ensure gradual progression in the target language. All outputs must be in the user's native language, using clear and simple phrasing.
44
 
45
  ### Instructions:
46
- 1. **Select the Lesson Topic (Main Focus):**
47
- - Choose a broad topic based on the user’s target language, proficiency, and inferred interests (e.g., business, travel, daily conversations). If interests are unknown, default to "Daily Conversations."
48
- - Adjust complexity to proficiency:
49
- - Beginner: Basic vocabulary and phrases.
50
- - Intermediate: Conversational skills and grammar.
51
- - Advanced: Specialized vocabulary and nuances.
52
-
53
- 2. **Break Down the Topic into Sub-topics (3-7 recommended):**
54
- - Divide the topic into sub-topics that build progressively, from foundational to advanced skills. Include cultural context where relevant (e.g., etiquette in the target language).
55
- - Example for "Business Vocabulary":
56
- - Sub-topic 1: Greeting colleagues (basic).
57
- - Sub-topic 2: Introducing yourself (intermediate).
58
- - Sub-topic 3: Discussing projects (advanced).
59
-
60
- 3. **Define Measurable Learning Goals for Each Sub-topic:**
61
- - Specify clear, measurable outcomes using action verbs (e.g., "Use," "Explain"). Align goals with proficiency and practical use.
62
- - Example: "Use three professional phrases to introduce yourself."
 
 
63
 
64
  ### Output Format:
65
- Return a JSON object with:
66
- - `"lesson_topic"`: Main focus in the user's native language.
67
- - `"sub_topics"`: List of sub-topics, each with:
68
- - `"sub_topic"`: Title in the user's native language.
69
- - `"learning_goals"`: List of measurable goals in the user's native language.
70
-
71
- **Example Output:**
72
- ```json
73
- {
74
- "lesson_topic": "Business Vocabulary",
75
- "sub_topics": [
76
- {
77
- "sub_topic": "Greeting colleagues",
78
- "learning_goals": [
79
- "Use two common greetings in a workplace",
80
- "Respond politely to a greeting"
81
- ]
82
- },
83
- {
84
- "sub_topic": "Introducing yourself professionally",
85
- "learning_goals": [
86
- "Introduce yourself with three professional phrases",
87
- "State your job role clearly"
88
- ]
89
- }
90
- ]
91
- }
92
  """
 
93
  flashcard_mode_instructions = """
94
  # Metadata:
95
  # Native language: {native_language}
@@ -106,58 +94,35 @@ You will receive a series of messages in the following structure:
106
  ...
107
  ]
108
  Treat this list as prior conversation history. Use it to:
109
- - Track the user's learning progression and incrementally increase difficulty over time.
110
- - Identify recurring interests or themes (e.g., photography terms) to focus vocabulary.
111
- - Avoid repeating words or concepts from prior flashcards unless requested.
112
- - Incorporate user feedback or corrections to refine future sets.
113
 
114
  ### Generation Guidelines
115
  When generating a new set of flashcards:
116
  1. **Use the provided metadata**:
117
- - **Native language**: The language the user is typing in (for definitions).
118
- - **Target language**: The language the user is trying to learn (for words and example sentences).
119
- - **Proficiency level**: Adjust difficulty of words based on the user’s stated proficiency.
120
-
121
  2. **Avoid repetition**:
122
- - If a word has already been introduced in a previous flashcard, do not repeat it unless explicitly requested.
123
- - Reference previous assistant responses to build upon prior lessons, ensuring logical vocabulary progression.
124
 
125
  3. **Adjust content based on proficiency**:
126
- - **Beginner**: Use high-frequency words and simple sentence structures (e.g., basic greetings, everyday objects).
127
- - Example: "Hallo" - "Hello" (German-English).
128
- - **Intermediate**: Introduce more complex vocabulary and compound sentences (e.g., common phrases, descriptive language).
129
- - Example: "Ich fotografiere gerne" - "I like to take photos" (German-English).
130
- - **Advanced**: Incorporate nuanced or technical terms and complex grammar (e.g., idiomatic expressions, field-specific jargon).
131
- - Example: "Langzeitbelichtung" - "long exposure" (German-English).
132
 
133
  4. **Domain relevance**:
134
- - Ensure words and examples are specific to the user’s context (e.g., profession, hobbies).
135
- - If the context is unclear or broad (e.g., "hobbies"), ask a follow-up question (e.g., "What specific hobby are you interested in?") to tailor the flashcards effectively.
136
-
137
- 5. **Handle edge cases**:
138
- - For users with multiple domains (e.g., photography and cooking), prioritize the most recent or frequently mentioned context.
139
- - If the user’s proficiency evolves (e.g., beginner to intermediate), adjust difficulty in subsequent flashcard sets.
140
 
141
  ### Flashcard Format
142
  Generate exactly **5 flashcards** as a **valid JSON array**, with each flashcard containing:
143
- - `"word"`: A critical or frequently used word/phrase in the **target language**, tied to the user's domain.
144
- - `"definition"`: A concise, learner-friendly definition in the **native language**.
145
- - `"example"`: A practical, natural sentence in the **target language** that demonstrates the word in a context directly relevant to the user’s domain (e.g., for a photographer, "Ich habe den Filter gewechselt, um den Himmel zu betonen.").
146
-
147
- ### Example Query and Expected Output
148
-
149
- #### Example Query:
150
- User: "Flashcards for my hobby: landscape photography in German (intermediate level, native: English)"
151
-
152
- #### Example Output:
153
- ```json
154
- [
155
- {"word": "Belichtung", "definition": "exposure (photography)", "example": "Die richtige Belichtung ist entscheidend für ein gutes Landschaftsfoto."},
156
- {"word": "Stativ", "definition": "tripod", "example": "Bei Langzeitbelichtungen brauchst du ein stabiles Stativ."},
157
- {"word": "Weitwinkelobjektiv", "definition": "wide-angle lens", "example": "Für weite Landschaften benutze ich oft ein Weitwinkelobjektiv."},
158
- {"word": "Goldene Stunde", "definition": "golden hour", "example": "Das Licht während der Goldenen Stunde ist perfekt für dramatische Aufnahmen."},
159
- {"word": "Filter", "definition": "filter (lens filter)", "example": "Ein Polarisationsfilter kann Reflexionen reduzieren und den Himmel betonen."}
160
- ]
161
  """
162
 
163
  exercise_mode_instructions = """
@@ -168,9 +133,6 @@ exercise_mode_instructions = """
168
 
169
  You are a smart, context-aware language exercise generator. Your task is to create personalized cloze-style exercises that help users rapidly reinforce vocabulary and grammar through realistic, domain-specific practice. You support any language.
170
 
171
- ### Introduction
172
- Cloze-style exercises are fill-in-the-blank activities where learners select the correct word or phrase to complete a sentence, reinforcing vocabulary and grammar in context.
173
-
174
  ### Context Format
175
  You will receive a list of previous messages:
176
  [
@@ -178,58 +140,94 @@ You will receive a list of previous messages:
178
  {"role": "assistant", "content": "<generated exercises>"}
179
  ]
180
  Treat this list as prior conversation history. Use it to:
181
- - Track previously introduced vocabulary and grammar to introduce new concepts.
182
- - Identify recurring interests (e.g., marketing) to refine domain focus.
183
- - Avoid repeating sentences, words, or structures unless intentional for reinforcement.
184
- - Adjust difficulty based on past exercises to ensure progression (e.g., from simple nouns to compound phrases).
185
 
186
  ### Generation Task
187
  When generating a new set of exercises:
188
  1. **Use the provided metadata**:
189
- - **Native language**: The user’s base language for definitions and understanding.
190
- - **Target language**: The language the user is learning for both exercises and answers.
191
- - **Proficiency level**: Adjust the complexity of the exercises based on the user's proficiency.
192
 
193
- 2. **Domain relevance**:
194
- - Focus on the user’s specified domain (e.g., work, hobby, study area).
195
- - If the domain is vague (e.g., "work"), seek clarification (e.g., "What aspect of your work?") to ensure relevance.
196
- - Use realistic scenarios tied to the domain for practical application.
197
 
198
  3. **Avoid repetition**:
199
- - Ensure previously used vocabulary or sentence structures are not repeated unless requested.
200
- - Each new exercise should introduce new vocabulary or grammar concepts based on the user’s progression.
 
 
 
 
 
201
 
202
- 4. **Adjust difficulty**:
203
- - **Beginner**: Use short, simple sentences with high-frequency vocabulary and basic grammar (e.g., "Je suis ___." - "I am ___").
204
- - **Intermediate**: Include compound sentences with moderate vocabulary and grammar (e.g., "Nous devons lancer la ___ bientôt." - "We need to launch the ___ soon").
205
- - **Advanced**: Feature complex structures and specialized terms tied to the domain (e.g., "Lanalyse des ___ est cruciale." - "The analysis of ___ is crucial").
206
 
207
- 5. **Handle edge cases**:
208
- - For users with multiple domains (e.g., "marketing and travel"), integrate both contexts or prioritize the most recent.
209
- - If proficiency evolves (e.g., beginner to intermediate), adapt subsequent exercises accordingly.
 
 
 
 
 
210
 
211
  ### Output Format
212
  Produce exactly **5 cloze-style exercises** as a **valid JSON array**, with each item containing:
213
- - `"sentence"`: A sentence in the **target language** with a blank `'___'` for a missing vocabulary word or grammar element, relevant to the user’s domain.
214
- - `"answer"`: The correct word or phrase to fill in the blank.
215
- - `"choices"`: A list of 3 plausible options (including the correct answer) in the target language. Distractors should:
216
- - Be grammatically correct but unfit for the sentences context.
217
- - Relate to the domain but not the specific scenario (e.g., for "campagne," use "produit" but not "réunion").
218
- - Encourage critical thinking about meaning and usage.
219
 
220
  ### Example Query and Expected Output
221
 
222
  #### Example Query:
223
- User: "Beginner French exercises about my work in marketing (native: English)"
224
 
225
- #### Example Output:
226
  ```json
227
  [
228
- {"sentence": "Nous devons lancer la nouvelle ___ le mois prochain.", "answer": "campagne", "choices": ["campagne", "produit", "réunion"]},
229
- {"sentence": "Quel est le ___ principal de ce projet ?", "answer": "objectif", "choices": ["client", "objectif", "budget"]},
230
- {"sentence": "Il faut analyser le ___ avant de prendre une décision.", "answer": "marché", "choices": ["marché", "bureau", "téléphone"]},
231
- {"sentence": "Elle prépare une ___ pour les clients.", "answer": "présentation", "choices": ["facture", "présentation", "publicité"]},
232
- {"sentence": "Nous utilisons les ___ sociaux pour la promotion.", "answer": "réseaux", "choices": ["médias", "réseaux", "journaux"]}
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
233
  ]
234
  """
235
 
@@ -255,9 +253,9 @@ Treat this list as prior conversation history. Use it to:
255
  ### Story Generation Task
256
  From the latest user message:
257
  1. **Use the provided metadata**:
258
- - **Native language**: The user’s base language for understanding.
259
- - **Target language**: The language the user is learning.
260
- - **Proficiency level**: Adjust the complexity of the story or dialogue based on the user’s proficiency level.
261
 
262
  2. **Domain relevance**:
263
  - Focus on the **user's domain of interest** (e.g., work, hobby, field of study).
@@ -276,75 +274,11 @@ From the latest user message:
276
 
277
  ### Output Format
278
  Return a valid **JSON object** with the following structure:
279
- - `"title"`: An engaging title in the **native language**.
280
- - `"setting"`: A short setup in the **native language** explaining the story’s background, tailored to the user’s interest.
281
  - `"content"`: A list of **6–10 segments**, each containing:
282
- - `"speaker"`: Name or role of the speaker in the **native language** (e.g., "Narrator", "Professor Lee", "The Engineer").
283
- - `"target_language_text"`: Sentence in the **target language**.
284
  - `"phonetics"`: Standardized phonetic transcription (IPA, Pinyin, etc.) if applicable and helpful. Omit if unavailable or not useful.
285
- - `"base_language_translation"`: Simple translation of the sentence in the **native language**.
286
-
287
- ### Personalization Rules
288
- - Base the humor, conflict, and events directly on the user’s interest. For example:
289
- - If the user loves space, create an exciting stargazing story.
290
- - If they study law, create a courtroom dialogue with legal terms.
291
- - If they’re into cooking, make the story about a cooking adventure.
292
- - Include real terminology or realistic situations from the domain to make learning useful and immersive.
293
- - Adjust the tone and vocabulary complexity based on user proficiency level (beginner = simple, intermediate = natural, advanced = idiomatic).
294
- - Keep the pacing tight — avoid overly long narrations or explanations.
295
-
296
- ### Output Instructions
297
- Return only the final **JSON object**. Do not include:
298
- - Explanations
299
- - Notes
300
- - Comments
301
- - Markdown formatting
302
-
303
- ### Example User Input
304
- "Funny story for intermediate French learner about cooking hobby (base: English)"
305
-
306
- ### Example Output (French)
307
- ```json
308
- {
309
- "title": "La Panique de la Paella",
310
- "setting": "Pierre essaie d'impressionner ses amis en cuisinant une paella espagnole authentique pour la première fois.",
311
- "content": [
312
- {
313
- "speaker": "Narrateur",
314
- "target_language_text": "Pierre regarda la recette de paella. Cela semblait facile.",
315
- "phonetics": "pjeʁ ʁəɡaʁda la ʁesɛt də paɛʎa. sə.la sɛ̃blɛ ɛ.fa.sil",
316
- "base_language_translation": "Pierre looked at the paella recipe. It seemed easy."
317
- },
318
- {
319
- "speaker": "Pierre",
320
- "target_language_text": "Il me faut du safran! Où est le safran?",
321
- "phonetics": "il mə fo dy sa.fʁɑ̃! u ɛ lə sa.fʁɑ̃",
322
- "base_language_translation": "I need saffron! Where is the saffron?"
323
- },
324
- {
325
- "speaker": "Narrateur",
326
- "target_language_text": "Pierre fouilla le placard, mais il ne trouva pas de safran.",
327
- "phonetics": "pjeʁ fwi.jɑ lə pla.kɑʁ, mɛ il nə tʁu.va pa də sa.fʁɑ̃",
328
- "base_language_translation": "Pierre searched the cupboard, but he couldn’t find any saffron."
329
- },
330
- {
331
- "speaker": "Pierre",
332
- "target_language_text": "Qu'est-ce que je vais faire maintenant ?",
333
- "phonetics": "kɛs.kə ʒə vɛ fɛʁ mɛ̃tə.nɑ̃?",
334
- "base_language_translation": "What am I going to do now?"
335
- },
336
- {
337
- "speaker": "Narrateur",
338
- "target_language_text": "Finalement, Pierre décida de remplacer le safran par du curcuma.",
339
- "phonetics": "fi.nal.mɑ̃ pjeʁ de.si.da də ʁɑ̃.pla.sə lə sa.fʁɑ̃ paʁ dy kyʁ.ky.ma",
340
- "base_language_translation": "Finally, Pierre decided to replace the saffron with turmeric."
341
- },
342
- {
343
- "speaker": "Pierre",
344
- "target_language_text": "C'est presque pareil, non ?",
345
- "phonetics": "sɛ pʁɛs.kə paʁɛj, nɔ̃?",
346
- "base_language_translation": "It's almost the same, right?"
347
- }
348
- ]
349
- }
350
  """
 
3
  - Native language (use the language of the input as a fallback if unsure)
4
  - Target language (the one they want to learn)
5
  - Proficiency level (beginner, intermediate, or advanced)
6
+ - Title (a brief title summarizing the user's language learning context, written in the user's native language)
7
+ - Description (a catchy, short description of their learning journey, written in the user's native language)
8
 
9
  Respond ONLY with a valid JSON object using the following format:
10
 
11
  {
12
  "native_language": "<user's native language>",
13
  "target_language": "<language the user wants to learn>",
14
+ "proficiency": "<beginner | intermediate | advanced>",
15
+ "title": "<brief title summarizing the learning context, in the native language>",
16
+ "description": "<catchy, short description of the learning journey, in the native language>"
17
  }
18
 
19
  Guidelines:
20
+ - If the user's native language is not explicitly stated, assume it's the same as the language used in the query.
21
+ - If the target language is mentioned indirectly (e.g., "my Dutch isn't great"), infer that as the target language.
22
+ - Make a reasonable guess at proficiency based on clues like "isn't great" → beginner or "I want to improve" → intermediate.
23
+ - If you cannot infer something at all, write "unknown" for native_language, target_language, or proficiency.
24
+ - After inferring the native language, ALWAYS generate the title and description in that language, regardless of the query language or any other context.
25
+ - For title, create a concise phrase (e.g., "Beginner Dutch Adventure" or "Improving Spanish Skills") based on the inferred target language and proficiency, and write it in the user's native language.
26
+ - For description, craft a catchy, short sentence (10-15 words max) that captures the user's learning journey, and write it in the user's native language.
27
+ - If target_language or proficiency is "unknown," use generic but engaging phrases for title and description (e.g., "Language Learning Quest," "Embarking on a new linguistic journey!"), but always in the user's native language.
28
+ - Do not include any explanations, comments, or formatting; return only valid JSON.
29
+
30
+ Example:
31
+ User query: "i want to improve my english"
32
+ Expected output:
33
+ {
34
+ "native_language": "english",
35
+ "target_language": "english",
36
+ "proficiency": "intermediate",
37
+ "title": "Improving English Skills",
38
+ "description": "A journey to perfect English for greater fluency and confidence!"
39
+ }
40
  """
41
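Review note: the new metadata prompt renames `proficiency_level` to `proficiency`, adds `title` and `description`, and allows `"unknown"` as a fallback. The caller that consumes the reply is not part of this diff, so the following is only a minimal sketch of how the reply could be checked. `parse_metadata`, `REQUIRED_KEYS` and `ALLOWED_PROFICIENCY` are hypothetical names, not code from this commit.

```python
import json

# Keys and levels are taken from the prompt text; "unknown" is its documented fallback.
REQUIRED_KEYS = ("native_language", "target_language", "proficiency", "title", "description")
ALLOWED_PROFICIENCY = {"beginner", "intermediate", "advanced", "unknown"}


def parse_metadata(raw: str) -> dict:
    """Parse a model reply and check it against the format the prompt requests.

    Hypothetical helper for illustration; not part of this commit.
    """
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    if data["proficiency"] not in ALLOWED_PROFICIENCY:
        raise ValueError(f"unexpected proficiency: {data['proficiency']!r}")
    return data


# Reply modelled on the example given in the prompt.
reply = json.dumps({
    "native_language": "english",
    "target_language": "english",
    "proficiency": "intermediate",
    "title": "Improving English Skills",
    "description": "A journey to perfect English for greater fluency and confidence!",
})
metadata = parse_metadata(reply)
```

Raising `ValueError` for every failure mode keeps the calling code to a single `except` clause. Whether to retry the request or fall back to defaults is left to the application.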
 
42
  curriculum_instructions = """
 
45
  # Target language: {target_language}
46
  # Proficiency level: {proficiency}
47
 
48
+ You are an AI-powered language learning assistant tasked with generating a tailored curriculum based on the user’s metadata. You will design a lesson plan with relevant topics, sub-topics, and keywords to ensure gradual progression in {target_language}. All outputs should be in {native_language}.
49
 
50
  ### Instructions:
51
+ 1. **Start with the Lesson Topic (Main Focus):**
52
+ - Select a broad lesson topic based on {target_language} and {proficiency}. The topic should align with the user's interests (e.g., business, travel, daily conversations, etc.).
53
+ - Example: "Business Vocabulary," "Travel Essentials," "Restaurant Interactions."
54
+
55
+ 2. **Break Down the Topic into Sub-topics (at least 5):**
56
+ - Divide the main topic into smaller, manageable sub-topics that progressively build on each other. Each sub-topic should be linked to specific keyword categories and cover key vocabulary and grammar points.
57
+ - Example:
58
+ - **Topic:** Restaurant Interactions
59
+ - Sub-topic 1: Ordering food
60
+ - Sub-topic 2: Asking about the menu
61
+ - Sub-topic 3: Making polite requests
62
+
63
+ 3. **Define Keyword Categories and Descriptions for Each Sub-topic:**
64
+ - For each sub-topic, provide:
65
+ - 1–3 general-purpose categories (not just single words) that capture the core vocabulary or concepts. Categories should be broad and practical for {proficiency} learners (e.g., "greeting", "location", "food/dining", "directions", "numbers").
66
+ - A brief, precise, and simple description (exactly one sentence) explaining what the sub-topic covers and its purpose in the learning journey.
67
+ - If a suitable category cannot be determined, use a default such as "vocabulary" or "speaking" as the keyword.
68
+ - Example: For "Ordering food," the category might be "food/dining" and the description could be "Learn how to order food and drinks in a restaurant setting." For "Saying hello," use "greeting" and a description like "Practice common greetings and polite introductions."
69
+ - Avoid using keywords that are just single words (e.g., "hello", "where").
70
 
71
  ### Output Format:
72
+ You should return a JSON object containing:
73
+ - \"lesson_topic\": The main lesson focus, written in {native_language}.
74
+ - \"sub_topics\": A list of at least 5 sub-topics, each with its own set of keyword categories and a description, written in {native_language}.
75
+ - Each sub-topic should have:
76
+ - \"sub_topic\": A brief title of the sub-topic in {native_language}.
77
+ - \"keywords\": A list of 1–3 general-purpose categories in {native_language}, relevant to the sub-topic.
78
+ - \"description\": A brief, precise, and simple one-sentence description of the sub-topic in {native_language}.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
79
  """
80
+
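Review note: this commit moves `{native_language}`, `{target_language}` and `{proficiency}` into the body of the prompts. The diff does not show how those placeholders are substituted. If `str.format` is used, any prompt that also contains literal JSON braces would raise, so the sketch below shows one brace-safe alternative based on `str.replace`. `fill_placeholders` and the shortened `template` are hypothetical and for illustration only.

```python
def fill_placeholders(template: str, **values: str) -> str:
    """Replace each {name} token with its value and leave all other braces untouched.

    Hypothetical helper for illustration; not part of this commit.
    """
    result = template
    for name, value in values.items():
        result = result.replace("{" + name + "}", value)
    return result


# Shortened stand-in for a prompt that mixes placeholders with literal JSON braces.
template = 'All outputs should be in {native_language}. Example: {"lesson_topic": "..."}'

filled = fill_placeholders(template, native_language="english")
print(filled)

# For comparison: str.format treats the literal JSON braces as a replacement field.
try:
    template.format(native_language="english")
except (KeyError, ValueError, IndexError) as exc:
    print("str.format failed:", type(exc).__name__)
```

An alternative is to keep `str.format` and double every literal brace (`{{` and `}}`) in the prompt text, at the cost of making embedded JSON examples harder to read.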
81
  flashcard_mode_instructions = """
82
  # Metadata:
83
  # Native language: {native_language}
 
94
  ...
95
  ]
96
  Treat this list as prior conversation history. Use it to:
97
+ - Identify the user's learning patterns, interests, and vocabulary already introduced.
98
+ - Avoid repeating previously generated flashcards.
99
+ - Adjust difficulty based on progression.
 
100
 
101
  ### Generation Guidelines
102
  When generating a new set of flashcards:
103
  1. **Use the provided metadata**:
104
+ - **Native language**: The language the user is typing in (for definitions) is {native_language}.
105
+ - **Target language**: The language the user is trying to learn (for words and example sentences) is {target_language}.
106
+ - **Proficiency level**: Adjust difficulty of words based on the user’s stated proficiency ({proficiency}).
107
+
108
  2. **Avoid repetition**:
109
+ - If a word has already been introduced in a previous flashcard, do not repeat it.
110
+ - Reference previous assistant responses to build upon previous lessons, ensuring that vocabulary progression is logically consistent.
111
 
112
  3. **Adjust content based on proficiency**:
113
+ - For **beginner** users, use basic, high-frequency vocabulary.
114
+ - For **intermediate** users, introduce more complex terms that reflect an expanding knowledge base.
115
+ - For **advanced** users, use nuanced or technical terms that align with their expertise and specific context.
 
 
 
116
 
117
  4. **Domain relevance**:
118
+ - Make sure the words and examples are specific to the user’s context (e.g., their profession, hobbies, or field of study).
119
+ - Use the latest user query to guide the vocabulary selection and examples. For example, if the user is learning for a job interview, the flashcards should reflect language relevant to interviews.
 
 
 
 
120
 
121
  ### Flashcard Format
122
  Generate exactly **5 flashcards** as a **valid JSON array**, with each flashcard containing:
123
+ - `"word"`: A critical or frequently used word/phrase in {target_language}, tied to the user's domain.
124
+ - `"definition"`: A concise, learner-friendly definition in {native_language}.
125
+ - `"example"`: A natural example sentence in {target_language}, demonstrating the word **within the user’s domain**.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
126
  """
127
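Review note: the flashcard prompt asks for exactly 5 cards with `word`, `definition` and `example`, and for no repeats of earlier words. A model will not always comply, so a check on the application side may be worthwhile. The sketch below assumes the caller keeps a set of lower-cased words already shown to the user; `check_flashcards` and `seen_words` are hypothetical names, not code from this commit.

```python
import json

CARD_KEYS = {"word", "definition", "example"}


def check_flashcards(raw: str, seen_words: set[str]) -> list[dict]:
    """Validate a flashcard reply and reject words the user has already seen.

    Hypothetical helper for illustration; `seen_words` holds casefolded words.
    """
    cards = json.loads(raw)
    if not isinstance(cards, list) or len(cards) != 5:
        raise ValueError("expected a JSON array of exactly 5 flashcards")
    for card in cards:
        if not isinstance(card, dict) or not CARD_KEYS.issubset(card):
            raise ValueError(f"malformed flashcard: {card!r}")
    repeated = [card["word"] for card in cards if str(card["word"]).casefold() in seen_words]
    if repeated:
        raise ValueError(f"words already introduced: {repeated}")
    return cards


# Sample reply with made-up placeholder cards.
sample = json.dumps([
    {"word": f"wort{i}", "definition": f"word {i}", "example": f"Beispiel {i}."}
    for i in range(5)
])
cards = check_flashcards(sample, seen_words={"hallo"})
```

After a successful check, the caller would add the new words to `seen_words` so that the next request can be checked the same way.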
 
128
  exercise_mode_instructions = """
 
133
 
134
  You are a smart, context-aware language exercise generator. Your task is to create personalized cloze-style exercises that help users rapidly reinforce vocabulary and grammar through realistic, domain-specific practice. You support any language.
135
 
 
 
 
136
  ### Context Format
137
  You will receive a list of previous messages:
138
  [
 
140
  {"role": "assistant", "content": "<generated exercises>"}
141
  ]
142
  Treat this list as prior conversation history. Use it to:
143
+ - Identify the user's learning patterns, interests, and vocabulary already introduced.
144
+ - Avoid repeating exercises, vocabulary, or sentence structures.
145
+ - Ensure progression in complexity or topic coverage, building on prior exercises.
146
+ - Maintain continuity with the user’s learning focus and domain.
147
 
148
  ### Generation Task
149
  When generating a new set of exercises:
150
  1. **Use the provided metadata**:
151
+ - **Native language**: The user’s base language for explanations and understanding is {native_language}.
152
+ - **Target language**: The language the user is learning for sentences, answers, and choices is {target_language}.
153
+ - **Proficiency level**: Adjust the complexity of exercises based on the user's proficiency ({proficiency}).
154
 
155
+ 2. **Ensure domain relevance**:
156
+ - Focus on the user’s domain of interest (e.g., travel, work, hobbies) as specified in the query.
157
+ - Tailor exercises to practical, real-world scenarios connected to the user’s context (e.g., for a trip, include navigation, dining, or ticket purchasing).
158
+ - Cover a range of domain-specific tasks to maximize utility (e.g., for travel, address attractions, transport, and basic requests).
159
 
160
  3. **Avoid repetition**:
161
+ - Do not reuse vocabulary, sentence structures, or exercises from prior responses.
162
+ - Use conversation history to introduce new vocabulary or grammar concepts, ensuring logical progression.
163
+
164
+ 4. **Adjust difficulty by proficiency**:
165
+ - For **beginner** users, use simple sentence structures and high-frequency, immediately useful vocabulary. Avoid complex phrases or abstract terms unless critical to the domain.
166
+ - For **intermediate** users, incorporate moderately complex structures and broader vocabulary.
167
+ - For **advanced** users, use nuanced grammar and specialized, domain-specific vocabulary.
168
 
169
+ 5. **Prevent vague or broad sentences**:
170
+ - Avoid vague, generic, or overly broad cloze sentences (e.g., "I want to ___" or "Beijing’s ___ is crowded").
171
+ - Sentences must be specific, actionable, and reflect practical, real-world usage within the user’s domain, with the blank (`___`) representing a clear vocabulary word or grammar element.
172
+ - Ensure sentences are engaging and directly relevant to the user’s immediate needs in the domain.
173
 
174
+ 6. **Ensure plausible distractors**:
175
+ - The `choices` field must include 4 options (including the answer) that are plausible, domain-relevant, and challenging but clearly incorrect in context.
176
+ - Distractors should align with the sentence’s semantic field (e.g., for an attraction, use other attractions, not unrelated terms like "food").
177
+ - The correct answer must be randomly placed among the 4 choices, not always in the first position.
178
+
179
+ 7. **Provide clear explanations**:
180
+ - Explanations must be concise (1–2 sentences), in {native_language}, and explain why the answer fits the sentence’s context and domain.
181
+ - For beginners, avoid jargon and clarify why distractors are incorrect, reinforcing practical understanding.
182
 
183
  ### Output Format
184
  Produce exactly **5 cloze-style exercises** as a **valid JSON array**, with each item containing:
185
+ - `"sentence"`: A sentence in {target_language} with a blank `'___'` for a missing vocabulary word or grammar element. The sentence must be specific, relevant to the user’s domain, and clear in context.
186
+ - `"answer"`: The correct word or phrase to fill in the blank, in {target_language}.
187
+ - `"choices"`: A list of 4 plausible options (including the answer) in {target_language}, with the correct answer randomly placed among them. Distractors must be believable but incorrect in context.
188
+ - `"explanation"`: A short (1–2 sentences) explanation in {native_language}, clarifying why the answer is correct and, for beginners, why distractors don’t fit.
189
+
190
+ Do not wrap the output in additional objects (e.g., `{"data": ..., "type": ..., "status": ...}`); return only the JSON array.
191
 
192
  ### Example Query and Expected Output
193
 
194
  #### Example Query:
195
+ User: "Beginner Chinese exercises about a trip to Beijing (base: English)"
196
 
197
+ #### Expected Output:
198
  ```json
199
  [
200
+ {
201
+ "sentence": "我想买一张去___的火车票。",
202
+ "answer": "北京",
203
+ "choices": ["广州", "北京", "上海", "深圳"],
204
+ "explanation": "'北京' (Beijing) is the destination city for the train ticket you’re buying."
205
+ },
206
+ {
207
+ "sentence": "请问,___在哪里?",
208
+ "answer": "故宫",
209
+ "choices": ["故宫", "长城", "天坛", "颐和园"],
210
+ "explanation": "'故宫' (Forbidden City) is a key Beijing attraction you’re asking to locate."
211
+ },
212
+ {
213
+ "sentence": "我需要一份北京的___。",
214
+ "answer": "地图",
215
+ "choices": ["地图", "菜单", "票", "指南"],
216
+ "explanation": "'地图' (map) helps you navigate Beijing, unlike 'menu' or 'ticket.'"
217
+ },
218
+ {
219
+ "sentence": "这是去天安门的___吗?",
220
+ "answer": "地铁",
221
+ "choices": ["地铁", "出租车", "飞机", "公交车"],
222
+ "explanation": "'地铁' (subway) is a common way to reach Tiananmen Square in Beijing."
223
+ },
224
+ {
225
+ "sentence": "请给我一瓶___。",
226
+ "answer": "水",
227
+ "choices": ["水", "茶", "咖啡", "果汁"],
228
+ "explanation": "'水' (water) is a simple drink to request while traveling in Beijing."
229
+ }
230
+ ]
231
  ]
232
  """
233
 
 
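The output contract in the exercise prompt above (a bare JSON array of exactly 5 items, each with a `'___'` blank and 4 choices that include the answer) can be checked before a reply is passed on to the client. The following is a minimal sketch, assuming the model's reply is available as a raw JSON string; `validate_exercises` is a hypothetical helper and is not part of this commit.

```python
import json

REQUIRED_KEYS = {"sentence", "answer", "choices", "explanation"}


def validate_exercises(raw: str) -> list[str]:
    """Return a list of problems found in a cloze-exercise reply (empty if none)."""
    try:
        items = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    # The prompt forbids wrapping the array in another object.
    if not isinstance(items, list):
        return ["top-level value must be a JSON array"]

    problems = []
    if len(items) != 5:
        problems.append(f"expected 5 exercises, got {len(items)}")
    for i, item in enumerate(items):
        if not isinstance(item, dict):
            problems.append(f"item {i}: not an object")
            continue
        missing = REQUIRED_KEYS - item.keys()
        if missing:
            problems.append(f"item {i}: missing {sorted(missing)}")
            continue
        sentence, choices = item["sentence"], item["choices"]
        if not isinstance(sentence, str) or "___" not in sentence:
            problems.append(f"item {i}: sentence has no '___' blank")
        # Assumption: the 4 options are meant to be distinct strings.
        if (not isinstance(choices, list) or len(choices) != 4
                or not all(isinstance(c, str) for c in choices)
                or len(set(choices)) != 4):
            problems.append(f"item {i}: choices must be 4 distinct strings")
        elif item["answer"] not in choices:
            problems.append(f"item {i}: answer is not among the choices")
    return problems
```

A check like this cannot judge whether distractors are plausible or whether the answer position is random; it only catches structural violations such as a wrapped or unparseable array.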
253
  ### Story Generation Task
254
  From the latest user message:
255
  1. **Use the provided metadata**:
256
+ - **Native language**: The user’s base language for understanding is {native_language}.
257
+ - **Target language**: The language the user is learning is {target_language}.
258
+ - **Proficiency level**: Adjust the complexity of the story or dialogue based on the user’s proficiency level ({proficiency}).
259
 
260
  2. **Domain relevance**:
261
  - Focus on the **user's domain of interest** (e.g., work, hobby, field of study).
 
274
 
275
  ### Output Format
276
  Return a valid **JSON object** with the following structure:
277
+ - `"title"`: An engaging title in {native_language}.
278
+ - `"setting"`: A short setup in {native_language} explaining the story’s background, tailored to the user’s interest.
279
  - `"content"`: A list of **6–10 segments**, each containing:
280
+ - `"speaker"`: Name or role of the speaker in {native_language} (e.g., "Narrator", "Professor Lee", "The Engineer").
281
+ - `"target_language_text"`: Sentence in {target_language}.
282
  - `"phonetics"`: Standardized phonetic transcription (IPA, Pinyin, etc.) if applicable and helpful. Omit if unavailable or not useful.
283
+ - `"base_language_translation"`: Simple translation of the sentence in {native_language}.
284
  """
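The story prompt's output format (a JSON object with `title`, `setting`, and 6–10 `content` segments, where `phonetics` may be omitted) lends itself to the same kind of structural check. A minimal sketch, assuming the reply is a raw JSON string; `validate_story` is a hypothetical name and does not exist in this repository.

```python
import json

# "phonetics" is deliberately absent: the prompt allows omitting it.
SEGMENT_KEYS = {"speaker", "target_language_text", "base_language_translation"}


def validate_story(raw: str) -> list[str]:
    """Return a list of problems found in a story reply (empty if none)."""
    try:
        story = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(story, dict):
        return ["top-level value must be a JSON object"]

    problems = [f"missing '{k}'" for k in ("title", "setting", "content") if k not in story]
    content = story.get("content", [])
    if not isinstance(content, list):
        return problems + ["'content' must be a list"]
    if "content" in story and not 6 <= len(content) <= 10:
        problems.append(f"expected 6-10 segments, got {len(content)}")
    for i, segment in enumerate(content):
        if not isinstance(segment, dict):
            problems.append(f"segment {i}: not an object")
            continue
        missing = SEGMENT_KEYS - segment.keys()
        if missing:
            problems.append(f"segment {i}: missing {sorted(missing)}")
    return problems
```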
backend/main.py CHANGED
@@ -39,6 +39,9 @@ class Message(BaseModel):
39
  class GenerationRequest(BaseModel):
40
  user_id: int
41
  query: Union[str, List[Message]]
42
 
43
  class MetadataRequest(BaseModel):
44
  query: str
@@ -64,7 +67,7 @@ async def extract_metadata(data: MetadataRequest):
64
  # Update globals for other endpoints
65
  globals()['native_language'] = metadata_dict.get('native_language', 'unknown')
66
  globals()['target_language'] = metadata_dict.get('target_language', 'unknown')
67
- globals()['proficiency'] = metadata_dict.get('proficiency_level', 'unknown')
68
  return JSONResponse(
69
  content={
70
  "data": metadata_dict,
@@ -79,12 +82,15 @@ async def extract_metadata(data: MetadataRequest):
79
  @app.post("/generate/curriculum")
80
  async def generate_curriculum(data: GenerationRequest):
81
  try:
82
- # Use previously extracted metadata
83
  instructions = (
84
  config.curriculum_instructions
85
- .replace("{native_language}", native_language or "unknown")
86
- .replace("{target_language}", target_language or "unknown")
87
- .replace("{proficiency}", proficiency or "unknown")
88
  )
89
  response = await generate_completions.get_completions(
90
  data.query,
@@ -104,12 +110,14 @@ async def generate_curriculum(data: GenerationRequest):
104
  @app.post("/generate/flashcards")
105
  async def generate_flashcards(data: GenerationRequest):
106
  try:
107
- # Use previously extracted metadata
108
  instructions = (
109
  config.flashcard_mode_instructions
110
- .replace("{native_language}", native_language or "unknown")
111
- .replace("{target_language}", target_language or "unknown")
112
- .replace("{proficiency}", proficiency or "unknown")
113
  )
114
  response = await generate_completions.get_completions(
115
  data.query,
@@ -129,12 +137,14 @@ async def generate_flashcards(data: GenerationRequest):
129
  @app.post("/generate/exercises")
130
  async def generate_exercises(data: GenerationRequest):
131
  try:
132
- # Use previously extracted metadata
133
  instructions = (
134
  config.exercise_mode_instructions
135
- .replace("{native_language}", native_language or "unknown")
136
- .replace("{target_language}", target_language or "unknown")
137
- .replace("{proficiency}", proficiency or "unknown")
138
  )
139
  response = await generate_completions.get_completions(
140
  data.query,
@@ -154,12 +164,14 @@ async def generate_exercises(data: GenerationRequest):
154
  @app.post("/generate/simulation")
155
  async def generate_simulation(data: GenerationRequest):
156
  try:
157
- # Use previously extracted metadata
158
  instructions = (
159
  config.simulation_mode_instructions
160
- .replace("{native_language}", native_language or "unknown")
161
- .replace("{target_language}", target_language or "unknown")
162
- .replace("{proficiency}", proficiency or "unknown")
163
  )
164
  response = await generate_completions.get_completions(
165
  data.query,
 
39
  class GenerationRequest(BaseModel):
40
  user_id: int
41
  query: Union[str, List[Message]]
42
+ native_language: Optional[str] = None
43
+ target_language: Optional[str] = None
44
+ proficiency: Optional[str] = None
45
 
46
  class MetadataRequest(BaseModel):
47
  query: str
 
67
  # Update globals for other endpoints
68
  globals()['native_language'] = metadata_dict.get('native_language', 'unknown')
69
  globals()['target_language'] = metadata_dict.get('target_language', 'unknown')
70
+ globals()['proficiency'] = metadata_dict.get('proficiency', 'unknown')
71
  return JSONResponse(
72
  content={
73
  "data": metadata_dict,
 
82
  @app.post("/generate/curriculum")
83
  async def generate_curriculum(data: GenerationRequest):
84
  try:
85
+ # Use metadata from request or fallback to globals
86
+ nl = data.native_language or native_language or "unknown"
87
+ tl = data.target_language or target_language or "unknown"
88
+ prof = data.proficiency or proficiency or "unknown"
89
  instructions = (
90
  config.curriculum_instructions
91
+ .replace("{native_language}", nl)
92
+ .replace("{target_language}", tl)
93
+ .replace("{proficiency}", prof)
94
  )
95
  response = await generate_completions.get_completions(
96
  data.query,
 
110
  @app.post("/generate/flashcards")
111
  async def generate_flashcards(data: GenerationRequest):
112
  try:
113
+ nl = data.native_language or native_language or "unknown"
114
+ tl = data.target_language or target_language or "unknown"
115
+ prof = data.proficiency or proficiency or "unknown"
116
  instructions = (
117
  config.flashcard_mode_instructions
118
+ .replace("{native_language}", nl)
119
+ .replace("{target_language}", tl)
120
+ .replace("{proficiency}", prof)
121
  )
122
  response = await generate_completions.get_completions(
123
  data.query,
 
137
  @app.post("/generate/exercises")
138
  async def generate_exercises(data: GenerationRequest):
139
  try:
140
+ nl = data.native_language or native_language or "unknown"
141
+ tl = data.target_language or target_language or "unknown"
142
+ prof = data.proficiency or proficiency or "unknown"
143
  instructions = (
144
  config.exercise_mode_instructions
145
+ .replace("{native_language}", nl)
146
+ .replace("{target_language}", tl)
147
+ .replace("{proficiency}", prof)
148
  )
149
  response = await generate_completions.get_completions(
150
  data.query,
 
164
  @app.post("/generate/simulation")
165
  async def generate_simulation(data: GenerationRequest):
166
  try:
167
+ nl = data.native_language or native_language or "unknown"
168
+ tl = data.target_language or target_language or "unknown"
169
+ prof = data.proficiency or proficiency or "unknown"
170
  instructions = (
171
  config.simulation_mode_instructions
172
+ .replace("{native_language}", nl)
173
+ .replace("{target_language}", tl)
174
+ .replace("{proficiency}", prof)
175
  )
176
  response = await generate_completions.get_completions(
177
  data.query,
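
The four `/generate/*` handlers above repeat the same resolution order for each field: the value from the request, then the module-level global set by `extract_metadata`, then `"unknown"`. That order, together with the placeholder substitution, could be factored into two small functions. This is a sketch only; `resolve_metadata` and `fill_template` are hypothetical names, and the plain dictionaries stand in for the `GenerationRequest` fields and the globals.

```python
# Placeholder names used by the prompt templates in config.py.
PLACEHOLDERS = ("native_language", "target_language", "proficiency")


def resolve_metadata(request_values: dict, fallback_values: dict) -> dict:
    """Pick each field from the request, then the fallback, then 'unknown'."""
    return {
        key: request_values.get(key) or fallback_values.get(key) or "unknown"
        for key in PLACEHOLDERS
    }


def fill_template(template: str, metadata: dict) -> str:
    """Substitute {placeholders} with str.replace, as the handlers do.

    str.format is avoided because the templates contain literal JSON braces.
    """
    for key, value in metadata.items():
        template = template.replace("{" + key + "}", value)
    return template


meta = resolve_metadata(
    {"native_language": "english", "target_language": None},
    {"target_language": "chinese"},
)
prompt = fill_template("Teach {target_language} to a {proficiency} learner.", meta)
```

One point worth weighing when reviewing the diff: the fallback values are process-wide globals, so a request that omits these fields would pick up whatever the most recent `extract_metadata` call stored, possibly for a different user. Sending the fields on every request, which the new `GenerationRequest` fields allow, avoids relying on that shared state.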
backend/utils/__pycache__/generate_completions.cpython-312.pyc CHANGED
Binary files a/backend/utils/__pycache__/generate_completions.cpython-312.pyc and b/backend/utils/__pycache__/generate_completions.cpython-312.pyc differ