DeepMount00 committed
Commit 7a360e1 • 1 Parent(s): a174bbb

Update README.md

  1. README.md +57 -112
README.md CHANGED
@@ -1,116 +1,61 @@
 
- <head>
- <style>
- body {
- font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
- line-height: 1.6;
- max-width: 800px;
- margin: 0 auto;
- padding: 20px;
- background-color: #f5f5f5;
- }
- .container {
- background-color: white;
- padding: 30px;
- border-radius: 10px;
- box-shadow: 0 2px 4px rgba(0,0,0,0.1);
- }
- h1 {
- color: #2c3e50;
- border-bottom: 2px solid #3498db;
- padding-bottom: 10px;
- margin-bottom: 30px;
- }
- h2 {
- color: #2980b9;
- margin-top: 25px;
- }
- .features-list {
- background-color: #f8f9fa;
- padding: 20px;
- border-radius: 5px;
- border-left: 4px solid #3498db;
- }
- .citation {
- background-color: #f8f9fa;
- padding: 15px;
- border-radius: 5px;
- font-family: monospace;
- white-space: pre-wrap;
- }
- .performance {
- margin: 20px 0;
- }
- .limitations {
- background-color: #fff3f3;
- padding: 20px;
- border-radius: 5px;
- border-left: 4px solid #e74c3c;
- }
- .requirements {
- background-color: #f0f9ff;
- padding: 20px;
- border-radius: 5px;
- border-left: 4px solid #2ecc71;
- }
- </style>
- </head>
- <body>
- <div class="container">
- <h1>🤖 Alireo-400M Model Card</h1>
-
- <h2>📝 Model Description</h2>
- <p>Alireo-400M is a lightweight yet powerful Italian language model with 400M parameters, designed to provide efficient natural language processing capabilities while maintaining a smaller footprint compared to larger models.</p>
-
- <h2>✨ Key Features</h2>
- <div class="features-list">
- <ul>
- <li>🏗️ <strong>Architecture:</strong> Transformer-based language model</li>
- <li>📊 <strong>Parameters:</strong> 400M</li>
- <li>🪟 <strong>Context Window:</strong> 8K tokens</li>
- <li>📚 <strong>Training Data:</strong> Curated Italian text corpus (books, articles, web content)</li>
- <li>💾 <strong>Model Size:</strong> ~800MB</li>
- </ul>
- </div>
-
- <h2>📈 Performance</h2>
- <div class="performance">
- <p>Despite its compact size, Alireo-400M demonstrates impressive performance:</p>
- <ul>
- <li>🏆 Outperforms Qwen 0.5B across multiple benchmarks</li>
- <li>🎯 Maintains high accuracy in Italian language understanding tasks</li>
- <li>⚡ Efficient inference speed due to optimized architecture</li>
- </ul>
- </div>
-
- <h2>⚠️ Limitations</h2>
- <div class="limitations">
- <ul>
- <li>Limited context window compared to larger models</li>
- <li>May struggle with highly specialized technical content</li>
- <li>Performance may vary on dialectal variations</li>
- <li>Not suitable for multilingual tasks</li>
- </ul>
- </div>
-
- <h2>💻 Hardware Requirements</h2>
- <div class="requirements">
- <ul>
- <li>🎮 <strong>Minimum RAM:</strong> 2GB</li>
- <li>💪 <strong>Recommended RAM:</strong> 4GB</li>
- <li>🎨 <strong>GPU:</strong> Optional, but recommended for faster inference</li>
- <li>💿 <strong>Disk Space:</strong> ~1GB (including model and dependencies)</li>
- </ul>
- </div>
-
- <h2>📜 License</h2>
- <p>Apache 2.0</p>
-
- <h2>📄 Citation</h2>
- <div class="citation">@software{alireo2024,
  author = {[Michele Montebovi]},
  title = {Alireo-400M: A Lightweight Italian Language Model},
  year = {2024},
- }</div>
- </div>
- </body>
 
+ # Alireo-400M Model Card 📚

+ ## Model Description
+ Alireo-400M is a lightweight yet powerful Italian language model with 400M parameters, designed to provide efficient natural language processing capabilities while maintaining a smaller footprint compared to larger models.
+
+ ## Key Features ✨
+ * **Architecture**: Transformer-based language model 🏗️
+ * **Parameters**: 400M 📊
+ * **Context Window**: 8K tokens 🪟
+ * **Training Data**: Curated Italian text corpus (books, articles, web content) 📚
+ * **Model Size**: ~800MB 💾
+
+ ## Performance 📈
+ Despite its compact size, Alireo-400M demonstrates impressive performance:
+
+ * **Benchmark Results**: Outperforms Qwen 0.5B across multiple benchmarks 🏆
+ * **Language Understanding**: Maintains high accuracy in Italian language understanding tasks 🎯
+ * **Speed**: Efficient inference speed due to optimized architecture ⚡
+
+ ## Limitations ⚠️
+ * Limited context window compared to larger models
+ * May struggle with highly specialized technical content
+ * Performance may vary on dialectal variations
+ * Not suitable for multilingual tasks
+
+ ## Hardware Requirements 💻
+ * **Minimum RAM**: 2GB
+ * **Recommended RAM**: 4GB
+ * **GPU**: Optional, but recommended for faster inference
+ * **Disk Space**: ~1GB (including model and dependencies)
+
+ ## Usage Example
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # Load model and tokenizer
+ model = AutoModelForCausalLM.from_pretrained("montebovi/alireo-400m")
+ tokenizer = AutoTokenizer.from_pretrained("montebovi/alireo-400m")
+
+ # Example text
+ text = "L'intelligenza artificiale sta"
+
+ # Tokenize and generate
+ inputs = tokenizer(text, return_tensors="pt")
+ outputs = model.generate(**inputs, max_new_tokens=50)
+ result = tokenizer.decode(outputs[0], skip_special_tokens=True)
+ print(result)
+ ```
+
+ ## License 📜
+ Apache 2.0
+
+ ## Citation 📄
+ ```bibtex
+ @software{alireo2024,
  author = {[Michele Montebovi]},
  title = {Alireo-400M: A Lightweight Italian Language Model},
  year = {2024},
+ }
+ ```
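
Review note: the README lists 400M parameters and a model size of ~800MB. A rough way to see how those two figures relate is to multiply the parameter count by the bytes used per weight. The sketch below does that for a few common precisions. The helper name `estimate_weight_size_mb` is hypothetical, and the idea that the checkpoint stores 16-bit weights is an assumption, since the README does not state the storage precision.

```python
def estimate_weight_size_mb(num_params: int, bytes_per_param: int) -> float:
    """Approximate size of the raw weights in megabytes (1 MB = 10**6 bytes).

    Ignores tokenizer files, config, and any runtime overhead.
    """
    return num_params * bytes_per_param / 1e6


# "400M" as stated in the model card.
NUM_PARAMS = 400_000_000

# Bytes per weight for a few common precisions (assumed, not from the README).
PRECISIONS = [("float32", 4), ("float16/bfloat16", 2), ("int8", 1)]

for label, width in PRECISIONS:
    size_mb = estimate_weight_size_mb(NUM_PARAMS, width)
    print(f"{label}: ~{size_mb:.0f} MB")
```

Under these assumptions, the 2-bytes-per-weight case works out to about 800 MB, which is consistent with the "~800MB" figure if the weights are stored in a 16-bit format. Actual memory use during inference will be higher than the raw weight size.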