Commit 58c280a by cmykk · verified · 1 Parent(s): 0a24faa

Update README.md

Files changed (1):
  1. README.md +38 -3
@@ -1,3 +1,38 @@
- ---
- license: llama2
- ---
+ ---
+ license: llama2
+ language:
+ - hu
+ ---
+
+ Base Model:
+ https://huggingface.co/meta-llama/Llama-2-7b-chat-hf
+
+ ---
+
+ Llama-2-7b-chat fine-tuned on a dataset of real news articles for neural news generation.
+
+ Note: Hungarian was not in the base model's pretraining data.
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
+
+ # Load the tokenizer from the base model and the fine-tuned weights.
+ # Text generation needs a causal LM head, not a sequence-classification head.
+ tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
+ model = AutoModelForCausalLM.from_pretrained("tum-nlp/neural-news-llama-2-7b-chat-hu")
+
+ # Create the pipeline for neural news generation; a repetition penalty above 1.1 discourages repeated text.
+ generator = pipeline(
+     "text-generation",
+     model=model,
+     tokenizer=tokenizer,
+     repetition_penalty=1.2,
+ )
+
+ # Define the prompt: "Cím:" (title), then "Cikk:" (article) with the opening text, ending in [EOP]
+ prompt = "Cím: Ellenzéki politikai akció az ügyészséggel szemben Cikk: Az ügyészség visszautasítja az igazságszolgáltatást ért politikai nyomásgyakorlást – tájékoztatott [EOP]"
+
+ # Generate one continuation; max_length counts prompt tokens plus generated tokens
+ generator(prompt, max_length=1000, num_return_sequences=1)
+ ```
+
+ Trained on 6k data points (across all splits) from:
+ https://github.com/batubayk/news_datasets?tab=readme-ov-file
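
The example prompt in the added README follows a visible pattern: `Cím:` (title), then `Cikk:` (article) with the opening text, then an `[EOP]` marker. The sketch below is not part of the commit; it only shows one way such prompts could be assembled. The helper name `build_prompt` and its parameters are hypothetical, and it assumes that `[EOP]` simply closes the prompt, which the README does not state explicitly.

```python
def build_prompt(title: str, lead: str, end_marker: str = "[EOP]") -> str:
    """Format a headline and opening text in the pattern of the README's example prompt.

    Hypothetical helper: the template is inferred from the single example in the README.
    """
    # Collapse runs of whitespace so the prompt stays on a single line.
    title = " ".join(title.split())
    lead = " ".join(lead.split())
    return f"Cím: {title} Cikk: {lead} {end_marker}"


# Rebuild the README's example prompt from its two parts.
prompt = build_prompt(
    "Ellenzéki politikai akció az ügyészséggel szemben",
    "Az ügyészség visszautasítja az igazságszolgáltatást ért politikai nyomásgyakorlást – tájékoztatott",
)
print(prompt)
```

The resulting string can be passed to the `generator` pipeline from the README in place of the hard-coded `prompt`.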