zhenggq committed
Commit 7552945
1 Parent(s): d800ca0

Update README.md

Files changed (1)
  1. README.md +110 -14
README.md CHANGED
@@ -6,21 +6,35 @@ pipeline_tag: text-generation
 
  <!-- Provide a quick summary of what the model is/does. -->
 
- In Orca 2, we continue exploring how improved training signals can give smaller LMs enhanced reasoning abilities, typically
- found only in much larger models. We seek to teach small LMs to employ different solution
- strategies for different tasks, potentially different from the one used by the
- larger model. For example, while larger models might provide a direct answer
- to a complex task, smaller models may not have the same capacity. In Orca
- 2, we teach the model various reasoning techniques (step-by-step, recall
- then generate, recall-reason-generate, direct answer, etc.). More crucially,
- we aim to help the model learn to determine the most effective solution
- strategy for each task. Orca 2 models were trained by continual training of LLaMA-2 base models of the same size.
+ Orca 2 is a helpful assistant built for research purposes only. It provides single-turn responses
+ in tasks such as reasoning over user-given data, reading comprehension, math problem solving, and text summarization.
+ The model is designed to excel particularly at reasoning.
+ 
+ We open-source Orca 2 to encourage further research on the development, evaluation, and alignment of smaller LMs.
+ 
+ ## What are Orca 2's intended uses?
+ 
+ + Orca 2 is built for research purposes only.
+ + The main purpose is to allow the research community to assess its abilities and to provide a foundation for building better frontier models.
+ 
+ ## How was Orca 2 evaluated?
+ 
+ + Orca 2 has been evaluated on a large number of tasks ranging from reasoning to safety. Please refer to Sections 6 through 11 of the paper for details about the different evaluation experiments.
 
  ## Model Details
 
  Refer to LLaMA-2 for details on model architectures.
 
+ Orca 2 is a finetuned version of LLaMA-2. Its training data is a synthetic dataset that was created to enhance the small model's reasoning abilities. All synthetic training data was filtered using the Azure content filters.
+ More details about the model can be found at: LINK to Tech Report
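+ 
+ As a quick check, the LLaMA-2 architecture noted above can be confirmed from the model's Hugging Face configuration (a minimal sketch using `transformers`; the printed values are illustrative rather than quoted from the tech report):
+ 
+ ```python
+ from transformers import AutoConfig
+ 
+ # Orca 2 inherits its architecture from the LLaMA-2 base model it was finetuned from,
+ # so the config reports the "llama" model type and the usual LLaMA-2 dimensions.
+ config = AutoConfig.from_pretrained("microsoft/Orca-2-7b")
+ print(config.model_type)         # architecture family, e.g. "llama"
+ print(config.num_hidden_layers)  # number of transformer layers
+ print(config.hidden_size)        # hidden state dimension
+ ```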
+ 
+ ## License
+ 
+ The model is licensed under the Microsoft Research License.
+ 
+ Llama 2 is licensed under the LLAMA 2 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.
+ 
 
  ## Uses
 
@@ -82,8 +96,90 @@ This model is solely designed for research settings, and its testing has only be
  out in such environments. It should not be used in downstream applications, as additional
  analysis is needed to assess potential harm or bias in the proposed application.
 
- ## How to Get Started with the Model
- 
- Use the code below to get started with the model.
- 
- [More Information Needed]
+ ## Getting started with Orca 2
+ 
+ **Safe inference with Azure AI Content Safety**
+ 
+ Using Azure AI Content Safety on top of model predictions is strongly encouraged
+ and can help prevent content harms. Azure AI Content Safety is a content moderation platform
+ that uses AI to keep your content safe. By integrating Orca 2 with Azure AI Content Safety,
+ we can moderate the model output by scanning it for sexual content, violence, hate, and
+ self-harm, with multiple severity levels and multi-lingual detection.
+ 
+ ```python
+ import os
+ import math
+ import transformers
+ import torch
+ 
+ from azure.ai.contentsafety import ContentSafetyClient
+ from azure.core.credentials import AzureKeyCredential
+ from azure.core.exceptions import HttpResponseError
+ from azure.ai.contentsafety.models import AnalyzeTextOptions
+ 
+ CONTENT_SAFETY_KEY = os.environ["CONTENT_SAFETY_KEY"]
+ CONTENT_SAFETY_ENDPOINT = os.environ["CONTENT_SAFETY_ENDPOINT"]
+ 
+ # We use Azure AI Content Safety to filter out any content that reaches the "Medium" threshold.
+ # For more information: https://learn.microsoft.com/en-us/azure/ai-services/content-safety/
+ def should_filter_out(input_text, threshold=4):
+     # Create a Content Safety client
+     client = ContentSafetyClient(CONTENT_SAFETY_ENDPOINT, AzureKeyCredential(CONTENT_SAFETY_KEY))
+ 
+     # Construct a request
+     request = AnalyzeTextOptions(text=input_text)
+ 
+     # Analyze text
+     try:
+         response = client.analyze_text(request)
+     except HttpResponseError as e:
+         print("Analyze text failed.")
+         if e.error:
+             print(f"Error code: {e.error.code}")
+             print(f"Error message: {e.error.message}")
+             raise
+         print(e)
+         raise
+ 
+     # Keep the highest severity score across the four harm categories
+     categories = ["hate_result", "self_harm_result", "sexual_result", "violence_result"]
+     max_score = -math.inf
+     for category in categories:
+         max_score = max(max_score, getattr(response, category).severity)
+ 
+     return max_score >= threshold
+ 
+ def run_inference(model_path, inputs):
+     device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
+     model = transformers.AutoModelForCausalLM.from_pretrained(model_path)
+     model.to(device)
+ 
+     tokenizer = transformers.AutoTokenizer.from_pretrained(
+         model_path,
+         model_max_length=4096,
+         padding_side="right",
+         use_fast=False,
+         add_special_tokens=False,
+     )
+     inputs = tokenizer(inputs, return_tensors='pt')
+     inputs = inputs.to(device)
+ 
+     # Greedy decoding (do_sample=False); then strip the prompt tokens so only
+     # the newly generated answer is decoded
+     output_ids = model.generate(inputs["input_ids"], max_length=4096, do_sample=False, use_cache=True)
+     sequence_length = inputs["input_ids"].shape[1]
+     new_output_ids = output_ids[:, sequence_length:]
+     answers = tokenizer.batch_decode(new_output_ids, skip_special_tokens=True)
+ 
+     return answers
+ 
+ model_path = 'microsoft/Orca-2-7b'
+ 
+ system_message = "You are Orca, an AI language model created by Microsoft. You are a cautious assistant. You carefully follow instructions. You are helpful and harmless and you follow ethical guidelines and promote positive behavior."
+ user_message = "\" \n :You can't just say, \"\"that's crap\"\" and remove it without gaining a consensus. You already know this, based on your block history. —/ \" \nIs the comment obscene? \nOptions : Yes, No."
+ 
+ # We use Chat Markup Language https://github.com/MicrosoftDocs/azure-docs/blob/main/articles/ai-services/openai/includes/chat-markup-language.md#working-with-chat-markup-language-chatml
+ prompt = f"<|im_start|>system\n{system_message}<|im_end|>\n<|im_start|>user\n{user_message}<|im_end|>\n<|im_start|>assistant"
+ 
+ answers = run_inference(model_path, prompt)
+ final_output = answers[0] if not should_filter_out(answers[0]) else "[Content Filtered]"
+ 
+ print(final_output)
+ ```
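+ 
+ Note that `threshold=4` in `should_filter_out` corresponds to the "Medium" severity level on Azure AI Content Safety's severity scale, matching the comment in the code; the same check can also be applied to user input before running inference.
+ 
+ On memory-constrained GPUs, the model-loading step above can be adapted to half precision (a sketch, assuming a CUDA device and the `accelerate` package are available; not part of the original snippet):
+ 
+ ```python
+ model = transformers.AutoModelForCausalLM.from_pretrained(
+     "microsoft/Orca-2-7b",
+     torch_dtype=torch.float16,  # halves memory for the 7B weights
+     device_map="auto",          # dispatches layers across available devices (needs `accelerate`)
+ )
+ ```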