hunterhector
committed on
Update README.md
README.md
CHANGED
@@ -4,6 +4,11 @@ language:
 - en
 pipeline_tag: text-generation
 library_name: transformers
+widget:
+  - example_title: "Write python function and explain"
+    text: "<s> <|sys_start|> You are an AI assistant. You will be given a task. You must generate a detailed and long answer. <|sys_end|> <|im_start|> Write a python function that takes a list of integers and returns the squared sum of the list. <|im_end|>"
+    output:
+      text: "Response:\n````\nHere's a Python function named `squared_sum_list` that takes a list of integers as input and returns the squared sum of the list:\n\n```python\ndef squared_sum_list(lst):\n\treturn sum([num ** 2 for num in lst])\n```\nThe function `squared_sum_list` uses a list comprehension to iterate over each number in the input list `lst` and calculate its square. Then, it uses the `sum` function to accumulate all the squared numbers in a single value - the squared sum of the list.\n\nFor example:\n\n```python\nnumbers = [1, 2, 3, 4, 5]\nprint(squared_sum_list(numbers)) # Outputs: 55\n```\n\nIn the above code, the list `[1, 2, 3, 4, 5]` is passed as an argument to the `squared_sum_list` function. The function calculates the sum of the squares of the elements in the list, which is `1 + 4 + 9 + 16 + 25 = 55`. The function then returns this result, which is printed to the console.</s>"
 tags:
 - llm
 - code
@@ -36,7 +41,8 @@ We also performed the same finetuning on the last **CrystalCoder** checkpoint of
 
 # Instruction Tuning Data
 
-The fine-tuning data is a mix of publicly available language and code datasets, plus an originally created dataset called **WebAlpaca
+The fine-tuning data is a mix of publicly available language and code datasets, plus an originally created dataset called **WebAlpaca** on HTML coding instructions.
+The WebAlpaca dataset was created by us and is used as part of our instruction-tuning data. We will release it in a separate repository soon.
 
 The summary of the fine-tuning data is as follows:
 
@@ -57,8 +63,6 @@ The summary of the fine-tuning data is as follows:
 | [Programming Books](https://huggingface.co/datasets/open-phi/programming_books_llama) | 395,628,544 | Not instruction data
 | Total | 1,102,516,224
 
-The HTML Instruction dataset was curated by LLM360 and will be made available shortly.
-
 For more details, check out the [data table](https://huggingface.co/LLM360/CrystalChat/blob/main/data_table.jpg).
 
 # Instruction Format
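
The widget example added in this commit suggests a prompt layout: a system message wrapped in `<|sys_start|>` and `<|sys_end|>`, followed by the user instruction wrapped in `<|im_start|>` and `<|im_end|>`, all after a leading `<s>`. The sketch below assembles a prompt in that shape. It assumes the widget text reflects the layout the model expects, including the single spaces around each marker; `build_prompt` and `DEFAULT_SYSTEM` are hypothetical names used only for illustration and do not come from the README.

```python
# Minimal sketch: rebuild the prompt string shown in the widget example.
# Assumption: the spacing and special tokens are copied from the widget text
# and may not match what the tokenizer or chat template expects exactly.

# System message copied verbatim from the widget example.
DEFAULT_SYSTEM = (
    "You are an AI assistant. You will be given a task. "
    "You must generate a detailed and long answer."
)


def build_prompt(instruction: str, system: str = DEFAULT_SYSTEM) -> str:
    """Wrap a system message and a user instruction in the widget's markers."""
    return (
        f"<s> <|sys_start|> {system} <|sys_end|> "
        f"<|im_start|> {instruction} <|im_end|>"
    )


if __name__ == "__main__":
    print(
        build_prompt(
            "Write a python function that takes a list of integers "
            "and returns the squared sum of the list."
        )
    )
```

Keeping the system message as a parameter with a default lets the same helper reproduce the widget prompt or swap in a different system message without touching the marker layout.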