---
license: apache-2.0
datasets:
- BAAI/COIG-PC
language:
- zh
library_name: transformers
pipeline_tag: text-generation
---

# Model Card for AntX-13B

<!-- Provide a quick summary of what the model is/does. -->

This is an experimental model that can be used to create new LLMs based on the Chinese language.

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

- **Developed by:** yjf9966
- **Model type:** LLaMA with an extended tokenizer (vocabulary size 49,954; see the quick check below)
- **Language(s) (NLP):** Chinese/English
- **License:** Apache-2.0
- **Finetuned from model:** [Chinese-LLaMA-Alpaca](https://github.com/ymcui/Chinese-LLaMA-Alpaca)

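As a quick sanity check, the extended vocabulary can be verified by loading the tokenizer and printing its length. This is a minimal sketch; the expected value comes from the model description above and the check assumes the checkpoint is reachable on the Hub:

```python
# Minimal sketch: confirm the extended tokenizer size.
from transformers import LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("AntX-ai/AntX-13B")
print(len(tokenizer))  # expected: 49954
```
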
### Model Sources

<!-- Provide the basic links for the model. -->

- **Repository:** https://huggingface.co/AntX-ai/AntX-13B

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

You can use the raw model for Chinese text generation, but it is mostly intended to be fine-tuned on a downstream task.
Note that this is a causal language model, primarily aimed at instruction-following generation tasks such as question answering, rather than masked-token prediction.

## Bias, Risks, and Limitations

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

Even if the training data used for this model could be characterized as fairly neutral, the model can still produce biased predictions.
It also inherits some of the bias of its base model and training dataset.

### Recommendations

<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.

## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import LlamaForCausalLM, LlamaTokenizer
import torch

base_model_name = "AntX-ai/AntX-13B"
load_type = torch.float16
device = None

generation_config = dict(
    temperature=0.2,
    top_k=40,
    top_p=0.9,
    do_sample=True,
    num_beams=1,
    repetition_penalty=1.3,
    max_new_tokens=400
)

# Alpaca-style instruction template used to prompt the model.
prompt_input = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n\n{instruction}\n\n### Response:\n\n"
)

if torch.cuda.is_available():
    device = torch.device(0)
else:
    device = torch.device('cpu')

def generate_prompt(instruction, input=None):
    if input:
        instruction = instruction + '\n' + input
    return prompt_input.format_map({'instruction': instruction})

tokenizer = LlamaTokenizer.from_pretrained(base_model_name)
model = LlamaForCausalLM.from_pretrained(
    base_model_name,
    load_in_8bit=False,
    torch_dtype=load_type,
    low_cpu_mem_usage=True,
    device_map='auto',
)

# Resize the embeddings if the extended tokenizer vocabulary is larger
# than the model's embedding matrix.
model_vocab_size = model.get_input_embeddings().weight.size(0)
tokenizer_vocab_size = len(tokenizer)
if model_vocab_size != tokenizer_vocab_size:
    model.resize_token_embeddings(tokenizer_vocab_size)

raw_input_text = input("Input:")
input_text = generate_prompt(instruction=raw_input_text)
inputs = tokenizer(input_text, return_tensors="pt")
generation_output = model.generate(
    input_ids=inputs["input_ids"].to(device),
    attention_mask=inputs['attention_mask'].to(device),
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.pad_token_id,
    **generation_config
)
s = generation_output[0]
output = tokenizer.decode(s, skip_special_tokens=True)
# Everything after the "### Response:" marker is the model's answer.
response = output.split("### Response:")[1].strip()
print("Response: ", response)
print("\n")
```
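
Note that `device_map='auto'` relies on the `accelerate` library to place the model weights across the available devices, while the input tensors are moved explicitly to the device chosen above; on a CPU-only machine the `float16` weights may be slow or unsupported for some operations.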

## Training Details

### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

#### Preprocessing

The dataset was split into 80% for training and 20% for testing.
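
The exact preprocessing code is not published. A minimal sketch of an equivalent 80/20 split with the `datasets` library (the `split` name, the `seed`, and any per-subtask filtering are assumptions) could look like:

```python
# Hypothetical 80/20 split; the actual preprocessing pipeline and
# random seed used for AntX-13B are not published.
from datasets import load_dataset

dataset = load_dataset("BAAI/COIG-PC", split="train")  # assumed split name
splits = dataset.train_test_split(test_size=0.2, seed=42)  # assumed seed
train_ds, test_ds = splits["train"], splits["test"]
```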

#### Training Hyperparameters

- **Training regime:** fp16 mixed precision, lr=1e-4, lora_rank=8, lora_alpha=32 (see the configuration sketch below)
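
The `lora_rank` and `lora_alpha` values indicate LoRA fine-tuning. Below is a minimal configuration sketch with the `peft` library; the `target_modules` and `lora_dropout` values are assumptions, not published settings:

```python
# Hypothetical LoRA setup matching the reported hyperparameters;
# target_modules and lora_dropout are assumptions.
from peft import LoraConfig, get_peft_model
from transformers import LlamaForCausalLM

base = LlamaForCausalLM.from_pretrained("AntX-ai/AntX-13B")
lora_config = LoraConfig(
    r=8,                                  # lora_rank from the model card
    lora_alpha=32,                        # lora_alpha from the model card
    target_modules=["q_proj", "v_proj"],  # assumption
    lora_dropout=0.05,                    # assumption
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
```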

## Evaluation

#### Testing Data

<!-- This should link to a Data Card if possible. -->
20% of the BAAI/COIG-PC dataset.

```
Input:王国维说：“自周之衰,文王、周公势力之瓦解也,国民之智力成熟于内,政治之纷乱乘之于外,上无统一之制度,下迫于社会之要求,于是诸于九流各创其学说。” 他意在说明 A. 分封制的崩溃 B. 商鞅变法的作用 C. 兼并战争的后果 D. 百家争鸣的原因
Response: 本题考查对材料的理解。A错误;B正确;C和D与材料无关。故选BC两项即可


Input:经济基础是指一定社会发展阶段占统治地位的生产关系各个方面的总和。解决了地方经济问题, 也就解决了地方割据问题, 为此, 宋太祖采取的措施是( ) A . 地方设转运使, 财赋收归中央 B . 派文臣担任各地州县的长官 C . 派文臣管理地方政事 D . 采取分化事权办法, 削弱相权
Response: A: 本题考查对宋太祖治下地方问题的认识。 A : 依据材料可知, 在北宋时期 , 由于地主阶级的发展壮大以及商业、手工业等新兴行业出现 , 这一时期出现的地方割据现象严重威胁着国家的统一与稳定 . 因此 , 为了解决这个问题, 需要加强中央集权 ; 故选 A


Input:go\\n/**\\n * Definition for a binary tree node.\\n * type TreeNode struct {\\n * Val int\\n * Left *TreeNode\\n * Right *TreeNode\\n * }\\n */\\nfunc maxDepth(root *TreeNode) int {\\n\\tif root == nil {\\n\\t\\treturn 0\\n\\t}\\n\\tl, r := maxDepth(root.Left), maxDepth(root.Right)\\n\\treturn 1 + max(l, r)\\n}\\n\\nfunc max(a, b int) int {\\n\\tif a > b {\\n\\t\\treturn a\\n\\t}\\n\\treturn b\\n}\\n
Response: ```python
def max_depth(root): # The function takes in a single parameter `root` and returns its maximum depth value as output.
    if not root or len(root.children()) == 0:
        return 0
    l = max_depth(root.left)
    r = max_depth(root.right)
    return 1 + (max(l, r))

class TreeNode:
    def __init__(self, val=None, left=10, right=0):
        self.val = val
        self.left = None
        self.right = None

```

## Citation

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
```bibtex
@software{AntX-13B,
  title={An Enhanced Chinese Language Model based on Chinese-LLaMA-Alpaca},
  url={https://huggingface.co/AntX-ai/AntX-13B},
  year={2023}
}
```