liamcripwell committed
Commit 8521c8e
1 Parent(s): b5f53bc

Update README.md

Files changed (1):
  1. README.md +2 -2
README.md CHANGED
@@ -15,7 +15,7 @@ base_model: Qwen/Qwen2.5-0.5B
 
 # NuExtract-tiny-v1.5 by NuMind 🔥
 
-NuExtract-v1.5 is a fine-tuning of [Qwen/Qwen2.5-0.5B](https://huggingface.co/Qwen/Qwen2.5-0.5B), trained on a private high-quality dataset for structured information extraction. It supports long documents and several languages (English, French, Spanish, German, Portuguese, and Italian).
+NuExtract-tiny-v1.5 is a fine-tuning of [Qwen/Qwen2.5-0.5B](https://huggingface.co/Qwen/Qwen2.5-0.5B), trained on a private high-quality dataset for structured information extraction. It supports long documents and several languages (English, French, Spanish, German, Portuguese, and Italian).
 To use the model, provide an input text and a JSON template describing the information you need to extract.
 
 Note: This model is trained to prioritize pure extraction, so in most cases all text generated by the model is present as is in the original text.
@@ -58,7 +58,7 @@ def predict_NuExtract(model, tokenizer, texts, template, batch_size=1, max_lengt
 
     return [output.split("<|output|>")[1] for output in outputs]
 
-model_name = "numind/NuExtract-v1.5"
+model_name = "numind/NuExtract-tiny-v1.5"
 device = "cuda"
 model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, trust_remote_code=True).to(device).eval()
 tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
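For context, the template-based usage described in the README can be sketched as plain string handling, independent of the model itself. The helper names and the exact prompt layout below are assumptions (only the `<|output|>` marker and the JSON-template idea are confirmed by the diff); consult the model card for the authoritative format:

```python
import json

def build_prompt(template: dict, text: str) -> str:
    # Assumed prompt layout: a JSON template followed by the source text,
    # terminated by the <|output|> marker after which the model generates.
    return (
        "<|input|>\n### Template:\n"
        + json.dumps(template, indent=4)
        + "\n### Text:\n"
        + text
        + "\n<|output|>"
    )

def parse_output(generated: str) -> dict:
    # Mirrors the diff's `output.split("<|output|>")[1]` post-processing:
    # everything after the marker is the extracted JSON.
    return json.loads(generated.split("<|output|>")[1])

# Hypothetical template: empty strings mark the fields to extract.
template = {"Model": {"Name": "", "Number of parameters": ""}}
prompt = build_prompt(template, "Qwen2.5-0.5B is a 0.5B-parameter language model.")
```

The decoded model output would then be passed to `parse_output` to recover the filled-in template as a Python dict.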