Model Card for tu-ericngo/NuExtract-StructuredIE-v1.2.2

A model fine-tuned for structured information extraction (IE) from biographies of political elites.

Model Details

Model Description

This is an early fine-tuned version of NuExtract 2.0-8B, which is itself a fine-tuned version of Qwen2.5-VL-7B-Instruct for structured information extraction (IE). The target task involves joint named entity recognition (NER) and relation extraction (RE) to identify and extract information about political elites, their educational and professional associations, events and timeframes, and family members. The extracted information is returned as structured JSON. The fine-tuning process is adapted from the NuMind team's procedure (numind/NuExtract-2.0-8B). The fine-tuning data comes from two sources: (1) manual collection and (2) synthetic data generated by GPT-4.
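To make the idea of "structured JSON output" concrete, the sketch below pairs a template with a shape check on an already-parsed output. It is illustrative only: the actual schema for this model has not been published, so the field names in `TEMPLATE`, the `"string"` leaf labels, the `matches_template` helper, and the "Jane Doe" record are all hypothetical and invented for this example.

```python
import json

# HYPOTHETICAL template: field names and "string" labels are illustrative,
# not the author's actual (unpublished) schema.
TEMPLATE = {
    "person": "string",
    "education": [{"institution": "string", "degree": "string",
                   "start": "string", "end": "string"}],
    "career": [{"organization": "string", "position": "string",
                "start": "string", "end": "string"}],
    "family": [{"name": "string", "relation": "string"}],
}


def matches_template(value, template):
    """Return True if `value` has the same nested shape as `template`."""
    if isinstance(template, dict):
        # Same key set, and every field matches its sub-template.
        return (
            isinstance(value, dict)
            and set(value) == set(template)
            and all(matches_template(value[k], template[k]) for k in template)
        )
    if isinstance(template, list):
        # A one-element template list describes every item of the output list.
        return isinstance(value, list) and all(
            matches_template(item, template[0]) for item in value
        )
    # Leaf field: a string, or None when the text does not state the value.
    return value is None or isinstance(value, str)


# Invented record for a fictional person, standing in for model output.
example_output = json.loads("""
{
  "person": "Jane Doe",
  "education": [{"institution": "Example University", "degree": "BA",
                 "start": "1990", "end": "1994"}],
  "career": [{"organization": "Example Ministry", "position": "Deputy Minister",
              "start": "2005", "end": null}],
  "family": []
}
""")

print(matches_template(example_output, TEMPLATE))
```

A check like this only verifies shape (keys and nesting), not whether the extracted values are correct.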

  • Developed by: Tu Eric Ngo
  • Language(s) (NLP): English
  • Finetuned from model [optional]: numind/NuExtract-2.0-8B

Model Sources [optional]

Uses

The model is fine-tuned to perform structured information extraction from biographies of political elites. It follows a template that is specific to the author's research project. The JSON schema and prompt for this fine-tuned task will be published in the future.
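Until the schema is published, downstream code can stay schema-agnostic and assume only that the model emits a single JSON object. The following is a minimal sketch of defensive parsing under that assumption. The helper name `parse_model_json` is hypothetical, and whether this model ever wraps its output in a Markdown code fence is also an assumption; the fence handling is there only as a precaution.

```python
import json

FENCE = "`" * 3  # Markdown code-fence marker


def parse_model_json(raw):
    """Parse raw generated text into a dict; return None if that fails."""
    text = raw.strip()
    # Precaution: strip a surrounding Markdown fence, if one is present.
    if (len(text) >= 2 * len(FENCE)
            and text.startswith(FENCE) and text.endswith(FENCE)):
        body = text[len(FENCE):-len(FENCE)]
        first_newline = body.find("\n")
        # Drop the opening fence line, which may carry a language tag.
        text = body[first_newline + 1:] if first_newline != -1 else body
        text = text.strip()
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return None
    # Only a top-level JSON object counts as a valid extraction result.
    return parsed if isinstance(parsed, dict) else None


print(parse_model_json('{"person": "Jane Doe"}'))
```

Returning `None` instead of raising lets a batch pipeline log and skip malformed generations without stopping.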

Out-of-Scope Use

While the fine-tuned model may be able to perform similar structured IE tasks (especially simpler tasks with simpler JSON schemas), it was trained with only one specific task in mind. The author intends to expand the range of structured IE tasks the model supports in the future.

Training Details

Training Data

[More Information Needed]

Training Procedure

Preprocessing [optional]

[More Information Needed]

Training Hyperparameters

  • Training regime: [More Information Needed]

Speeds, Sizes, Times [optional]

[More Information Needed]

Evaluation

Testing Data, Factors & Metrics

Testing Data

[More Information Needed]

Factors

[More Information Needed]

Metrics

[More Information Needed]

Results

[More Information Needed]

Summary

Model Examination [optional]

[More Information Needed]

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

  • Hardware Type: [More Information Needed]
  • Hours used: [More Information Needed]
  • Cloud Provider: [More Information Needed]
  • Compute Region: [More Information Needed]
  • Carbon Emitted: [More Information Needed]

Technical Specifications [optional]

Model Architecture and Objective

[More Information Needed]

Compute Infrastructure

[More Information Needed]

Hardware

[More Information Needed]

Software

[More Information Needed]

Citation [optional]

BibTeX:

[More Information Needed]

APA:

[More Information Needed]

Glossary [optional]

[More Information Needed]

More Information [optional]

[More Information Needed]

Model Card Authors [optional]

[More Information Needed]

Model Card Contact

[More Information Needed]

Safetensors · Model size: 8B params · Tensor type: BF16

Model tree for tu-ericngo/NuExtract-StructuredIE-v1.2.2

Quantizations: 1 model
