SetFit with BAAI/bge-small-en-v1.5

This is a SetFit model for text classification. It uses BAAI/bge-small-en-v1.5 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model was trained with an efficient few-shot learning technique that involves two steps:

1. Fine-tuning a Sentence Transformer with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
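
The contrastive step works by turning a handful of labeled texts into many sentence pairs. The following is a minimal sketch of that pair construction, not SetFit's actual implementation: the helper name `make_contrastive_pairs` and the toy texts are hypothetical. Same-label pairs get a target similarity of 1.0 and different-label pairs get 0.0, which is the kind of target a cosine-similarity loss expects.

```python
from itertools import combinations
from typing import List, Tuple

Pair = Tuple[str, str, float]


def make_contrastive_pairs(texts: List[str], labels: List[int]) -> List[Pair]:
    """Build (text_a, text_b, target) triples from labeled examples.

    Hypothetical helper for illustration only: the target is 1.0 when both
    texts share a label and 0.0 otherwise.
    """
    if len(texts) != len(labels):
        raise ValueError("texts and labels must have the same length")
    pairs: List[Pair] = []
    # Every unordered pair of distinct examples becomes one training pair.
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Toy data (made up): two classes with two examples each.
toy_texts = ["purchase order A", "purchase order B", "tax invoice A", "tax invoice B"]
toy_labels = [1, 1, 0, 0]
toy_pairs = make_contrastive_pairs(toy_texts, toy_labels)
```

Four labeled examples already yield six pairs, which is why the approach remains workable with only a few samples per class.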

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: BAAI/bge-small-en-v1.5
Classification head: a LogisticRegression instance
Maximum Sequence Length: 512 tokens
Number of Classes: 4

Model Sources

Repository: SetFit on GitHub
Paper: Efficient Few-Shot Learning Without Prompts
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts

Model Labels

Label	Examples
1	"b'nasfund National Superannuation fund Ltd\n\n \n\n \n\nP.O. Box 5791 Boroko PNG\nTelephone: (675) 313 1813 PURCHASE ORDER\nEmail:\nSupplier Details:- Order No: PF007347\nWaterfront Foodworld Requested by: Spencer Kaba\nP.O Box 889 Contact No:\nKonedebu NCD\nPapua New Guinea Date Issued: 29-Feb-2024\nSupplier No:\nDelivery Date: 29-Feb-2024\nPage: 1 of 1\nAttention: Leoba\nDeliver To: Invoice To:\nNational Superannuation fund Ltd\nBSP Haus Poreporena Freeway\nLevel 4\nBoroko\nDescription Qty. Unit oa a ee au\n1 Tea & Office Supplies for NASFUND HQ 1 ALL 3,345.70 3,345.70\nLevels 3 & 4. Quote No.WFQ.No:2402.148.\nOrder Total PGK : 3,345.70\n\nSignature Mi WU\n\nApproved By: Maureen ABABA 29-Feb-2024\nRequisitioned By: Niasul KISOKAU 29-Feb-2024\n\x0c" "b'nasfund National Superannuation fund Ltd\nP.\nnisin\nEmail:\n\nKPMG Chartered Accountants\n\nSupplier Details:-\n\nP O Box 507\nPapua New Guinea\n\nAttention: Jennifer Avaeape\n\nDeliver To:\n\nDescription\n\n1 Audit of Nasfund Y/E 31/1\n\n5h /\nso \xe2\x80\x98h J\n\n \n\n. Box 5791\n\n \n\nBoroko PNG\n(675) 313 1813\n\nPURCHASE ORDER\n\nOrder No: PF006849\nRequested by: Debbie Oli\nContact No:\n\nDate Issued: 05-Sep-2023\nSupplier No: 10063\nDelivery Date: 05-Oct-2023\nPage: 1 of 1\n\nInvoice To:\n\nNational Superannuation fund Ltd\nBSP Haus Poreporena Freeway\n\nLevel 4\n; Unit Price Amt Incl\nary. Unit Incl GST GST\n2/2023 0 ONLY 0.00 492,250.00\na Order Total PGK : 492,250.00\n\nSignature ER\n\nApproved By:DEBBIE OLI 06-Sep-2022\nRequisitioned By: Niasul KISOKAU 05-Sep-202\xc2\xa2\n\n \n\x0c'" "b'nasfund National Superannuation fund Ltd\n\nP.O. 
Box 5791 Boroko PNG\nTelephone: (675) 313 1813 PURCHASE ORDER\nEmail\nSupplier Details:- Order No: PF007324\nTheodist Pty Ltd Requested by: Niasul.K.Lillie\nP O Box 1618 Contact No:\nBoroko NCD 111\nPapua New Guinea Date Issued: 15-Feb-2024\nSupplier No: 10127\nDelivery Date: 16-Mar-2024\nPage: 1 of 1\nAttention: Rhoda Kunnopi\nDeliver To: Invoice To:\n\nNational Superannuation fund Ltd\nBSP Haus Poreporena Freeway\n\n \n\nLevel 4\nBoroko\nDescription ay. Unit See iegr et\n1 Supply Binding Clear Covers& Hard Covers 1 ALL 1,290.01 1,290.01\nfor NSF Head Office.\nQuote No.3584563.\nOrder Total PGK : 1,290.01\n\nSignature Uy rT\n\nApproved By: Maureen ABABA 15-Feb-2024\nRequisitioned By: Niasul KISOKAU 15-Feb-2024\n\x0c'"
3	"b'Port Moresby Lae Hagen\n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\n \n\nWa P.O.BOX 1618, BOROKO P.O.BOX 2507, LAE Building C, Unit 1, Section 06\nAc, ra! fl NCD, PNG MOROBE, PNG Mt Hagen, WHP, PNG\nPhone: (675) 313 9800 Phone: (675) 472 5488 Phone: (675) 7528 7200\nHOEODISos Phone: (675) 72321300 Phone: (675) 7054 4494 Phone: (675) 7590 5096\nBUSINESS SUPERSTORE Seles@theodist:com.pa: a saleslae@theodist. ore Sa ee eee ae\nGST REG NO 377 TIN NO 500000599\nStatement For:\nNATIONAL SUPERANNUATION FUND Statement Date: 29/02/2024\nLIMITED Account: NASFUND\nPO BOX 5791\nBOROKO\nNATIONAL CAPITAL DISTRICT STATEMENT\nPAPUA NEW GUINEA\nPh: 325 9522\nDate Doc # Reference Type Amount Running Balance\n30-JUN-2023
0	"b' \n\nBUSINESS SUPERSTORE\n\nPort Moresby\n\nNCD, PNG\n\nPhone:\nPhone:\n\nTAX INVOICE\n\nP.O.BOX 1618, BOROKO\n\n(675) 313 9800\n(675) 7232 1300\nsales@theodist.com.pg\n\nGST REG NO 377.\n57\n\nLae\n\nP.O.BOX 2507, LAE\n\nMOROBE, PNG\n\n(675) 472 5488\n(675) 7054 4494\n\n \n \n\nee\n\nPhone:\nPhone:\nsaleslae@theodist.com.pg\n\nTIN.NO 500000599 ,\n\nJiasul\n\nos Ole
2	"b'nasfund*_\n\n12 May 2023\n\nFlora Kwapena\n\nDirector/Registered Valuer #123 (PNG)\nProperty PNG Limited\n\nP.O Box 1067, Boroko\n\nNCD\n\nPapua New Guinea\n\nBy Email: support@propertypngitd.com\n\nDear Florence,\nRE: ENGAGEMENT TO PROVIDE INDEPENDENT VALUATION SERVICES\n\nWe refer to your bid proposal dated 10 March 2023 and are pleased to confirm the\nengagement of your firm to undertake an independent valuation for the properties as\n\n \n\n \n\nfollows:\nProperty Property Description Quoted Price\nCredit House 1x 7-levels high-end

Evaluation

Metrics

Label	Accuracy
all	1.0

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Gopal2002/NASFUND_MODEL")
# Run inference
preds = model("b'nasfund \\& National Superannuation fund Ltd\n\n \n\nP.O. Box 5791 Boroko PNG\nTelephone: (675) 313 1813 PURCHASE ORDER\nEmail:\nSupplier Details:- Order No: PF006716\nProperty PNG Requested by: Gareth Kobua\nP.O.Box 1067 Contact No:\nPapua New Guinea\nDate Issued: 25-Jul-2023\nSupplier No: 00469\nDelivery Date: 25-Jul-2023\nPage: 1 of 1\nAttention :\nDeliver To: Invoice To:\nNational Superannuation fund Ltd\nBSP Haus Poreporena Freeway\nLevel 4\nDescription Qty. Unit at ee i\n1 Service Fee for the External Property 0 ONLY 0.00 30,000.00\nValuation Service for Credit Corp. Property Portfolio.\nOrder Total PGK : 30,000.00\n\nApproved By: Nathan KWARARA 25-Jul-2023\nRequisitioned By: Niasul KISOKAU 25-Jul-2023\n\nSignature\n\x0c'")
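
When classifying several documents at once, it can help to keep a confidence score next to each predicted label. The helper below is a sketch under stated assumptions: `classify_documents` is a hypothetical name, the labels are assumed to be the integers 0 to 3 shown in this card, and the model is assumed to expose `predict` and `predict_proba` as `SetFitModel` does in SetFit 1.0. Any object with those two methods will work.

```python
from typing import Any, List, Sequence, Tuple


def classify_documents(model: Any, texts: Sequence[str]) -> List[Tuple[int, float]]:
    """Return (predicted_label, highest_class_probability) for each text.

    Hypothetical helper: `model` is any object with `predict(texts)` and
    `predict_proba(texts)`, for example a loaded SetFitModel.
    """
    texts = list(texts)
    if not texts:
        return []
    labels = model.predict(texts)               # one label per text
    probabilities = model.predict_proba(texts)  # one row of class scores per text
    results: List[Tuple[int, float]] = []
    for label, row in zip(labels, probabilities):
        # Assumes integer labels; the top score serves as a rough confidence.
        results.append((int(label), float(max(row))))
    return results
```

With the model loaded as shown above, a call such as `classify_documents(model, [first_ocr_text, second_ocr_text])` would return one `(label, score)` tuple per document; both variable names are placeholders.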

Training Details

Training Set Metrics

Training set	Min	Mean	Max
Word count	68	191.1579	417

Label	Training Sample Count
0	4
1	4
2	5
3	6

Training Hyperparameters

batch_size: (32, 32)
num_epochs: (3, 3)
max_steps: -1
sampling_strategy: oversampling
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
seed: 42
eval_max_steps: -1
load_best_model_at_end: False
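
These values correspond to fields of SetFit's `TrainingArguments`. The sketch below shows one way they might be passed, assuming SetFit 1.0.x and Sentence Transformers 2.7 as listed under Framework Versions. `HYPERPARAMETERS` and `build_training_arguments` are hypothetical names, and the original training script is not part of this card.

```python
# Scalar hyperparameters copied from the list above.
HYPERPARAMETERS = {
    "batch_size": (32, 32),            # (embedding phase, classifier phase)
    "num_epochs": (3, 3),              # (embedding phase, classifier phase)
    "max_steps": -1,
    "sampling_strategy": "oversampling",
    "body_learning_rate": (2e-05, 1e-05),
    "head_learning_rate": 0.01,
    "margin": 0.25,
    "end_to_end": False,
    "use_amp": False,
    "warmup_proportion": 0.1,
    "seed": 42,
    "eval_max_steps": -1,
    "load_best_model_at_end": False,
}


def build_training_arguments():
    """Create a TrainingArguments object from the values above (a sketch)."""
    # Imported lazily so the dictionary can be inspected without SetFit installed.
    from sentence_transformers.losses import (
        BatchHardTripletLossDistanceFunction,
        CosineSimilarityLoss,
    )
    from setfit import TrainingArguments

    return TrainingArguments(
        loss=CosineSimilarityLoss,
        distance_metric=BatchHardTripletLossDistanceFunction.cosine_distance,
        **HYPERPARAMETERS,
    )
```

The returned object would then be handed to a SetFit `Trainer` together with the model and a labeled training dataset.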

Training Results

Epoch	Step	Training Loss	Validation Loss
0.1111	1	0.3014	-
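
An epoch value of 0.1111 at step 1 implies about nine optimizer steps per embedding epoch. That figure can be reproduced from the sample counts and batch size reported above, under an assumption that this card does not state: contrastive pairs are drawn from all pairs of training examples, and oversampling repeats the smaller pair set until it matches the larger one. The variable names are illustrative.

```python
from math import ceil, comb

label_counts = {0: 4, 1: 4, 2: 5, 3: 6}  # Training Sample Count per label
batch_size = 32                          # embedding-phase batch size

total_samples = sum(label_counts.values())

# Pairs of two distinct examples that share a label.
positive_pairs = sum(comb(n, 2) for n in label_counts.values())

# All remaining pairs mix two different labels.
negative_pairs = comb(total_samples, 2) - positive_pairs

# Assumed oversampling rule: both sets are brought up to the larger size.
pairs_per_epoch = 2 * max(positive_pairs, negative_pairs)

steps_per_epoch = ceil(pairs_per_epoch / batch_size)
epoch_at_first_step = round(1 / steps_per_epoch, 4)

print(total_samples, positive_pairs, negative_pairs)
print(pairs_per_epoch, steps_per_epoch, epoch_at_first_step)
```

Under these assumptions the 19 samples give 37 positive and 134 negative pairs, hence 268 pairs and 9 steps per epoch, and 1/9 rounds to the 0.1111 shown in the table. Counting each example paired with itself as an extra positive pair would not change the result, because the negative pairs would still be the larger set.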

Framework Versions

Python: 3.10.12
SetFit: 1.0.3
Sentence Transformers: 2.7.0
Transformers: 4.40.0
PyTorch: 2.2.1+cu121
Datasets: 2.19.0
Tokenizers: 0.19.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
