metadata

library_name: sklearn
tags:
  - sklearn
  - skops
  - tabular-classification
model_format: pickle
model_file: model.joblib
widget:
  structuredData:
    LegalName:
      - TECH LAKE SYSTEMS SPÓŁKA Z OGRANICZONĄ ODPOWIEDZIALNOŚCIĄ
      - Radosław Wiśniewski wspólnik spółki cywilnej Agenda
      - SINOGRAF SPÓŁKA AKCYJNA

Model description

[More Information Needed]

Intended uses & limitations

[More Information Needed]

Training Procedure

Hyperparameters

The model was trained with the hyperparameters below.

Hyperparameter	Value
memory
steps	[('feature_extraction', ColumnTransformer(transformers=[('abbreviations', <__main__.ELFAbbreviationTransformer object at 0x7f38e1fc3310>, 0), ('tokenizer', CountVectorizer(binary=True, lowercase=False, tokenizer=<function tokenize at 0x7f38e9fd4a60>), 0)])), ('classifier', ComplementNB())]
verbose	False
feature_extraction	ColumnTransformer(transformers=[('abbreviations', <__main__.ELFAbbreviationTransformer object at 0x7f38e1fc3310>, 0), ('tokenizer', CountVectorizer(binary=True, lowercase=False, tokenizer=<function tokenize at 0x7f38e9fd4a60>), 0)])
classifier	ComplementNB()
feature_extraction__n_jobs
feature_extraction__remainder	drop
feature_extraction__sparse_threshold	0.3
feature_extraction__transformer_weights
feature_extraction__transformers	[('abbreviations', <__main__.ELFAbbreviationTransformer object at 0x7f38e1fc3310>, 0), ('tokenizer', CountVectorizer(binary=True, lowercase=False, tokenizer=<function tokenize at 0x7f38e9fd4a60>), 0)]
feature_extraction__verbose	False
feature_extraction__verbose_feature_names_out	True
feature_extraction__abbreviations	<__main__.ELFAbbreviationTransformer object at 0x7f38e1fc3310>
feature_extraction__tokenizer	CountVectorizer(binary=True, lowercase=False, tokenizer=<function tokenize at 0x7f38e9fd4a60>)
feature_extraction__abbreviations__elf_abbreviations	<__main__.ELFAbbreviations object at 0x7f38e1f10220>
feature_extraction__abbreviations__jurisdiction	PL
feature_extraction__abbreviations__use_endswith	True
feature_extraction__abbreviations__use_lowercasing	True
feature_extraction__tokenizer__analyzer	word
feature_extraction__tokenizer__binary	True
feature_extraction__tokenizer__decode_error	strict
feature_extraction__tokenizer__dtype	<class 'numpy.int64'>
feature_extraction__tokenizer__encoding	utf-8
feature_extraction__tokenizer__input	content
feature_extraction__tokenizer__lowercase	False
feature_extraction__tokenizer__max_df	1.0
feature_extraction__tokenizer__max_features
feature_extraction__tokenizer__min_df	1
feature_extraction__tokenizer__ngram_range	(1, 1)
feature_extraction__tokenizer__preprocessor
feature_extraction__tokenizer__stop_words
feature_extraction__tokenizer__strip_accents
feature_extraction__tokenizer__token_pattern	(?u)\b\w\w+\b
feature_extraction__tokenizer__tokenizer	<function tokenize at 0x7f38e9fd4a60>
feature_extraction__tokenizer__vocabulary
classifier__alpha	1.0
classifier__class_prior
classifier__fit_prior	True
classifier__norm	False

Model Plot

The model plot is below.

Pipeline(steps=[('feature_extraction',ColumnTransformer(transformers=[('abbreviations',<__main__.ELFAbbreviationTransformer object at 0x7f38e1fc3310>,0),('tokenizer',CountVectorizer(binary=True,lowercase=False,tokenizer=<function tokenize at 0x7f38e9fd4a60>),0)])),('classifier', ComplementNB())])
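
The repr above shows the shape of the pipeline but not the custom pieces: `ELFAbbreviationTransformer` and `tokenize` live in the authors' `__main__` module and are not included in this card. The following is a minimal sketch of a pipeline with the same structure, assuming hypothetical stand-ins for both (`SuffixFlagTransformer` and a whitespace `tokenize`); the suffix list and the labels are made up for illustration and are not the authors' implementation or data.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import ComplementNB
from sklearn.pipeline import Pipeline


def tokenize(text):
    # Hypothetical stand-in for the card's custom `tokenize`: whitespace split.
    return text.split()


class SuffixFlagTransformer(TransformerMixin, BaseEstimator):
    """Hypothetical stand-in for ELFAbbreviationTransformer.

    Emits one binary column per suffix, set when the lowercased name ends with it.
    """

    def __init__(self, suffixes=("spółka z ograniczoną odpowiedzialnością", "spółka akcyjna")):
        self.suffixes = suffixes

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return np.array(
            [[int(str(name).lower().endswith(s)) for s in self.suffixes] for name in X]
        )


pipeline = Pipeline(
    steps=[
        (
            "feature_extraction",
            ColumnTransformer(
                transformers=[
                    # Both transformers read column 0, as in the card's repr.
                    ("abbreviations", SuffixFlagTransformer(), 0),
                    (
                        "tokenizer",
                        # token_pattern=None only silences the unused-pattern warning.
                        CountVectorizer(
                            binary=True,
                            lowercase=False,
                            tokenizer=tokenize,
                            token_pattern=None,
                        ),
                        0,
                    ),
                ]
            ),
        ),
        ("classifier", ComplementNB()),
    ]
)

# Legal names taken from the card's widget; the labels are placeholders.
X = np.array(
    [
        ["TECH LAKE SYSTEMS SPÓŁKA Z OGRANICZONĄ ODPOWIEDZIALNOŚCIĄ"],
        ["Radosław Wiśniewski wspólnik spółki cywilnej Agenda"],
        ["SINOGRAF SPÓŁKA AKCYJNA"],
    ],
    dtype=object,
)
y = ["form_a", "form_b", "form_c"]

pipeline.fit(X, y)
predictions = pipeline.predict(X)
features = pipeline.named_steps["feature_extraction"].transform(X)
```

Because both transformers select column 0 with a scalar index, each receives a one-dimensional sequence of strings, which is what `CountVectorizer` expects.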

Evaluation Results

The evaluation results are listed below.

Metric	Value
f1	0.971647
f1 macro	0.522164
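
The gap between the two rows is the kind of pattern that appears when a few frequent classes dominate the data. The card does not say how the `f1` row is averaged; the sketch below assumes a weighted average and uses synthetic labels (not the model's data) to show how a weighted score can stay high while the macro score, which gives every class equal weight, drops.

```python
from sklearn.metrics import f1_score

# Synthetic, imbalanced labels: 95 samples of a frequent class, 5 of a rare one.
# The predictions never output "rare", so that class scores an F1 of 0.
y_true = ["common"] * 95 + ["rare"] * 5
y_pred = ["common"] * 100

# Weighted average: per-class F1 weighted by class frequency.
weighted_f1 = f1_score(y_true, y_pred, average="weighted", zero_division=0)
# Macro average: unweighted mean of per-class F1.
macro_f1 = f1_score(y_true, y_pred, average="macro", zero_division=0)

print(f"weighted f1: {weighted_f1:.4f}")
print(f"macro f1:    {macro_f1:.4f}")
```

In this toy case the rare class contributes half of the macro score but only 5% of the weighted one, so per-class metrics are worth checking before relying on the headline number.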

How to Get Started with the Model

[More Information Needed]
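
The metadata lists `model.joblib` as a pickle-format file, so loading presumably goes through `joblib.load`. The sketch below only demonstrates that round trip on a toy pipeline built from stock scikit-learn parts, with made-up names and placeholder labels. For the published file, the classes and functions that the hyperparameter table shows under `__main__` (`ELFAbbreviationTransformer`, `ELFAbbreviations`, `tokenize`) would have to be defined in the loading script first, and since unpickling can execute arbitrary code, the file should only be loaded from a trusted source.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import ComplementNB
from sklearn.pipeline import make_pipeline

# Toy stand-in pipeline; it is not the published model.
toy = make_pipeline(CountVectorizer(binary=True, lowercase=False), ComplementNB())
names = ["ALFA SPÓŁKA AKCYJNA", "BETA SPÓŁKA JAWNA"]  # made-up legal names
toy.fit(names, ["form_a", "form_b"])  # placeholder labels

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "model.joblib"
    joblib.dump(toy, path)        # serialize the fitted pipeline
    restored = joblib.load(path)  # deserialize it again

restored_predictions = restored.predict(names)
```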

Model Card Authors

This model card was written by the following authors:

[More Information Needed]

Model Card Contact

You can contact the model card authors through the following channels: [More Information Needed]

Citation

Below you can find information related to citation.

BibTeX:

[More Information Needed]