# Model description

This model is an attempt to solve the 2025 FrugalAI challenge. It is a scikit-learn pipeline that lemmatizes input text, vectorizes it with TF-IDF, and classifies it with a random forest.
## Intended uses & limitations

The model performs better than random label assignment (accuracy ≈ 0.59 on the evaluation set), but there is still substantial room for improvement.
## Training Procedure

The model was trained as a scikit-learn `Pipeline` with three steps: a `FunctionTransformer` lemmatization step, a `TfidfVectorizer` with a custom tokenizer and stop-word list, and a `RandomForestClassifier` with default settings. The full hyperparameters are listed below.
### Hyperparameters

| Hyperparameter | Value |
|---|---|
memory | |
steps | [('lemmatizer', FunctionTransformer(func=<function lemmatize_X at 0x7f2c3cd63ca0>)), ('tfidf', TfidfVectorizer(max_df=0.95, min_df=2, stop_words=['if', 'when', 'most', 'ourselves', 'your', 'having', "didn't", '@', "you've", 'hasn', 'at', "mightn't", "mustn't", 'these', "it's", 'our', 'had', 'll', 'too', 'this', 'by', 'it', 'further', 'wasn', 'before', 'all', '{', 'herself', 'other', 'above', ...], tokenizer=<function tokenize_quote at 0x7f2c3cdaea60>)), ('rf', RandomForestClassifier())] |
transform_input | |
verbose | False |
lemmatizer | FunctionTransformer(func=<function lemmatize_X at 0x7f2c3cd63ca0>) |
tfidf | TfidfVectorizer(max_df=0.95, min_df=2, stop_words=['if', 'when', 'most', 'ourselves', 'your', 'having', "didn't", '@', "you've", 'hasn', 'at', "mightn't", "mustn't", 'these', "it's", 'our', 'had', 'll', 'too', 'this', 'by', 'it', 'further', 'wasn', 'before', 'all', '{', 'herself', 'other', 'above', ...], tokenizer=<function tokenize_quote at 0x7f2c3cdaea60>) |
rf | RandomForestClassifier() |
lemmatizer__accept_sparse | False |
lemmatizer__check_inverse | True |
lemmatizer__feature_names_out | |
lemmatizer__func | <function lemmatize_X at 0x7f2c3cd63ca0> |
lemmatizer__inv_kw_args | |
lemmatizer__inverse_func | |
lemmatizer__kw_args | |
lemmatizer__validate | False |
tfidf__analyzer | word |
tfidf__binary | False |
tfidf__decode_error | strict |
tfidf__dtype | <class 'numpy.float64'> |
tfidf__encoding | utf-8 |
tfidf__input | content |
tfidf__lowercase | True |
tfidf__max_df | 0.95 |
tfidf__max_features | |
tfidf__min_df | 2 |
tfidf__ngram_range | (1, 1) |
tfidf__norm | l2 |
tfidf__preprocessor | |
tfidf__smooth_idf | True |
tfidf__stop_words | ['if', 'when', 'most', 'ourselves', 'your', 'having', "didn't", '@', "you've", 'hasn', 'at', "mightn't", "mustn't", 'these', "it's", 'our', 'had', 'll', 'too', 'this', 'by', 'it', 'further', 'wasn', 'before', 'all', '{', 'herself', 'other', 'above', 'needn', 'than', 'i', 'not', 'was', 'few', 'both', 'd', 'now', 'has', ')', '&', '`', 'who', 'whom', '"', 'through', 'me', 'myself', '>', 'and', "'", 'which', 've', 'were', 'aren', 'doesn', 'that', ...] |
tfidf__strip_accents | |
tfidf__sublinear_tf | False |
tfidf__token_pattern | (?u)\b\w\w+\b |
tfidf__tokenizer | <function tokenize_quote at 0x7f2c3cdaea60> |
tfidf__use_idf | True |
tfidf__vocabulary | |
rf__bootstrap | True |
rf__ccp_alpha | 0.0 |
rf__class_weight | |
rf__criterion | gini |
rf__max_depth | |
rf__max_features | sqrt |
rf__max_leaf_nodes | |
rf__max_samples | |
rf__min_impurity_decrease | 0.0 |
rf__min_samples_leaf | 1 |
rf__min_samples_split | 2 |
rf__min_weight_fraction_leaf | 0.0 |
rf__monotonic_cst | |
rf__n_estimators | 100 |
rf__n_jobs | |
rf__oob_score | False |
rf__random_state | |
rf__verbose | 0 |
rf__warm_start | False |
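The hyperparameters above can be assembled into a runnable sketch of the pipeline. The original `lemmatize_X` and `tokenize_quote` functions are not included in this card, so the versions below are hypothetical stand-ins, and the long custom stop-word list is elided:

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import RandomForestClassifier

# Hypothetical stand-ins: the card's lemmatize_X and tokenize_quote are
# custom functions whose source is not included in the card.
def lemmatize_X(X):
    return [doc.lower() for doc in X]

def tokenize_quote(doc):
    return doc.split()

pipe = Pipeline(steps=[
    ("lemmatizer", FunctionTransformer(func=lemmatize_X)),
    # The card also passes a long custom stop-word list, elided here.
    ("tfidf", TfidfVectorizer(max_df=0.95, min_df=2, tokenizer=tokenize_quote)),
    ("rf", RandomForestClassifier()),
])

# Tiny illustrative corpus, not the challenge data.
texts = ["climate change is real", "the earth is flat",
         "warming trends are real", "ice caps are melting"]
labels = [1, 0, 1, 1]
pipe.fit(texts, labels)
print(pipe.predict(["the climate is changing"]))
```

Note that because a custom `tokenizer` is supplied, the `token_pattern` parameter shown in the table is ignored by `TfidfVectorizer`.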
### Model Plot

Pipeline(steps=[('lemmatizer', FunctionTransformer(func=<function lemmatize_X at 0x7f2c3cd63ca0>)), ('tfidf', TfidfVectorizer(max_df=0.95, min_df=2, stop_words=['if', 'when', 'most', 'ourselves', 'your', 'having', "didn't", '@', "you've", 'hasn', 'at', "mightn't", "mustn't", 'these', "it's", 'our', 'had', 'll', 'too', 'this', 'by', 'it', 'further', 'wasn', 'before', 'all', '{', 'herself', 'other', 'above', ...], tokenizer=<function tokenize_quote at 0x7f2c3cdaea60>)), ('rf', RandomForestClassifier())])
## Evaluation Results

| Metric | Value |
|---|---|
accuracy | 0.5873666940114848 |
f1_score | 0.5666496543166571 |
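For reference, the two reported metrics can be computed with `sklearn.metrics`. The card does not state which averaging mode was used for the F1 score, so the macro average below is an assumption; the labels are a toy example, not the challenge data:

```python
from sklearn.metrics import accuracy_score, f1_score

# Toy labels, for illustration only (not the challenge data).
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]

print(accuracy_score(y_true, y_pred))              # 0.8
print(f1_score(y_true, y_pred, average="macro"))   # 0.8 on this toy example
```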
## How to Get Started with the Model

[More Information Needed]
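The card does not specify how the model is serialized or distributed, so the following is only a sketch of a typical workflow for a scikit-learn pipeline, assuming a `joblib` dump with a hypothetical file name. The stand-in training step exists only to make the loading step runnable:

```python
import joblib
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import RandomForestClassifier

# Hypothetical file name: the card does not state how the model is serialized.
MODEL_PATH = "model.joblib"

# Stand-in: train and persist a small pipeline so the loading step is runnable.
demo = Pipeline([("tfidf", TfidfVectorizer()),
                 ("rf", RandomForestClassifier(random_state=0))])
demo.fit(["the earth is warming", "climate change is a hoax",
          "sea levels keep rising", "global warming is fake news"],
         [1, 0, 1, 0])
joblib.dump(demo, MODEL_PATH)

# Getting started: load the serialized pipeline and classify a quote.
model = joblib.load(MODEL_PATH)
pred = model.predict(["sea levels are rising"])
print(pred)
```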
## Model Card Authors

This model card is written by the following authors:

[More Information Needed]
## Model Card Contact

You can contact the model card authors through the following channels:

[More Information Needed]
## Citation

Below you can find information related to citation.

BibTeX:

[More Information Needed]
## Confusion Matrix

(Confusion matrix plot not rendered in this export.)
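A confusion matrix for this kind of binary classifier can be produced with `sklearn.metrics.confusion_matrix`; the labels below are a toy example, not the model's actual predictions:

```python
from sklearn.metrics import confusion_matrix

# Toy labels, for illustration only.
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]

# Rows are true labels, columns are predicted labels.
cm = confusion_matrix(y_true, y_pred)
print(cm)  # [[2 0]
           #  [1 2]]
```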