---
library_name: model2vec
license: mit
model_name: tmpqsu1ee6a
tags:
- embeddings
- static-embeddings
datasets:
- HuggingFaceFW/fineweb-edu-llama3-annotations
language:
- en
base_model:
- minishlab/potion-base-8M
---

# potion-8m-edu-classifier Model Card

This [Model2Vec](https://github.com/MinishLab/model2vec) model is a fine-tuned version of [potion-base-8m](https://huggingface.co/minishlab/potion-base-8M). It was trained to predict educational content, analogous to how the [fineweb-edu-classifier](https://huggingface.co/HuggingFaceFW/fineweb-edu-classifier) was used to filter educational content.

It achieves the following performance on the evaluation split:

```
              precision    recall  f1-score   support

           0       0.70      0.42      0.52      5694
           1       0.75      0.86      0.80     26512
           2       0.55      0.51      0.53     10322
           3       0.54      0.45      0.49      3407
           4       0.59      0.30      0.40       807
           5       0.00      0.00      0.00         1

    accuracy                           0.69     46743
   macro avg       0.52      0.42      0.46     46743
weighted avg       0.68      0.69      0.68     46743
```

When thresholded to a binary classifier, it achieves a macro-averaged F1-score of `0.79`. The original classifier achieves `0.81` on the same dataset, but this classifier is orders of magnitude faster on CPU.

```
              precision    recall  f1-score   support

     not edu       0.96      0.98      0.97     42528
         edu       0.70      0.54      0.61      4215

    accuracy                           0.94     46743
   macro avg       0.83      0.76      0.79     46743
weighted avg       0.93      0.94      0.93     46743
```

## Installation

Install model2vec with the inference extra using pip:

```
pip install model2vec[inference]
```

## Usage

Load this model using the `from_pretrained` method:

```python
from model2vec.inference import StaticModelPipeline

# Load a pretrained Model2Vec model
model = StaticModelPipeline.from_pretrained("minishlab/potion-8m-edu-classifier")

# Predict labels
label = model.predict(["Example sentence"])
```

## Library Authors

Model2Vec was developed by [Minish](https://github.com/MinishLab).

## Citation

Please cite the [Model2Vec repository](https://github.com/MinishLab/model2vec) if you use this model in your work.
```
@software{minishlab2024model2vec,
  authors = {Stephan Tulkens, Thomas van Dongen},
  title = {Model2Vec: Turn any Sentence Transformer into a Small Fast Model},
  year = {2024},
  url = {https://github.com/MinishLab/model2vec},
}
```
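
## Binary Thresholding Sketch

The model card reports binary results but does not state the cut-off used. The support counts in the two evaluation reports are consistent with treating scores of 3 and above as `edu` (3407 + 807 + 1 = 4215), so the minimal sketch below assumes that threshold. The names `SUPPORT`, `EDU_THRESHOLD`, and `to_binary` are illustrative and not part of Model2Vec, and the sample predictions are hypothetical stand-ins for the output of `model.predict`.

```python
# Per-class support copied from the six-class evaluation report.
SUPPORT = {0: 5694, 1: 26512, 2: 10322, 3: 3407, 4: 807, 5: 1}

# Assumed cut-off: scores >= 3 count as "edu". This is inferred from the
# support counts; the model card does not state it explicitly.
EDU_THRESHOLD = 3


def to_binary(labels, threshold=EDU_THRESHOLD):
    """Map 0-5 score labels (ints or numeric strings) to 'edu' / 'not edu'."""
    return ["edu" if int(label) >= threshold else "not edu" for label in labels]


# Sanity check: the split reproduces the binary report's support column.
edu_support = sum(n for score, n in SUPPORT.items() if score >= EDU_THRESHOLD)
not_edu_support = sum(n for score, n in SUPPORT.items() if score < EDU_THRESHOLD)
print(edu_support, not_edu_support)  # 4215 42528

# Hypothetical predictions, standing in for the output of model.predict(...).
print(to_binary([0, 3, "2", 5]))  # ['not edu', 'edu', 'not edu', 'edu']
```

Calling `int()` on each label lets the helper accept either integer or string labels, since the label type returned by the pipeline is not specified here.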