---
license: apache-2.0
tags:
- text-classification
- generic
- notebook-favorites
library_name: generic
---

## Hugging Face Transformers with Scikit-learn Classifiers 🤩🌟

This repository contains a small proof-of-concept pipeline that feeds BART (`facebook/bart-base`) embeddings into a scikit-learn Logistic Regression classifier for sentiment analysis. The embeddings are produced with the language module of [whatlies](https://github.com/koaning/whatlies). See the tutorial notebook [here](https://www.kaggle.com/code/unofficialmerve/scikit-learn-with-transformers/notebook).

# Classification Report 📈

Below is the classification report 👇🏻

```
              precision    recall  f1-score   support

           0       0.85      0.89      0.87       522
           1       0.89      0.85      0.87       550

    accuracy                           0.87      1072
   macro avg       0.87      0.87      0.87      1072
weighted avg       0.87      0.87      0.87      1072
```

# Pipeline 🌟

Below you can see the pipeline 👇🏻
```
Pipeline(steps=[('embedding',
                 HFTransformersLanguage(model_name_or_path='facebook/bart-base')),
                ('model', LogisticRegression())])
```
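The table in the Classification Report section follows the format of scikit-learn's `classification_report`. For reference, a report like it is produced as follows; the labels below are toy values, not the evaluation set behind the numbers above:

```python
from sklearn.metrics import classification_report

# Toy ground truth and predictions, purely to illustrate the report format
y_true = [0, 0, 0, 1, 1, 1]
y_pred = [0, 0, 1, 1, 1, 1]

print(classification_report(y_true, y_pred))
```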
# Hyperparameters ❤️

You can find the hyperparameters below 👇🏻✨

```
{'memory': None,
 'steps': [('embedding', HFTransformersLanguage(model_name_or_path='facebook/bart-base')),
           ('model', LogisticRegression())],
 'verbose': False,
 'embedding': HFTransformersLanguage(model_name_or_path='facebook/bart-base'),
 'model': LogisticRegression(),
 'embedding__model_name_or_path': 'facebook/bart-base',
 'model__C': 1.0,
 'model__class_weight': None,
 'model__dual': False,
 'model__fit_intercept': True,
 'model__intercept_scaling': 1,
 'model__l1_ratio': None,
 'model__max_iter': 100,
 'model__multi_class': 'auto',
 'model__n_jobs': None,
 'model__penalty': 'l2',
 'model__random_state': None,
 'model__solver': 'lbfgs',
 'model__tol': 0.0001,
 'model__verbose': 0,
 'model__warm_start': False}
```
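The dictionary above is what scikit-learn's `Pipeline.get_params()` returns: nested parameters use the `step__param` naming convention, so each component can be read or tuned by name. A minimal sketch of that convention, using `TfidfVectorizer` as a lightweight stand-in for the embedding step (it is illustrative only, not the transformer used in this repo):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

pipe = Pipeline(steps=[
    ("embedding", TfidfVectorizer()),
    ("model", LogisticRegression()),
])

# Every hyperparameter, including nested ones, is exposed via step__param keys
params = pipe.get_params()
print(params["model__C"])        # 1.0
print(params["model__penalty"])  # 'l2'

# The same convention works for setting parameters, e.g. during grid search
pipe.set_params(model__C=0.5)
```

This is why the card can list `model__C`, `embedding__model_name_or_path`, and so on in a single flat dictionary.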