Hugging Face Transformers with Scikit-learn Classifiers 🤩
This repository contains a small proof-of-concept pipeline that feeds BART embeddings (facebook/bart-base) into a scikit-learn LogisticRegression classifier for sentiment analysis. The embeddings are produced with the language module of whatlies. See the tutorial notebook here.
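The whole pipeline is only a few lines of scikit-learn code. Below is a minimal sketch of how it can be assembled and trained, assuming whatlies is installed (`pip install whatlies`); the toy texts and labels are placeholders for illustration, not the dataset this model was trained on.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from whatlies.language import HFTransformersLanguage

# The embedding step turns raw strings into transformer embeddings,
# which the logistic regression then classifies.
pipe = Pipeline(steps=[
    ("embedding", HFTransformersLanguage(model_name_or_path="facebook/bart-base")),
    ("model", LogisticRegression()),
])

# Placeholder data for illustration only.
texts = ["i loved this movie", "what a waste of time"]
labels = [1, 0]

pipe.fit(texts, labels)
print(pipe.predict(["an absolute delight"]))
```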
Classification Report
Below is the classification report 👇🏻
              precision    recall  f1-score   support
           0       0.85      0.89      0.87       522
           1       0.89      0.85      0.87       550
    accuracy                           0.87      1072
   macro avg       0.87      0.87      0.87      1072
weighted avg       0.87      0.87      0.87      1072
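The table above is the standard output of scikit-learn's classification_report. A sketch of how it is generated, where `X_test` and `y_test` stand in for the held-out evaluation split (not shipped with this repository):

```python
from sklearn.metrics import classification_report

# X_test / y_test are placeholders for the held-out evaluation split.
preds = pipe.predict(X_test)
print(classification_report(y_test, preds))
```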
Pipeline
Below you can see the pipeline 👇🏻 (it's interactive! 💪)
Pipeline(steps=[('embedding',
                 HFTransformersLanguage(model_name_or_path='facebook/bart-base')),
                ('model', LogisticRegression())])
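The interactive view comes from scikit-learn's HTML representation of estimators, which renders in notebooks and on the model page. A minimal sketch of how to enable it:

```python
from sklearn import set_config

# Render pipelines as interactive HTML diagrams instead of plain text.
set_config(display="diagram")
pipe  # the last expression in a notebook cell displays the diagram
```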
Hyperparameters ❤️
You can find the hyperparameters below 👇🏻✨
{'memory': None,
 'steps': [('embedding',
   HFTransformersLanguage(model_name_or_path='facebook/bart-base')),
  ('model', LogisticRegression())],
 'verbose': False,
 'embedding': HFTransformersLanguage(model_name_or_path='facebook/bart-base'),
 'model': LogisticRegression(),
 'embedding__model_name_or_path': 'facebook/bart-base',
 'model__C': 1.0,
 'model__class_weight': None,
 'model__dual': False,
 'model__fit_intercept': True,
 'model__intercept_scaling': 1,
 'model__l1_ratio': None,
 'model__max_iter': 100,
 'model__multi_class': 'auto',
 'model__n_jobs': None,
 'model__penalty': 'l2',
 'model__random_state': None,
 'model__solver': 'lbfgs',
 'model__tol': 0.0001,
 'model__verbose': 0,
 'model__warm_start': False}
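This dictionary is what `pipe.get_params()` returns: the double-underscore keys address parameters of the nested steps, so the same names can be used to override or tune them. A small sketch, where `pipe` is the pipeline above and the C grid is illustrative:

```python
from sklearn.model_selection import GridSearchCV

params = pipe.get_params()
print(params["model__C"])  # 1.0, the default regularization strength

# Nested parameters can be tuned with the same step__param naming.
grid = GridSearchCV(pipe, param_grid={"model__C": [0.01, 0.1, 1.0, 10.0]}, cv=3)
```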
