---
license: apache-2.0
tags:
  - text-classification
  - generic
library_name: generic
---

# Hugging Face Transformers with Scikit-learn Classifiers 🤩🌟

This repository contains a small proof-of-concept pipeline that performs sentiment analysis by feeding Longformer embeddings into a scikit-learn Logistic Regression classifier. The embeddings are produced with the language module of whatlies.
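
A minimal sketch of how such a pipeline can be put together, assuming whatlies is installed with its transformers extra and that `texts` / `labels` are placeholder names for your own sentiment-labelled data:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from whatlies.language import HFTransformersLanguage

# Longformer embeddings via whatlies' scikit-learn compatible wrapper,
# followed by a plain logistic regression classifier on top.
pipe = Pipeline(
    steps=[
        ("embedding", HFTransformersLanguage(model_name_or_path="allenai/longformer-base-4096")),
        ("model", LogisticRegression()),
    ]
)

# texts: list of raw strings, labels: matching 0/1 sentiment labels (assumed names)
pipe.fit(texts, labels)
print(pipe.predict(["I really enjoyed this movie!"]))
```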

## Classification Report

Below is the classification report 👇🏻

```
              precision    recall  f1-score   support

           0       0.84      0.89      0.86        53
           1       0.86      0.81      0.84        47

    accuracy                           0.85       100
   macro avg       0.85      0.85      0.85       100
weighted avg       0.85      0.85      0.85       100
```
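
If you want to reproduce a report like this on your own held-out split, scikit-learn's `classification_report` generates the same table; a minimal sketch, assuming `X_test` / `y_test` are your held-out texts and labels and `pipe` is the fitted pipeline from above:

```python
from sklearn.metrics import classification_report

# X_test: list of raw strings, y_test: their 0/1 sentiment labels (assumed names)
print(classification_report(y_test, pipe.predict(X_test)))
```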

## Pipeline

Below you can see the pipeline 👇🏻 (it's interactive! 🪄)

```
Pipeline(steps=[('embedding',
                 HFTransformersLanguage(model_name_or_path='allenai/longformer-base-4096')),
                ('model', LogisticRegression())])
```
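
The interactive diagram shown on the Hub is scikit-learn's HTML representation of the pipeline. You can get the same view in your own notebook via `set_config`; a small sketch, assuming `pipe` is the pipeline defined earlier:

```python
from sklearn import set_config

set_config(display="diagram")  # render estimators as an expandable HTML diagram
pipe                           # evaluating the pipeline in a notebook cell now shows the diagram
```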

## Hyperparameters

```
{'memory': None,
 'steps': [('embedding',
            HFTransformersLanguage(model_name_or_path='allenai/longformer-base-4096')),
           ('model', LogisticRegression())],
 'verbose': False,
 'embedding': HFTransformersLanguage(model_name_or_path='allenai/longformer-base-4096'),
 'model': LogisticRegression(),
 'embedding__model_name_or_path': 'allenai/longformer-base-4096',
 'model__C': 1.0,
 'model__class_weight': None,
 'model__dual': False,
 'model__fit_intercept': True,
 'model__intercept_scaling': 1,
 'model__l1_ratio': None,
 'model__max_iter': 100,
 'model__multi_class': 'auto',
 'model__n_jobs': None,
 'model__penalty': 'l2',
 'model__random_state': None,
 'model__solver': 'lbfgs',
 'model__tol': 0.0001,
 'model__verbose': 0,
 'model__warm_start': False}
```
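
These values correspond to scikit-learn's flat view of the pipeline's nested parameters; a minimal sketch of how to inspect them yourself, assuming `pipe` is the pipeline defined above:

```python
# get_params() returns a flat dict with nested keys such as
# 'model__C' and 'embedding__model_name_or_path'.
params = pipe.get_params()
for name, value in sorted(params.items()):
    print(f"{name}: {value}")
```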