metadata

base_model: bert-base-multilingual-uncased
datasets:
  - joelniklaus/lextreme
license: apache-2.0
tags:
  - embedding_space_map
  - BaseLM:bert-base-multilingual-uncased

ESM joelniklaus/lextreme

Model Details

Model Description

ESM

Developed by: David Schulte
Model type: ESM
Base Model: bert-base-multilingual-uncased
Intermediate Task: joelniklaus/lextreme
ESM architecture: linear
Language(s) (NLP): [More Information Needed]
License: Apache-2.0

Training Details

Intermediate Task

Task ID: joelniklaus/lextreme
Subset [optional]: swiss_judgment_prediction
Text Column: input
Label Column: label
Dataset Split: train
Sample size [optional]: 10000
Sample seed [optional]: 42
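
The sample size and seed above imply a reproducible subsample of the train split. The exact sampling procedure used for this ESM is not documented here, so the following is only a minimal sketch of seeded subsampling with the Python standard library. The helper name `sample_indices` and the dataset size of 50,000 are hypothetical and chosen for illustration.

```python
import random


def sample_indices(dataset_size, sample_size, seed):
    """Return a sorted, reproducible sample of row indices (hypothetical helper)."""
    rng = random.Random(seed)  # local generator, so global random state is untouched
    sample_size = min(sample_size, dataset_size)  # never request more rows than exist
    return sorted(rng.sample(range(dataset_size), sample_size))


# Assumed dataset size for illustration only; sample size and seed are from the card.
first = sample_indices(50_000, 10_000, seed=42)
second = sample_indices(50_000, 10_000, seed=42)
```

Calling the helper twice with the same seed yields the same indices, which is the property a fixed sample seed is meant to guarantee.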

Training Procedure [optional]

Language Model Training Hyperparameters [optional]

Epochs: 3
Batch size: 32
Learning rate: 2e-05
Weight Decay: 0.01
Optimizer: AdamW

ESM Training Hyperparameters [optional]

Epochs: 10
Batch size: 32
Learning rate: 0.001
Weight Decay: 0.01
Optimizer: AdamW

Additional training details [optional]

Model evaluation

Evaluation of fine-tuned language model [optional]

Evaluation of ESM [optional]

MSE:

Additional evaluation details [optional]

What are Embedding Space Maps?

Embedding Space Maps (ESMs) are neural networks that approximate the effect of fine-tuning a language model on a task. They quickly transform embeddings from a base model to approximate how a fine-tuned model would embed the input text. ESMs can be used for intermediate task selection with the ESM-LogME workflow.
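
To make the idea concrete, here is a minimal pure-Python sketch of what a linear ESM might compute. It assumes that "linear" means an affine map `W x + b` applied to a base-model embedding, and that the MSE reported for an ESM compares the transformed embedding with the fine-tuned model's embedding. Both are assumptions for illustration, not a description of the actual implementation. The function names and the toy 3-dimensional vectors are hypothetical.

```python
def apply_linear_esm(weight, bias, embedding):
    """Map one base-model embedding through an assumed affine ESM: W @ x + b."""
    return [
        sum(w_ij * x_j for w_ij, x_j in zip(row, embedding)) + b_i
        for row, b_i in zip(weight, bias)
    ]


def mean_squared_error(predicted, target):
    """Average squared difference between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)


# Toy 3-dimensional parameters; real embeddings from a BERT-base model are much larger.
weight = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
bias = [0.5, -0.5, 0.0]

base_embedding = [1.0, 2.0, 3.0]        # embedding from the base model (made up)
finetuned_embedding = [1.5, 1.5, 3.5]   # embedding from a fine-tuned model (made up)

approx_finetuned = apply_linear_esm(weight, bias, base_embedding)
error = mean_squared_error(approx_finetuned, finetuned_embedding)
```

Because the map is a single matrix multiplication per embedding, applying an ESM is far cheaper than fine-tuning and running a separate language model for every candidate intermediate task.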

How can I use Embedding Space Maps for Intermediate Task Selection?

We release hf-dataset-selector, a Python package for intermediate task selection using Embedding Space Maps.

hf-dataset-selector fetches the ESMs available for a given language model and uses them to find the best dataset for intermediate training before fine-tuning on the target task. ESMs are found by their tags on the Hugging Face Hub.

from hfselect import Dataset, compute_task_ranking

# Load target dataset from the Hugging Face Hub
dataset = Dataset.from_hugging_face(
    name="stanfordnlp/imdb",
    split="train",
    text_col="text",
    label_col="label",
    is_regression=False,
    num_examples=1000,
    seed=42
)

# Fetch ESMs and rank tasks
task_ranking = compute_task_ranking(
    dataset=dataset,
    model_name="bert-base-multilingual-uncased"
)

# Display top 5 recommendations
print(task_ranking[:5])

For more information on how to use ESMs, please have a look at the official GitHub repository.

Citation

If you are using this Embedding Space Map, please cite our paper.

BibTeX:

@misc{schulte2024moreparameterefficientselectionintermediate,
      title={Less is More: Parameter-Efficient Selection of Intermediate Tasks for Transfer Learning}, 
      author={David Schulte and Felix Hamborg and Alan Akbik},
      year={2024},
      eprint={2410.15148},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2410.15148}, 
}

APA:

Schulte, D., Hamborg, F., & Akbik, A. (2024). Less is More: Parameter-Efficient Selection of Intermediate Tasks for Transfer Learning. arXiv preprint arXiv:2410.15148.
