davidschulte/ESM_joelniklaus__lextreme_swiss_judgment_prediction

Safetensors

embedding_space_map

BaseLM:bert-base-multilingual-uncased

davidschulte committed 13 days ago

Commit 3e6e3bd • 1 Parent(s): b5b77ba

Upload README.md with huggingface_hub

Files changed (1)

README.md +142 -5

README.md CHANGED

@@ -1,9 +1,146 @@
 ---
 tags:
-- model_hub_mixin
-- pytorch_model_hub_mixin
 ---
-This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
-- Library: [More Information Needed]
-- Docs: [More Information Needed]

 ---
+base_model: bert-base-multilingual-uncased
+datasets:
+- joelniklaus/lextreme
+license: apache-2.0
 tags:
+- embedding_space_map
+- BaseLM:bert-base-multilingual-uncased
 ---
+# ESM joelniklaus/lextreme
+<!-- Provide a quick summary of what the model is/does. -->
+## Model Details
+### Model Description
+<!-- Provide a longer summary of what this model is. -->
+ESM
+- **Developed by:** David Schulte
+- **Model type:** ESM
+- **Base Model:** bert-base-multilingual-uncased
+- **Intermediate Task:** joelniklaus/lextreme
+- **ESM architecture:** linear
+- **Language(s) (NLP):** [More Information Needed]
+- **License:** Apache-2.0 license
+## Training Details
+### Intermediate Task
+- **Task ID:** joelniklaus/lextreme
+- **Subset [optional]:** swiss_judgment_prediction
+- **Text Column:** input
+- **Label Column:** label
+- **Dataset Split:** train
+- **Sample size [optional]:** 10000
+- **Sample seed [optional]:** 42
+### Training Procedure [optional]
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+#### Language Model Training Hyperparameters [optional]
+- **Epochs:** 3
+- **Batch size:** 32
+- **Learning rate:** 2e-05
+- **Weight Decay:** 0.01
+- **Optimizer:** AdamW
+#### ESM Training Hyperparameters [optional]
+- **Epochs:** 10
+- **Batch size:** 32
+- **Learning rate:** 0.001
+- **Weight Decay:** 0.01
+- **Optimizer:** AdamW
+### Additional training details [optional]
+## Model evaluation
+### Evaluation of fine-tuned language model [optional]
+### Evaluation of ESM [optional]
+MSE:
+### Additional evaluation details [optional]
+## What are Embedding Space Maps?
+<!-- This section describes the evaluation protocols and provides the results. -->
+Embedding Space Maps (ESMs) are neural networks that approximate the effect of fine-tuning a language model on a task. They can be used to quickly transform embeddings from a base model to approximate how a fine-tuned model would embed the input text.
+ESMs can be used for intermediate task selection with the ESM-LogME workflow.
+## How can I use Embedding Space Maps for Intermediate Task Selection?
+[![PyPI version](https://img.shields.io/pypi/v/hf-dataset-selector.svg)](https://pypi.org/project/hf-dataset-selector)
+We release **hf-dataset-selector**, a Python package for intermediate task selection using Embedding Space Maps.
+**hf-dataset-selector** fetches ESMs for a given language model and uses them to find the best dataset for intermediate training before fine-tuning on the target task. ESMs are found by their tags on the Hugging Face Hub.
+```python
+from hfselect import Dataset, compute_task_ranking
+# Load target dataset from the Hugging Face Hub
+dataset = Dataset.from_hugging_face(
+    name="stanfordnlp/imdb",
+    split="train",
+    text_col="text",
+    label_col="label",
+    is_regression=False,
+    num_examples=1000,
+    seed=42
+)
+# Fetch ESMs and rank tasks
+task_ranking = compute_task_ranking(
+    dataset=dataset,
+    model_name="bert-base-multilingual-uncased"
+)
+# Display top 5 recommendations
+print(task_ranking[:5])
+```
+For more information on how to use ESMs, please have a look at the [official GitHub repository](https://github.com/davidschulte/hf-dataset-selector).
+## Citation
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+If you are using this Embedding Space Map, please cite our [paper](https://arxiv.org/abs/2410.15148).
+**BibTeX:**
+```
+@misc{schulte2024moreparameterefficientselectionintermediate,
+      title={Less is More: Parameter-Efficient Selection of Intermediate Tasks for Transfer Learning},
+      author={David Schulte and Felix Hamborg and Alan Akbik},
+      year={2024},
+      eprint={2410.15148},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL},
+      url={https://arxiv.org/abs/2410.15148},
+}
+```
+**APA:**
+```
+Schulte, D., Hamborg, F., & Akbik, A. (2024). Less is More: Parameter-Efficient Selection of Intermediate Tasks for Transfer Learning. arXiv preprint arXiv:2410.15148.
+```
+## Additional Information
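
The model card above lists the ESM architecture as "linear" and describes ESMs as networks that transform base-model embeddings to approximate those of a fine-tuned model. As a rough illustration of that idea only, the sketch below applies a single affine layer to a toy embedding and scores the result with mean squared error (the metric named under "Evaluation of ESM"). The function names, the toy weights, and the 3-dimensional vectors are assumptions made for this example; they are not taken from the hf-dataset-selector package or from this repository's weights.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def apply_linear_esm(embedding: Vector, weight: Matrix, bias: Vector) -> Vector:
    """Apply one affine layer (W @ x + b) to a base-model embedding.

    Hypothetical helper for illustration; not part of hf-dataset-selector.
    """
    if len(weight) != len(bias):
        raise ValueError("weight must have one row per bias entry")
    output = []
    for row, b in zip(weight, bias):
        if len(row) != len(embedding):
            raise ValueError("each weight row must match the embedding size")
        output.append(sum(w * x for w, x in zip(row, embedding)) + b)
    return output


def mean_squared_error(predicted: Vector, target: Vector) -> float:
    """Average squared difference between two equally sized vectors."""
    if len(predicted) != len(target):
        raise ValueError("vectors must have the same length")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)


# Toy 3-dimensional vectors; bert-base-multilingual-uncased uses 768 dimensions.
base_embedding = [1.0, 2.0, 3.0]

# Assumed (made-up) ESM parameters, standing in for learned weights.
weight = [
    [1.0, 0.0, 0.0],
    [0.0, 0.5, 0.0],
    [0.0, 0.0, 2.0],
]
bias = [0.5, 0.0, -1.0]

# Approximate how a fine-tuned model would embed the same input.
approx_embedding = apply_linear_esm(base_embedding, weight, bias)

# Made-up "true" fine-tuned embedding, used only to show the error computation.
fine_tuned_embedding = [1.5, 1.0, 4.0]
error = mean_squared_error(approx_embedding, fine_tuned_embedding)

print("approximated embedding:", approx_embedding)
print("MSE against fine-tuned embedding:", error)
```

Because the map is a single affine layer, transforming an embedding costs one matrix-vector product, which is far cheaper than running a fine-tuned language model over the input.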