You are viewing the main version, which requires installation from source. If you'd like
a regular pip install, check out the latest stable version (v0.4.0).
Loading methods
Methods for listing and loading evaluation modules:
List
evaluate.list_evaluation_modules
( module_type = None, include_community = True, with_details = False )
Parameters
- module_type (str, optional, defaults to None): Type of evaluation modules to list. Has to be one of 'metric', 'comparison', or 'measurement'. If None, all types are listed.
- include_community (bool, optional, defaults to True): Include community modules in the list.
- with_details (bool, optional, defaults to False): Return the full details on the metrics instead of only the ID.
List all evaluation modules available on the Hugging Face Hub.
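As a minimal sketch of how the three parameters above fit together, the wrapper below checks module_type against the documented values before calling evaluate.list_evaluation_modules. The names list_modules and VALID_MODULE_TYPES are hypothetical helpers introduced for illustration and are not part of the library. The evaluate import is deferred so that the validation step works without the package or network access.

```python
# Allowed values for module_type, as listed in the parameter description.
VALID_MODULE_TYPES = ("metric", "comparison", "measurement")


def list_modules(module_type=None, include_community=True, with_details=False):
    """Hypothetical wrapper: validate module_type, then list modules on the Hub."""
    if module_type is not None and module_type not in VALID_MODULE_TYPES:
        raise ValueError(
            f"module_type must be one of {VALID_MODULE_TYPES} or None, "
            f"got {module_type!r}"
        )
    # Imported lazily: the call below needs the `evaluate` package
    # and network access to the Hugging Face Hub.
    import evaluate

    return evaluate.list_evaluation_modules(
        module_type=module_type,
        include_community=include_community,
        with_details=with_details,
    )
```

For example, list_modules("metric", include_community=False) would request only the non-community metrics, while list_modules() would request every module type.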
Load
evaluate.load
( path: str, config_name: Optional = None, module_type: Optional = None, process_id: int = 0, num_process: int = 1, cache_dir: Optional = None, experiment_id: Optional = None, keep_in_memory: bool = False, download_config: Optional = None, download_mode: Optional = None, revision: Union = None, **init_kwargs )
Parameters
- path (str): Path to the evaluation processing script with the evaluation builder. Can be either:
  - a local path to the processing script or the directory containing the script (if the script has the same name as the directory), e.g. './metrics/rouge' or './metrics/rouge/rouge.py'
  - an evaluation module identifier on the Hugging Face evaluate repo, e.g. 'rouge' or 'bleu', located in either 'metrics/', 'comparisons/', or 'measurements/' depending on the provided module_type
- config_name (str, optional): Selects a configuration for the metric (e.g. the GLUE metric has a configuration for each subset).
- module_type (str, defaults to 'metric'): Type of evaluation module. Can be one of 'metric', 'comparison', or 'measurement'.
- process_id (int, optional): For distributed evaluation, the ID of the process.
- num_process (int, optional): For distributed evaluation, the total number of processes.
- cache_dir (str, optional): Path to store the temporary predictions and references (defaults to ~/.cache/huggingface/evaluate/).
- experiment_id (str): A specific experiment ID. It is used when several distributed evaluations share the same file system, and is useful for computing metrics in distributed setups (in particular non-additive metrics like F1).
- keep_in_memory (bool): Whether to store the temporary results in memory (defaults to False).
- download_config (evaluate.DownloadConfig, optional): Specific download configuration parameters.
- download_mode (DownloadMode, defaults to REUSE_DATASET_IF_EXISTS): Download/generate mode.
- revision (Union[str, evaluate.Version], optional): If specified, the module will be loaded from the datasets repository at this version. By default it is set to the local version of the lib. Specifying a version that is different from your local version of the lib might cause compatibility issues.
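The two forms accepted by path can be illustrated with a small classifier. This is only a sketch of the rules as documented above, not the library's actual resolution logic; the function classify_module_path and its return labels are hypothetical.

```python
import os


def classify_module_path(path: str) -> str:
    """Hypothetical helper: guess how a `path` argument would be interpreted."""
    # Case 1: a direct path to a processing script,
    # e.g. './metrics/rouge/rouge.py'.
    if path.endswith(".py") and os.path.isfile(path):
        return "local_script"

    # Case 2: a directory containing a script with the same name,
    # e.g. './metrics/rouge' holding './metrics/rouge/rouge.py'.
    name = os.path.basename(os.path.normpath(path))
    if os.path.isfile(os.path.join(path, name + ".py")):
        return "local_directory"

    # Otherwise assume a module identifier such as 'rouge' or 'bleu'.
    return "hub_identifier"
```

Checking the local file system first means an identifier like 'rouge' is classified as a Hub identifier only when no matching local script or directory exists; that ordering is an assumption of this sketch.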
Load an EvaluationModule.
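The distributed parameters can be sketched as follows, assuming each worker knows its own rank and the total number of workers. The helpers distributed_load_kwargs and load_bleu_for_process are hypothetical names introduced for illustration; only evaluate.load and its documented parameters come from the library. The evaluate import is deferred because loading needs the package plus either network access or a populated cache.

```python
from typing import Any, Dict, Optional


def distributed_load_kwargs(
    process_id: int,
    num_process: int,
    experiment_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build the distributed arguments for evaluate.load."""
    if num_process < 1:
        raise ValueError("num_process must be at least 1")
    if not 0 <= process_id < num_process:
        raise ValueError("process_id must be in the range [0, num_process)")
    kwargs: Dict[str, Any] = {
        "process_id": process_id,
        "num_process": num_process,
    }
    # Only pass experiment_id when given, e.g. when several distributed
    # evaluations share the same file system.
    if experiment_id is not None:
        kwargs["experiment_id"] = experiment_id
    return kwargs


def load_bleu_for_process(process_id: int, num_process: int, experiment_id: str):
    """Load the 'bleu' metric for one process of a distributed evaluation."""
    import evaluate  # lazy import: requires the `evaluate` package

    return evaluate.load(
        "bleu",
        module_type="metric",
        **distributed_load_kwargs(process_id, num_process, experiment_id),
    )
```

In a two-process run, each worker would call load_bleu_for_process with its own process_id (0 or 1), num_process=2, and the same experiment_id.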