Adding Tasks
####################################
This tutorial shows how to add new machine learning tasks using the ``lavis.tasks`` module.
The LAVIS library includes a standard task module that centralizes the model training and evaluation procedure of machine learning tasks.
The ``lavis.tasks`` module is designed such that any new tasks can be added and integrated, catering to any customization in the training and testing procedures.
In this tutorial, we walk through the steps for adding a new task to LAVIS, using the `video-grounded dialogue task <https://arxiv.org/pdf/1901.09107.pdf>`_ as the example.
Base Task ``lavis.tasks.base_task``
********************************************************************************
Note that any new task definition should inherit the base task class ``BaseTask``:

.. code-block:: python

    import logging
    import os

    import torch.distributed as dist
    from lavis.common.dist_utils import get_rank, get_world_size, is_main_process
    from lavis.common.logger import MetricLogger, SmoothedValue
    from lavis.common.registry import registry
    from lavis.datasets.data_utils import prepare_sample

    class BaseTask:
        def __init__(self, **kwargs):
            super().__init__()

            self.inst_id_key = "instance_id"

        @classmethod
        def setup_task(cls, **kwargs):
            return cls()

        def build_model(self, cfg):
            model_config = cfg.model_cfg

            # Look up the model class by its registered architecture name.
            model_cls = registry.get_model_class(model_config.arch)
            return model_cls.from_config(model_config)

        def build_datasets(self, cfg):
            """
            Build a dictionary of datasets, keyed by split 'train', 'valid', 'test'.
            Download datasets and annotations automatically if they do not exist.

            Args:
                cfg (common.config.Config): configuration whose ``datasets_cfg`` entry is read here.

            Returns:
                dict: Dictionary of torch.utils.data.Dataset objects by split.
            """

            datasets = dict()

            datasets_config = cfg.datasets_cfg

            assert len(datasets_config) > 0, "At least one dataset has to be specified."

            for name in datasets_config:
                dataset_config = datasets_config[name]

                # Each dataset name maps to a registered builder class.
                builder = registry.get_builder_class(name)(dataset_config)
                dataset = builder.build_datasets()

                # Store what the builder returns under the dataset name.
                datasets[name] = dataset

            return datasets

        def train_step(self, model, samples):
            loss = model(samples)["loss"]
            return loss

        ...

The base task already declares and standardizes many common methods, such as ``train_step``, ``build_model``, and ``build_datasets``.
Inheriting this base task class allows us to standardize operations of tasks across all task classes.
We recommend that users not modify the base task class, since any change affects all existing task subclasses.
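The recommendation above can be pictured with a small sketch: task-specific behaviour goes into a subclass that overrides a method, while the base class stays untouched. The ``BaseTask`` below is a reduced stand-in containing only ``train_step`` so that the snippet runs without LAVIS installed; ``ScaledLossTask``, ``loss_scale``, and ``toy_model`` are hypothetical names used only for illustration.

```python
# Stand-in for lavis.tasks.base_task.BaseTask, reduced to the one method used
# here so the sketch runs without LAVIS installed.
class BaseTask:
    def train_step(self, model, samples):
        loss = model(samples)["loss"]
        return loss


class ScaledLossTask(BaseTask):
    """Hypothetical subclass: rescales the loss without editing BaseTask."""

    def __init__(self, loss_scale=0.5):
        super().__init__()
        self.loss_scale = loss_scale

    def train_step(self, model, samples):
        # Reuse the inherited behaviour, then apply the task-specific change.
        return self.loss_scale * super().train_step(model, samples)


def toy_model(samples):
    """Hypothetical model: any callable returning a dict with a "loss" entry."""
    return {"loss": float(len(samples))}


base_loss = BaseTask().train_step(toy_model, [1, 2, 3, 4])
scaled_loss = ScaledLossTask().train_step(toy_model, [1, 2, 3, 4])
```

Because the override lives in the subclass, every other task that inherits from the base class keeps its original ``train_step``.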
Dialogue Task ``lavis.tasks.dialogue``
********************************************************************************
In this step, we define a new task class for video-grounded dialogues, e.g. under ``lavis.tasks.dialogue``.
For instance, the new class ``DialogueTask`` inherits from the base class ``BaseTask``:

.. code-block:: python

    import json
    import os

    from lavis.common.dist_utils import main_process
    from lavis.common.logger import MetricLogger
    from lavis.common.registry import registry
    from lavis.tasks.base_task import BaseTask
    from lavis.datasets.data_utils import prepare_sample

    import numpy as np

    @registry.register_task("dialogue")
    class DialogueTask(BaseTask):
        def __init__(self, num_beams, max_len, min_len, evaluate, report_metric=True):
            super().__init__()

            self.num_beams = num_beams
            self.max_len = max_len
            self.min_len = min_len
            self.evaluate = evaluate

            self.report_metric = report_metric

        @classmethod
        def setup_task(cls, cfg):
            # Task-specific parameters come from the run section of the config.
            run_cfg = cfg.run_cfg

            num_beams = run_cfg.num_beams
            max_len = run_cfg.max_len
            min_len = run_cfg.min_len
            evaluate = run_cfg.evaluate

            report_metric = run_cfg.get("report_metric", True)

            return cls(
                num_beams=num_beams,
                max_len=max_len,
                min_len=min_len,
                evaluate=evaluate,
                report_metric=report_metric,
            )

        def valid_step(self, model, samples):
            # Return the batch loss as a one-element list.
            loss = model(samples)["loss"].item()

            return [loss]

        ...

For any new task, we advise users to carefully review the methods implemented in ``BaseTask`` and decide which ones should be overridden.
For instance, the base task class already contains a standard implementation of the training steps that are common among machine learning tasks.
The main methods that each task should customize are ``valid_step`` and ``evaluation``.
These operations are not fully implemented in the base task class because evaluation procedures differ widely among machine learning tasks.
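As a rough illustration of how ``valid_step`` and ``evaluation`` can fit together, the sketch below collects the per-batch output of ``valid_step`` into one list and averages it afterwards. It is a simplified stand-in, not the LAVIS implementation: ``ToyDialogueTask``, ``toy_model``, the plain-list data loader, and the averaging step are all assumptions made for the example.

```python
class ToyDialogueTask:
    """Stand-in task, not the LAVIS class: only the evaluation path is sketched."""

    def valid_step(self, model, samples):
        # One validation step returns a list holding the batch loss.
        loss = model(samples)["loss"]
        return [loss]

    def evaluation(self, model, data_loader):
        # Assumed aggregation: gather per-batch outputs into one flat list.
        results = []
        for samples in data_loader:
            results.extend(self.valid_step(model=model, samples=samples))
        return results


def toy_model(samples):
    """Hypothetical model: the "loss" is the mean of the numbers in the batch."""
    return {"loss": sum(samples) / len(samples)}


batches = [[1.0, 3.0], [2.0, 6.0]]  # stand-in for a data loader
losses = ToyDialogueTask().evaluation(toy_model, batches)
average_loss = sum(losses) / len(losses)
```

Returning a list from ``valid_step`` keeps the aggregation loop generic: a task that produces several results per batch can return them all without changing ``evaluation``.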
Another method to consider is ``setup_task``.
It receives the configuration and extracts the task-specific parameters needed to initialize a task instance.
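The pattern behind ``setup_task`` can be sketched in isolation: required parameters are read directly from the run configuration, and optional ones fall back to a default through ``get``. ``RunConfig`` below is a hypothetical stand-in for ``cfg.run_cfg`` (a dict with attribute access), and ``ToyTask`` is not a LAVIS class; for brevity, this sketch passes the run configuration directly instead of the full ``cfg`` object.

```python
class RunConfig(dict):
    """Hypothetical stand-in for ``cfg.run_cfg``: a dict with attribute access."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so dict.get still works.
        try:
            return self[name]
        except KeyError as err:
            raise AttributeError(name) from err


class ToyTask:
    def __init__(self, num_beams, max_len, min_len, report_metric=True):
        self.num_beams = num_beams
        self.max_len = max_len
        self.min_len = min_len
        self.report_metric = report_metric

    @classmethod
    def setup_task(cls, run_cfg):
        return cls(
            num_beams=run_cfg.num_beams,  # required: fails loudly if missing
            max_len=run_cfg.max_len,
            min_len=run_cfg.min_len,
            report_metric=run_cfg.get("report_metric", True),  # optional
        )


run_cfg = RunConfig(num_beams=3, max_len=20, min_len=5)
task = ToyTask.setup_task(run_cfg)
```

Reading required values as attributes makes a missing key fail at setup time rather than later during generation, while ``get`` keeps optional settings out of every configuration file.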
Registering New Task ``lavis.tasks.__init__``
********************************************************************************
Any new task must be registered as part of the ``lavis.tasks`` module. For instance, to add a new task for video-grounded dialogues, we can modify ``__init__.py`` as follows:

.. code-block:: python

    from lavis.tasks.dialogue import DialogueTask

    ...
    __all__ = [
        ...
        "DialogueTask"
    ]

Assigning Task
***************
As the task class example above shows, each task class defines a ``setup_task`` method.
This method processes the configuration and passes task-specific parameters, e.g. ``num_beams`` (used for beam search in generative tasks at inference time), to initialize the task class properly.
To select a task, the configuration file must name the task under which the class was registered.
For instance, the following should be specified in a configuration file, e.g. ``dialogue_avsd_ft.yaml``:

.. code-block:: yaml

    run:
      task: dialogue # name of the task

      # optimizer
      ...

      max_len: 20
      min_len: 5
      num_beams: 3
      ...

Subsequently, any process (e.g. training) should load this configuration file to pick up the correct task.

.. code-block:: sh

    python train.py --cfg-path dialogue_avsd_ft.yaml
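How a task name such as ``dialogue`` in the configuration can lead to the right task class may be easier to see with a toy registry. The sketch below is a simplified illustration under stated assumptions, not the LAVIS registry: ``ToyRegistry``, ``ToyDialogueTask``, and ``setup_task_from_config`` are hypothetical, and the run configuration is a plain dict.

```python
class ToyRegistry:
    """Simplified name-to-class registry (illustrative, not the LAVIS code)."""

    def __init__(self):
        self._tasks = {}

    def register_task(self, name):
        def decorator(task_cls):
            if name in self._tasks:
                raise KeyError(f"Task '{name}' is already registered.")
            self._tasks[name] = task_cls
            return task_cls

        return decorator

    def get_task_class(self, name):
        # Returns None when no class was registered under this name.
        return self._tasks.get(name)


registry = ToyRegistry()


@registry.register_task("dialogue")
class ToyDialogueTask:
    def __init__(self, num_beams):
        self.num_beams = num_beams

    @classmethod
    def setup_task(cls, run_cfg):
        return cls(num_beams=run_cfg["num_beams"])


def setup_task_from_config(run_cfg):
    """Resolve the task name from the config, then let the class configure itself."""
    task_cls = registry.get_task_class(run_cfg["task"])
    if task_cls is None:
        raise ValueError(f"Task '{run_cfg['task']}' has not been registered.")
    return task_cls.setup_task(run_cfg)


task = setup_task_from_config({"task": "dialogue", "num_beams": 3})
```

The decorator only runs when the module defining the class is imported, which is one reason the new task has to be imported in ``__init__.py`` before its name can be used in a configuration file.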