Utilizing the @auto_docstring Decorator
The `@auto_docstring` decorator in the Hugging Face Transformers library helps generate docstrings for model classes and their methods, which will be used to build the documentation for the library. It aims to improve consistency and reduce boilerplate by automatically including standard argument descriptions and allowing for targeted overrides and additions.
📜 How it Works
The `@auto_docstring` decorator constructs docstrings by:
- Signature Inspection: It inspects the signature (arguments, types, defaults) of the decorated class’s `__init__` method or the decorated function.
- Centralized Docstring Fetching: It retrieves predefined docstrings for common arguments (e.g., `input_ids`, `attention_mask`) from internal library sources (like `ModelArgs` or `ImageProcessorArgs` in `utils/args_doc.py`).
- Overriding or Adding Argument Descriptions:
  - Direct Docstring Block: It incorporates custom docstring content from an `r""" """` (or `""" """`) block below the method signature or within the `__init__` docstring. This is for documenting new arguments or overriding standard descriptions.
  - Decorator Arguments (`custom_args`): A `custom_args` docstring block can be passed to the decorator to provide docstrings for specific arguments directly in the decorator call. This lets you define the docstring block once for new arguments that are repeated in multiple places in the modeling file.
- Adding Class and Function Introductions:
  - `custom_intro` argument: Allows prepending a custom introductory paragraph to a class or function docstring.
  - Automatic Introduction Generation: For model classes with standard naming patterns (like `ModelForCausalLM`) or belonging to a pipeline, the decorator automatically generates an appropriate introductory paragraph using `ClassDocstring` in `utils/args_doc.py` as the source.
- Templating: The decorator uses a templating system, allowing predefined docstrings to include dynamic information deduced from the `auto_modules` of the library, such as `{{processor_class}}` or `{{config_class}}`.
- Deducing Relevant Examples: The decorator attempts to find appropriate usage examples based on the model’s task or pipeline compatibility. It extracts checkpoint information from the model’s configuration class to provide concrete examples with real model identifiers.
- Adding Return Value Documentation: For methods like `forward`, the decorator can automatically generate the “Returns” section based on the method’s return type annotation. For example, for a method returning a `ModelOutput` subclass, it extracts field descriptions from that class’s docstring to create a comprehensive return value description. A custom `Returns` section can also be manually specified in the function docstring block.
- Unrolling Kwargs Typed With Unpack Operator: For specific methods (defined in `UNROLL_KWARGS_METHODS`) or classes (defined in `UNROLL_KWARGS_CLASSES`), the decorator processes `**kwargs` parameters that are typed with `Unpack[KwargsTypedDict]`. It extracts the documentation from the TypedDict and adds each parameter to the function’s docstring. Currently, this functionality is only supported for `FastImageProcessorKwargs`.
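To make the first three steps concrete, here is a minimal, self-contained sketch of the general idea: inspect the signature, look up shared descriptions, and let a local docstring block win. It is not the library's implementation. `toy_auto_docstring` and `STANDARD_ARGS` are hypothetical names invented for this illustration, and the output layout is simplified.

```python
import inspect
import textwrap

# Hypothetical stand-in for the central descriptions kept in utils/args_doc.py.
STANDARD_ARGS = {
    "input_ids": "Indices of input sequence tokens in the vocabulary.",
    "attention_mask": "Mask to avoid performing attention on padding token indices.",
}


def toy_auto_docstring(func):
    """Build a docstring from the signature, central entries, and the local block."""
    local_block = textwrap.dedent(func.__doc__ or "").strip()
    # Names documented locally: unindented lines shaped like "name (`type`...):".
    local_names = {
        line.split(" (", 1)[0]
        for line in local_block.splitlines()
        if line and not line[0].isspace() and " (" in line
    }

    lines = ["Args:"]
    for name, param in inspect.signature(func).parameters.items():
        if name == "self" or name in local_names:
            continue  # local descriptions take precedence and are appended below
        if name in STANDARD_ARGS:
            annotation = getattr(param.annotation, "__name__", str(param.annotation))
            lines.append(f"    {name} (`{annotation}`):")
            lines.append(f"        {STANDARD_ARGS[name]}")
    if local_block:
        lines.append(textwrap.indent(local_block, "    "))
    func.__doc__ = "\n".join(lines)
    return func


@toy_auto_docstring
def forward(input_ids: list = None, attention_mask: list = None, new_custom_argument: int = 0):
    r"""
    new_custom_argument (`int`, *optional*, defaults to 0):
        Description of this new custom argument.
    """
    return input_ids


print(forward.__doc__)
```

The printed docstring lists the two standard arguments first, followed by the locally documented one. The real decorator also handles introductions, templating, examples, and return values, which this sketch leaves out.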
🚀 How to Use @auto_docstring
1. Importing the Decorator
Import the decorator into your modeling file:
from ...utils import auto_docstring
2. Applying to Classes
Place `@auto_docstring` directly above the class definition. It uses the `__init__` method’s signature and its docstring for parameter descriptions.
from transformers.modeling_utils import PreTrainedModel
from ...utils import auto_docstring

@auto_docstring
class MyAwesomeModel(PreTrainedModel):
    def __init__(self, config, custom_parameter: int = 10, another_custom_arg: str = "default"):
        r"""
        custom_parameter (`int`, *optional*, defaults to 10):
            Description of the custom_parameter for MyAwesomeModel.
        another_custom_arg (`str`, *optional*, defaults to "default"):
            Documentation for another unique argument.
        """
        super().__init__(config)
        self.custom_parameter = custom_parameter
        self.another_custom_arg = another_custom_arg
        # ... rest of your init

    # ... other methods
Advanced Class Decoration:
Arguments can be passed directly to `@auto_docstring` for more control:
@auto_docstring(
    custom_intro="""This model performs specific synergistic operations.
    It builds upon the standard Transformer architecture with unique modifications.""",
    custom_args="""
    custom_parameter (`type`, *optional*, defaults to `default_value`):
        A concise description for custom_parameter if not defined or overriding the description in `args_doc.py`.
    internal_helper_arg (`type`, *optional*, defaults to `default_value`):
        A concise description for internal_helper_arg if not defined or overriding the description in `args_doc.py`.
    """
)
class MySpecialModel(PreTrainedModel):
    def __init__(self, config: ConfigType, custom_parameter: "type" = "default_value", internal_helper_arg=None):
        # ...
Or:
@auto_docstring(
    custom_intro="""This model performs specific synergistic operations.
    It builds upon the standard Transformer architecture with unique modifications.""",
)
class MySpecialModel(PreTrainedModel):
    def __init__(self, config: ConfigType, custom_parameter: "type" = "default_value", internal_helper_arg=None):
        r"""
        custom_parameter (`type`, *optional*, defaults to `default_value`):
            A concise description for custom_parameter if not defined or overriding the description in `args_doc.py`.
        internal_helper_arg (`type`, *optional*, defaults to `default_value`):
            A concise description for internal_helper_arg if not defined or overriding the description in `args_doc.py`.
        """
        # ...
3. Applying to Functions (e.g., forward method)
Apply the decorator above method definitions, such as the `forward` method.
    @auto_docstring
    def forward(
        self,
        input_ids: Optional[torch.Tensor] = None,
        attention_mask: Optional[torch.Tensor] = None,
        new_custom_argument: Optional[torch.Tensor] = None,
        arg_documented_in_args_doc: Optional[torch.Tensor] = None,
        # ... other arguments
    ) -> Union[Tuple, ModelOutput]:  # The description of the return value will automatically be generated from the ModelOutput class docstring.
        r"""
        new_custom_argument (`torch.Tensor`, *optional*):
            Description of this new custom argument and its expected shape or type.
        """
        # ...
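The automatic “Returns” section mentioned in the comment above can be pictured with a small sketch. This is an illustration under stated assumptions, not the decorator's actual code: `ToyModelOutput` and `build_returns_section` are hypothetical names, and the sketch only handles a plain class annotation rather than a `Union[Tuple, ModelOutput]`.

```python
import inspect
import textwrap
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyModelOutput:
    r"""
    logits (`list`):
        Prediction scores for each position.
    loss (`float`, *optional*):
        Loss value, returned when `labels` is provided.
    """

    logits: Optional[list] = None
    loss: Optional[float] = None


def build_returns_section(func) -> str:
    """Create a "Returns:" block from the docstring of the annotated return class."""
    return_type = inspect.signature(func).return_annotation
    if return_type is inspect.Signature.empty or not inspect.isclass(return_type):
        return ""  # nothing to document automatically in this simplified sketch
    fields_doc = inspect.cleandoc(return_type.__doc__ or "")
    header = f"Returns:\n    `{return_type.__name__}`:"
    return header + "\n" + textwrap.indent(fields_doc, " " * 8)


def forward(input_ids: Optional[list] = None) -> ToyModelOutput:
    return ToyModelOutput(logits=input_ids)


print(build_returns_section(forward))
```

The field descriptions are written once on the output class and reused by every method that returns it, which is the motivation for generating this section rather than writing it by hand.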
Advanced Function Decoration:
Arguments can be passed directly to `@auto_docstring` for more control. `Returns` and `Examples` sections can also be manually specified:
MODEL_COMMON_CUSTOM_ARGS = r"""
    common_arg_1 (`torch.Tensor`, *optional*, defaults to `default_value`):
        Description of common_arg_1
    common_arg_2 (`torch.Tensor`, *optional*, defaults to `default_value`):
        Description of common_arg_2
    ...
"""

class MyModel(PreTrainedModel):
    # ...
    @auto_docstring(
        custom_intro="""
        This is a custom introduction for the function.
        """,
        custom_args=MODEL_COMMON_CUSTOM_ARGS
    )
    def forward(
        self,
        input_ids: Optional[torch.Tensor] = None,
        attention_mask: Optional[torch.Tensor] = None,
        common_arg_1: Optional[torch.Tensor] = None,
        common_arg_2: Optional[torch.Tensor] = None,
        #...
        function_specific_argument: Optional[torch.Tensor] = None,
        # ... other arguments
    ) -> torch.Tensor:
        r"""
        function_specific_argument (`torch.Tensor`, *optional*):
            Description of an argument specific to this function

        Returns:
            `torch.Tensor`: For a function returning a generic type, a custom "Returns" section can be specified.

        Example:
            (To override the default example with a custom one or to add an example for a model class that does not have a pipeline)

        ```python
        ...
        ```
        """
        # ...
✍️ Documenting Arguments: Approach & Priority
Standard Arguments (e.g., `input_ids`, `attention_mask`, `pixel_values`, `encoder_hidden_states`, etc.): `@auto_docstring` retrieves descriptions from a central source. Do not redefine these locally if their description and shape are the same as in `args_doc.py`.
New or Custom Arguments:
- Primary Method: Document these within an `r""" """` docstring block following the signature (for functions) or in the `__init__` method’s docstring (for class parameters).
- Format:
    argument_name (`type`, *optional*, defaults to `X`):
        Description of the argument. Explain its purpose, expected shape/type if complex, and default behavior. This can span multiple lines.
  - Include `type` in backticks.
  - Add “optional” if the argument is not required (has a default value).
  - Add “defaults to `X`” if it has a default value (no need to specify “defaults to `None`” if the default value is `None`).
Overriding Standard Arguments:
- If a standard argument behaves differently (e.g., different expected shape, model-specific behavior), provide its complete description in the local `r""" """` docstring. This local definition takes precedence.
- The `labels` argument is often customized per model and typically requires a specific docstring.
Using Decorator Arguments for Overrides or New Arguments (`custom_args`):
- New or custom argument docstrings can also be passed to `@auto_docstring` as a `custom_args` argument. This lets you define the docstring block once for new arguments that are repeated in multiple places in the modeling file.
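As a quick way to sanity-check that an entry follows the header format described above, the following sketch parses a header line with a regular expression. The pattern and the `parse_arg_header` helper are assumptions made for this illustration. They are deliberately narrow (for example, default values containing parentheses are not handled) and are not taken from the library.

```python
import re
from typing import Optional

# Matches headers such as: name (`type`, *optional*, defaults to `X`):
ARG_HEADER = re.compile(
    r"^(?P<name>\w+) \(`(?P<type>[^`]+)`"
    r"(?P<optional>, \*optional\*)?"
    r"(?:, defaults to `?(?P<default>[^`)]+)`?)?\):$"
)


def parse_arg_header(line: str) -> Optional[dict]:
    """Split one argument header into its parts, or return None if it does not match."""
    match = ARG_HEADER.match(line.strip())
    if match is None:
        return None
    return {
        "name": match["name"],
        "type": match["type"],
        "optional": match["optional"] is not None,
        "default": match["default"],  # None when no "defaults to" clause is present
    }


print(parse_arg_header("custom_parameter (`int`, *optional*, defaults to 10):"))
print(parse_arg_header("new_custom_argument (`torch.Tensor`, *optional*):"))
```

A description line, which has no parenthesized type, returns `None`, so the same helper can be used to tell headers apart from description text.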
Usage with modular files
When working with modular files, follow these guidelines for applying the `@auto_docstring` decorator:
For standalone models in modular files: Apply the `@auto_docstring` decorator just as you would in regular modeling files.
For models inheriting from other library models:
- When inheriting from a parent model, decorators (including `@auto_docstring`) are automatically carried over to the generated modeling file without needing to add them in your modular file.
- If you need to modify the `@auto_docstring` behavior, apply the customized decorator in your modular file, making sure to include all other decorators that were present on the original function/class.
Warning: When overriding any decorator in a modular file, you must include ALL decorators that were applied to that function/class in the parent model. If you only override some decorators, the others won’t be included in the generated modeling file.
Note: The `check_auto_docstrings` tool doesn’t check modular files directly, but it will check (and modify when using `--fix_and_overwrite`) the generated modeling files. If issues are found in the generated files, you’ll need to update your modular files accordingly.
✅ Checking Your Docstrings with check_auto_docstrings
The library includes a utility script to validate docstrings. This check is typically run during Continuous Integration (CI).
What it Checks:
- Decorator Presence: Ensures `@auto_docstring` is applied to relevant model classes and public methods. (TODO)
- Argument Completeness & Consistency:
  - Flags arguments in the signature that are not known standard arguments and lack a local description.
  - Ensures documented arguments exist in the signature. (TODO)
  - Verifies that types and default values in the docstring match the signature. (TODO)
- Placeholder Detection: Reminds you to complete placeholders like `<fill_type>` or `<fill_docstring>`.
- Formatting: Adherence to the expected docstring style.
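Two of the checks listed above, undocumented non-standard arguments and leftover placeholders, can be sketched in a few lines. This is a simplified illustration, not the logic of the real utility: `check_docstring` and `KNOWN_STANDARD_ARGS` are hypothetical names, and the set of standard arguments is a made-up subset.

```python
import inspect

# Hypothetical subset of the argument names documented centrally.
KNOWN_STANDARD_ARGS = {"input_ids", "attention_mask"}
PLACEHOLDERS = ("<fill_type>", "<fill_docstring>")


def check_docstring(func) -> list:
    """Return human-readable problems found in a function's local docstring block."""
    doc = inspect.cleandoc(func.__doc__ or "")
    # Argument headers are the unindented lines shaped like "name (`type`...):".
    documented = {
        line.split(" (", 1)[0]
        for line in doc.splitlines()
        if line and not line[0].isspace() and " (" in line
    }
    problems = []
    for name in inspect.signature(func).parameters:
        if name in ("self", "args", "kwargs"):
            continue
        if name not in KNOWN_STANDARD_ARGS and name not in documented:
            problems.append(f"missing description: {name}")
    for placeholder in PLACEHOLDERS:
        if placeholder in doc:
            problems.append(f"unfilled placeholder: {placeholder}")
    return problems


def forward(self, input_ids=None, new_custom_argument=None, undocumented_argument=None):
    r"""
    new_custom_argument (`<fill_type>`, *optional*):
        Description of this new custom argument.
    """


print(check_docstring(forward))
# ['missing description: undocumented_argument', 'unfilled placeholder: <fill_type>']
```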
Running the Check Locally:
Run this check locally before committing. The common command is:
make fix-copies
Alternatively, to only perform docstrings and auto-docstring checks, you can use:
python utils/check_docstrings.py # to only check files included in the diff without fixing them
# Or: python utils/check_docstrings.py --fix_and_overwrite # to fix and overwrite the files in the diff
# Or: python utils/check_docstrings.py --fix_and_overwrite --check_all # to fix and overwrite all files
Workflow with the Checker:
- Add `@auto_docstring(...)` to the class or method.
- For new, custom, or overridden arguments, add descriptions in an `r""" """` block.
- Run `make fix-copies` (or the `check_docstrings.py` utility).
  - For unrecognized arguments lacking documentation, the utility will create placeholder entries.
- Manually edit these placeholders with accurate types and descriptions.
- Re-run the check to ensure all issues are resolved.
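To give a feel for the placeholder step in this workflow, the sketch below emits placeholder entries for arguments that have no description yet. The exact text the real utility writes may differ. The entry layout, the `*optional*` marker, and the `placeholder_entries` helper are assumptions made for this illustration, built around the `<fill_type>` and `<fill_docstring>` markers named earlier.

```python
import inspect


def placeholder_entries(func, already_documented: set) -> str:
    """Emit one placeholder entry per argument that has no description yet."""
    entries = []
    for name, param in inspect.signature(func).parameters.items():
        if name == "self" or name in already_documented:
            continue
        # Assumption: arguments with a default value are marked as optional.
        optional = ", *optional*" if param.default is not inspect.Parameter.empty else ""
        entries.append(f"{name} (`<fill_type>`{optional}):\n    <fill_docstring>")
    return "\n".join(entries)


def forward(self, required_argument, input_ids=None, new_custom_argument=None):
    ...


print(placeholder_entries(forward, already_documented={"input_ids"}))
```

Each generated entry still has to be edited by hand, replacing both markers with the real type and description before re-running the check.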
🔑 Key Takeaways & Best Practices
- Use `@auto_docstring` for new PyTorch model classes (`PreTrainedModel` subclasses) and their primary methods (e.g., `forward`, `get_text_features`, etc.).
- For classes, the `__init__` method’s docstring is the main source for parameter descriptions when using `@auto_docstring` on the class.
- Rely on standard docstrings; do not redefine common arguments unless their behavior is different in your specific model.
- Document new or custom arguments clearly.
- Run `check_docstrings` locally and iteratively.
By following these guidelines, you help maintain consistent and informative documentation for the Hugging Face Transformers library 🤗.