README.md · osiria/diablo-italian-base-354m

metadata

license: mit
language:
  - it

Model: DIABLO 🔥
Lang: IT

Model description

This model is a causal language model for Italian, based on a GPT-like architecture [1]. More specifically, it was obtained by modifying Meta's XGLM architecture [2] and starting from its 564M checkpoint.

It is a foundation model, pre-trained for causal language modeling, so it is mainly suited to basic natural language generation. To use it on more specific downstream tasks, you will have to fine-tune it.

Quick usage

To use the model for inference, load it together with its tokenizer into a text-generation pipeline.
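A minimal sketch of such a pipeline, assuming the Hugging Face `transformers` library (with a PyTorch backend) is installed and that the checkpoint is published under the repository id `osiria/diablo-italian-base-354m`. The helper names `build_generator` and `generate`, the example prompt and the `max_new_tokens` value are illustrative choices, not part of the model card.

```python
# Assumption: repository id of this model on the Hugging Face Hub.
MODEL_ID = "osiria/diablo-italian-base-354m"


def build_generator(model_id: str = MODEL_ID):
    """Return a text-generation pipeline for the given checkpoint."""
    # Imported here so the heavy dependency is loaded only when needed.
    from transformers import pipeline

    # Downloads the weights and the tokenizer from the Hub on first use.
    return pipeline("text-generation", model=model_id)


def generate(prompt: str, max_new_tokens: int = 30, generator=None) -> str:
    """Continue an Italian prompt and return the generated text."""
    # Build the real pipeline unless a generator is passed in explicitly.
    generator = generator or build_generator()
    outputs = generator(prompt, max_new_tokens=max_new_tokens)
    # The pipeline returns a list of dicts, one per generated sequence.
    return outputs[0]["generated_text"]
```

Calling `generate("Ciao, mi chiamo Marco Rossi e")` downloads the checkpoint on first use and returns the prompt followed by the model's continuation. Since generation is probabilistic, the continuation may differ between runs.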

Limitations

The model might behave erratically when presented with prompts that are too far from its pre-training data and, because of the probabilistic nature of its generation, it might occasionally produce biased or offensive content with respect to gender, race, ideologies, and political or religious beliefs. Because of these limitations, the model and its outputs should be used with caution, and should not be used in situations that require the generated text to be fair or true.

References

[1] https://arxiv.org/abs/2005.14165

[2] https://arxiv.org/abs/2112.10668

License

The model is released under the MIT license.