metadata
language: zh
tags:
  - agriculture-domain
  - agriculture
widget:
  - text: '[MASK]是許多亞洲國家的主要糧食作物。'

agriculture-bert-base-chinese

This is a BERT model for the agriculture domain. It was trained with the self-supervised masked language modeling (MLM) objective.

  • Masked language modeling (MLM): given a sentence, the model randomly masks 15% of the words in the input, then runs the entire masked sentence through the model and has to predict the masked words.
  • This differs from traditional recurrent neural networks (RNNs), which usually see the words one after the other, and from autoregressive models like GPT, which internally mask the future tokens.
  • It allows the model to learn a bidirectional representation of the sentence.
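
The masking step described above can be sketched in a few lines. This is a simplified illustration, not the training code used for this model: it assumes character-level tokens (a rough stand-in for the Chinese BERT tokenizer), always substitutes `[MASK]` for the selected positions, and uses a hypothetical helper name `mask_tokens` and an illustrative example sentence.

```python
import random

MASK_TOKEN = "[MASK]"


def mask_tokens(tokens, mask_prob=0.15, seed=None):
    """Replace roughly `mask_prob` of the tokens with [MASK].

    Returns the masked token list and a dict mapping each masked
    position to the original token the model would have to predict.
    """
    rng = random.Random(seed)
    # Mask at least one token (when possible) so short inputs still give a target.
    n_to_mask = min(len(tokens), max(1, round(len(tokens) * mask_prob)))
    positions = set(rng.sample(range(len(tokens)), n_to_mask))
    masked = [MASK_TOKEN if i in positions else tok for i, tok in enumerate(tokens)]
    labels = {i: tokens[i] for i in positions}
    return masked, labels


# Illustrative sentence (an assumption, not taken from the training data).
sentence = "水稻是許多亞洲國家的主要糧食作物。"
tokens = list(sentence)  # assumption: one token per character
masked, labels = mask_tokens(tokens, seed=0)
print("".join(masked))
print(labels)
```

Because the model sees the whole masked sentence at once, it can use context on both sides of each `[MASK]` when predicting the entries in `labels`.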
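from transformers import pipeline

# Load the fill-mask pipeline with this model and its tokenizer.
fill_mask = pipeline(
    "fill-mask",
    model="gigilin7/agriculture-bert-base-chinese",
    tokenizer="gigilin7/agriculture-bert-base-chinese"
)
# Predict the token hidden behind [MASK].
res = fill_mask("[MASK]是許多亞洲國家的主要糧食作物。")
print(res)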
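
For an input with a single `[MASK]`, the `fill-mask` pipeline returns a list of dictionaries with the keys `score`, `token`, `token_str`, and `sequence`. A small helper like the one below can pull out the top candidates. The name `top_predictions` is hypothetical and only for illustration.

```python
def top_predictions(results, k=3):
    """Return the k highest-scoring (token_str, score) pairs.

    `results` is expected to be the list of dicts produced by a
    fill-mask pipeline call on an input with exactly one [MASK].
    """
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    return [(r["token_str"], r["score"]) for r in ranked[:k]]


# Usage with the `res` variable from the snippet above:
# for token, score in top_predictions(res):
#     print(f"{token}\t{score:.4f}")
```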