---
language: "en"
tags:
- dstc10
- knowledge title-body validation
widget:
- text: "Can you accommodate large groups? It does not offer free WiFi."
- text: "Is there a gym on site? It does not have an onsite fitness center."
---

This model is used for knowledge clustering: given a title-body pair, the classifier predicts whether the pair is valid.

For further information, see the GitHub repository at https://github.com/yctam/dstc10_track2_task2.

Credit: Jiakai Zou, Wilson Tam

---

```python
import torch
from transformers import AutoConfig, AutoTokenizer, AutoModelForSequenceClassification


def single_test(model, tokenizer, title_body_pair):
    # Encode the (title, body) tuple as a single sentence-pair input.
    result = tokenizer([title_body_pair], return_tensors="pt")
    model.eval()
    with torch.no_grad():
        outputs = model(**result)
    predictions = outputs.logits.argmax(dim=-1)
    # The labels were flipped by mistake, so class 0 means the pair is valid.
    return bool(predictions.item() == 0)


if __name__ == '__main__':
    model_name = "wilsontam/bert-base-uncased-dstc10-kb-title-body-validate"
    config = AutoConfig.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)
    # Load the weights by model name; pass a local directory path instead to use a local copy.
    model = AutoModelForSequenceClassification.from_pretrained(model_name, config=config)

    sentence = "Can I check in anytime?"
    body = "Yes, 24 Hours Front Desk Available."
    print(single_test(model, tokenizer, (sentence, body)))  # Expect: True
```
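
Because the snippet above treats class index 0 as the "valid" label, it can help to isolate that convention in one place when a confidence score is wanted alongside the decision. The following is a minimal sketch in plain Python, not part of the original repository: the helper name `decode_validity`, the constant `VALID_CLASS_INDEX`, and the sample logits are all hypothetical, and the two-class layout is assumed from the snippet above.

```python
import math

# Assumption taken from the snippet above: class 0 means the pair is valid.
VALID_CLASS_INDEX = 0


def decode_validity(logits):
    """Return (is_valid, p_valid) for the raw class logits of one title-body pair."""
    # Numerically stable softmax over the class scores.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    p_valid = probs[VALID_CLASS_INDEX]
    # The argmax over probabilities equals the argmax over logits.
    is_valid = probs.index(max(probs)) == VALID_CLASS_INDEX
    return is_valid, p_valid


if __name__ == "__main__":
    # Hypothetical logits for illustration; these were not produced by the model.
    print(decode_validity([2.0, -1.0]))
    print(decode_validity([-0.5, 1.5]))
```

With the real model, the input would be something like `outputs.logits[0].tolist()`. Keeping the flipped-label convention in a single constant means only one line changes if the labels are ever corrected.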