---
license: mit
datasets:
- squad
language:
- en
pipeline_tag: text-classification
widget:
- text: "question: What number comes after five? answer: four"
- text: "question: Which person is associated with Kanye West? answer: a tree"
- text: "question: When is US independence day from aliens? answer: 7/4/1996"
---

# kgourgou/bert-base-uncased-QA-classification

An experiment in classifying whether a (question, answer) pair is valid. This is not a very good model at this point, but eventually such a model could help with RAG. For a stronger model, check this one by [vectara](https://huggingface.co/vectara/hallucination_evaluation_model).

Input must be formatted as

```
question: {your query}? answer: {your possible answer}
```

The output probabilities are for

1. class 0 = the answer string couldn't be an answer to the question, and
2. class 1 = the answer string could be an answer to the question.

"Could be" should be interpreted as a type match, e.g., whether the question requires the answer to be a person, a number, or a date.

Examples:

- "question: What number comes after five? answer: four" → this should be class 1, as the answer is a number (even if it's not the right number).
- "question: Which person is associated with Kanye West? answer: a tree" → this should be class 0, as a tree is not a person.

## Base model details

The base model is bert-base-uncased. For this experiment, I only use the "squad" dataset, after preprocessing it to bring it to the required format.
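
## Usage

A minimal usage sketch with the `transformers` text-classification pipeline. The exact label names returned (e.g. `LABEL_0` / `LABEL_1`) depend on the model config and are an assumption here; interpret them as class 0 / class 1 as described above.

```python
# Minimal sketch, assuming the model works with the standard
# text-classification pipeline and returns LABEL_0 / LABEL_1,
# corresponding to class 0 / class 1 described above.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="kgourgou/bert-base-uncased-QA-classification",
)

# Inputs must follow the "question: ...? answer: ..." format.
examples = [
    "question: What number comes after five? answer: four",
    "question: Which person is associated with Kanye West? answer: a tree",
]

for text in examples:
    result = classifier(text)[0]
    print(text, "->", result["label"], round(result["score"], 3))
```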