As you can see, we have two unmatched entities: "January 18" and "U.S.". The first is a hallucinated entity in the summary that does not exist in the article.
Deep-learning-based generation is [prone to hallucinate](https://arxiv.org/pdf/2202.03629.pdf) unintended text. These hallucinations degrade
system performance and fail to meet user expectations in many real-world scenarios. By applying entity matching, we can mitigate this problem
for the downstream task of summary generation. The second entity, "U.S.", **does** occur in the article, but written as "US" rather than "U.S.". This could be solved
by comparing against a list of abbreviations or by using a dedicated embedder for abbreviations, but neither is currently implemented.
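
To make the abbreviation-list idea concrete, here is a minimal sketch of how entities could be normalized before matching. It is not part of the current implementation: the `ABBREVIATIONS` table and the helper names `normalize_entity` and `unmatched_entities` are hypothetical, and plain string comparison stands in for whatever matching the pipeline actually uses.

```python
import re

# Hypothetical abbreviation table (an assumption for illustration);
# a real list would be far larger.
ABBREVIATIONS = {
    "us": "united states",
    "usa": "united states",
    "uk": "united kingdom",
}


def normalize_entity(entity: str) -> str:
    """Drop periods, collapse whitespace, lowercase, then expand known abbreviations."""
    key = re.sub(r"\s+", " ", entity.replace(".", "")).strip().lower()
    return ABBREVIATIONS.get(key, key)


def unmatched_entities(summary_entities, article_entities):
    """Return the summary entities that have no normalized match in the article."""
    article_keys = {normalize_entity(e) for e in article_entities}
    return [e for e in summary_entities if normalize_entity(e) not in article_keys]


# "U.S." and "US" normalize to the same key, so only the
# hallucinated date would remain unmatched.
remaining = unmatched_entities(["January 18", "U.S."], ["US"])
```

With this normalization, "U.S." in the summary would match "US" in the article, while "January 18" would still be flagged. Abbreviations missing from the table would continue to slip through, which is where an embedding-based approach could help.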