Factual Consistency Evaluation for Text Summarization via Counterfactual Estimation
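Title: Factual Consistency Evaluation for Text Summarization via Counterfactual Estimation
Speaker: Fei Sun (孙飞)
Speaker bio: Fei Sun graduated from the Institute of Computing Technology, Chinese Academy of Sciences, and currently works on research and development in recommender systems, natural language processing, and related areas at the Intelligent Computing Lab of Alibaba DAMO Academy. Main research interests: representation learning of user behavior sequences and privacy protection in recommender systems; text representation learning and text summarization. Sun has published more than 40 papers at top conferences and journals such as ACL, SIGIR, WWW, and TOIS, received the Best Long Paper Runner-up award at RecSys 2019, and has more than 1,000 citations on Google Scholar. Sun has served as a PC member and senior PC member for international conferences including ACL, SIGIR, WWW, and IJCAI.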
Abstract: Although significant progress has been achieved in text summarization, factual inconsistency in generated summaries still severely limits its practical applications. Among the key factors in ensuring factual consistency, a reliable automatic evaluation metric is the first and the most crucial one. However, existing metrics either neglect the intrinsic cause of factual inconsistency or rely on auxiliary tasks, which leads to unsatisfactory correlation with human judgments or makes them inconvenient to use in practice. In light of these challenges, we propose a novel metric that evaluates factual consistency in text summarization via counterfactual estimation, formulating the causal relationship among the source document, the generated summary, and the language prior. We remove the effect of the language prior, which can cause factual inconsistency, from the total causal effect on the generated summary, and provide a simple yet effective way to evaluate consistency without relying on other auxiliary tasks. We conduct a series of experiments on three public abstractive text summarization datasets and demonstrate the advantages of the proposed metric in both correlation with human judgments and convenience of use.
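As a rough illustration of the idea in the abstract (subtracting the language-prior effect from the total effect on the summary), the sketch below averages per-token probability differences. It is only a toy under stated assumptions, not the speaker's implementation: the function name, the use of "source withheld" probabilities as a stand-in for the language prior, and all numbers are hypothetical.

```python
from typing import Sequence


def counterfactual_consistency_score(
    p_with_source: Sequence[float],
    p_prior_only: Sequence[float],
) -> float:
    """Average, over summary tokens, of the probability attributable to the source.

    p_with_source[i]: probability a scoring model assigns to summary token i
        given the source document and the preceding summary tokens.
    p_prior_only[i]: probability of the same token when the source is withheld,
        used here as a stand-in for the language prior (an assumption).
    """
    if len(p_with_source) != len(p_prior_only):
        raise ValueError("both sequences must cover the same summary tokens")
    if not p_with_source:
        raise ValueError("at least one token is required")
    # Total effect minus the prior-only effect, token by token.
    diffs = [full - prior for full, prior in zip(p_with_source, p_prior_only)]
    return sum(diffs) / len(diffs)


# Made-up probabilities for illustration only.
# Tokens that depend on the source: much likelier when the source is visible.
supported = counterfactual_consistency_score([0.9, 0.8, 0.7], [0.2, 0.1, 0.3])
# Tokens driven by the language prior: the source barely changes their probability.
prior_driven = counterfactual_consistency_score([0.5, 0.4, 0.6], [0.5, 0.5, 0.5])
```

Under these assumptions, a summary whose tokens are supported by the source receives a higher score than one whose tokens are about as likely without the source, which is the intuition behind removing the language prior.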