Cognitive Graph for Multi-Hop Reading Comprehension at Scale

  • Source: ACL 2019
  • Affiliation: Tsinghua (Jie Tang's group) + Alibaba
  • Task: multi-hop MRC (HotpotQA)
  • Motivation: split the process into System 1 (implicit extraction) and System 2 (explicit inference)
  • On System 1 and System 2: I first heard this concept in a talk by Jian Tang, where he used a Bayesian approach to do logical reasoning; I didn't follow it at the time, and this paper finally made the story click for me. What a great story! As the paper puts it: "our brains first retrieve relevant information following attention via an implicit, unconscious and intuitive process called System 1, based on which another explicit, conscious and controllable reasoning process, System 2, is then conducted. System 1 could provide resources according to requests, while System 2 enables diving deeper into relational information by performing sequential thinking in the working memory, which is slower but with human-unique rationality." At first glance, Sys1 and Sys2 map exactly onto steps two and three of my four-step DocRE taxonomy!
  • Method: yet their actual approach is not like that (it differs from my four-step scheme, though it still fits their Sys1/Sys2 story); overall it is an iterative method. "System 1 extracts question-relevant entities and answer candidates from paragraphs and encodes their semantic information. System 2 then conducts the reasoning procedure over the graph, and collects clues to guide System 1 to better extract next-hop entities."
    • Sys1 extracts spans from the paragraph as next-hop nodes, and takes the representation of the whole sequence [CLS] Question [SEP] clues[x, G] [SEP] Para[x] as the semantic representation (where Para[x] is retrieved via the question + clues, and the clues are the sentences in which x appears, so I guess Para[x] is the paragraph in which x appears?)
    • Sys2 fuses the next-hop node information with the semantic representation (via GCN-style message aggregation; I didn't read this part closely) and eventually obtains the representation of the answer node (I didn't fully understand this part either: how do they decide that a node is the answer node?)
  • Experiments: at a rough glance, BERT alone gets 31.6, +Sys1 gets 45, and +Sys2 gets 49, so perhaps it is actually reasonable that DocRE work skips explicit inference?
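The iterative method above can be sketched as a frontier-expansion loop over the cognitive graph: Sys1 expands the frontier with next-hop entities and encodes them, then Sys2 reasons over the graph built so far. This is my own reconstruction from the quoted description, not the paper's code; all names (`cogqa_loop`, `get_clues_and_para`, the stub `system1`/`system2`) are hypothetical.

```python
from collections import deque

def cogqa_loop(seed_entities, get_clues_and_para, system1, system2, max_nodes=10):
    """Sketch of an iterative CogQA-style loop (names are illustrative).
    System 1 expands the frontier with next-hop entities and encodes them;
    System 2 then propagates information over the graph built so far."""
    graph = {}                 # entity -> semantic representation
    edges = []                 # (source, next_hop) pairs
    frontier = deque(seed_entities)
    while frontier and len(graph) < max_nodes:
        x = frontier.popleft()
        if x in graph:
            continue
        clues, para = get_clues_and_para(x)
        next_hops, sem = system1(clues, para)   # extract spans + encode
        graph[x] = sem
        for y in next_hops:
            edges.append((x, y))
            frontier.append(y)
    hidden = system2(graph, edges)  # reasoning / aggregation over the graph
    return graph, edges, hidden

# toy usage with stub components: a three-entity chain A -> B -> C
chain = {"A": ["B"], "B": ["C"], "C": []}
graph, edges, hidden = cogqa_loop(
    ["A"],
    lambda x: ([], x),                       # clues, "paragraph" for entity x
    lambda clues, para: (chain[para], 0.0),  # next hops come from the toy chain
    lambda g, e: {k: 1.0 for k in g},        # trivial stand-in "reasoning" step
)
```

The point of the loop is the coupling: what Sys1 extracts determines which nodes Sys2 can reason over next round.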
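A minimal sketch of how Sys1's input sequence [CLS] Question [SEP] clues[x, G] [SEP] Para[x] might be assembled; I use whitespace splitting as a stand-in for BERT's WordPiece tokenizer, and all names here are illustrative, not from the paper's implementation.

```python
def build_system1_input(question, clues, paragraph):
    """Assemble the token sequence [CLS] Question [SEP] clues [SEP] Para[x].
    Whitespace splitting stands in for a real subword tokenizer."""
    tokens = ["[CLS]"]
    tokens += question.split()
    tokens.append("[SEP]")
    for clue_sentence in clues:          # clues: sentences in which x appears
        tokens += clue_sentence.split()
    tokens.append("[SEP]")
    tokens += paragraph.split()          # Para[x]: the paragraph about x
    return tokens

tokens = build_system1_input(
    "Who directed the film ?",
    ["The film was directed by X ."],
    "X is a director born in 1970 .",
)
```

In the real model, span extraction would then run start/end pointer heads over the Para[x] segment of this sequence, and the [CLS]-side representation serves as the semantic vector for the node.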
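The GCN-style aggregation in Sys2 can be illustrated with a single generic propagation step (mean over predecessor features, then a linear map and ReLU). This is a textbook GCN-flavored update written as a sketch, not the paper's exact formulation.

```python
def relu(v):
    return [max(0.0, x) for x in v]

def gcn_step(adj, feats, weight):
    """One GCN-style propagation step: mean-aggregate each node's
    predecessor features, then apply a linear map and ReLU.
    adj[i] lists the predecessors of node i."""
    n = len(feats)
    hidden = []
    for i in range(n):
        neigh = adj[i]
        if neigh:
            # mean-aggregate incoming (predecessor) node features
            agg = [sum(feats[j][k] for j in neigh) / len(neigh)
                   for k in range(len(feats[i]))]
        else:
            agg = feats[i]          # isolated node keeps its own features
        # linear transform: out[k] = sum_d agg[d] * weight[d][k]
        out = [sum(agg[d] * weight[d][k] for d in range(len(agg)))
               for k in range(len(weight[0]))]
        hidden.append(relu(out))
    return hidden

# toy 3-node graph: node 2 receives messages from nodes 0 and 1
adj = [[], [], [0, 1]]
feats = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity map for readability
new_feats = gcn_step(adj, feats, weight)
```

In a setup like this, the answer decision would presumably be a scoring head over the final node representations, which matches my open question above about how the answer node is identified.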