Multi-hop Reading Comprehension across Multiple Documents by Reasoning over Heterogeneous Graphs

  • Venue: ACL 2019
  • Affiliation: JD.com (京东)
  • Task: cross-document MRC (WikiHop, HotpotQA)
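  • Method: first extract entities (mentions of the candidates and the query subject) from the documents; then apply co-attention between candidate & query, query & document, and query & entity; then build a graph from candidate nodes, document nodes, and entity nodes, with edges added by rules; propagate with an RGCN; finally, add each candidate node to its associated entity nodes and classify to pick the answer. As shown in the figure on the right.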
  • It feels like the DocRE models all borrow their design from this paper.
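  • The co-attention here deserves a note: the paper uses attention to compute an attention score matrix, then multiplies that matrix directly with the content on each side. This amounts to re-expressing one side at the other side's length (?) rather than aggregating it.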


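    (Of course, part of it is aggregated.)
    Only later is the sequence aggregated down to a single-token-length vector, via self-attention over the sentence. I don't understand why it has to be this roundabout: why not do a single attention-aggregation step? What is the advantage of doing it this way?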
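To make the shapes concrete for myself, here is a minimal NumPy sketch of "co-attention without aggregation, then self-attentive pooling". It is a simplification under my own assumptions, not the paper's exact formulation: the function names, the plain dot-product affinity, the random toy inputs, and the pooling vector `w` are all hypothetical, and the recurrent encoders are left out.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def co_attention(H_a, H_b):
    """H_a: (l_a, h), H_b: (l_b, h). Returns token-level contexts, no pooling."""
    A = H_a @ H_b.T                        # (l_a, l_b) attention score matrix
    C_a = softmax(A, axis=1) @ H_b         # (l_a, h): b's content at a's length
    C_b = softmax(A.T, axis=1) @ H_a       # (l_b, h): a's content at b's length
    return C_a, C_b

def self_attentive_pool(H, w):
    """Collapse a sequence (l, d) to one vector (d,) using a scoring vector w (d,)."""
    alpha = softmax(H @ w, axis=0)         # (l,) one weight per token
    return alpha @ H                       # (d,) weighted sum over tokens

rng = np.random.default_rng(0)
H_doc = rng.normal(size=(7, 4))            # toy document: 7 tokens, hidden size 4
H_query = rng.normal(size=(3, 4))          # toy query: 3 tokens

C_doc, C_query = co_attention(H_doc, H_query)
doc_seq = np.concatenate([H_doc, C_doc], axis=1)          # (7, 8), still token-level
doc_vec = self_attentive_pool(doc_seq, rng.normal(size=8))
print(C_doc.shape, C_query.shape, doc_vec.shape)          # (7, 4) (3, 4) (8,)
```

In this sketch `C_query` has the query's length but is built from document vectors, which is the "become the other side's length" effect. A single attention-aggregation step would skip the token-level `doc_seq` and go straight to one vector.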
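  • The edge construction is also fairly interesting, perhaps because three kinds of nodes are involved. Worth revisiting when actually designing a model.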
    1. an edge between a document node and a candidate node if the candidate appears in the document at least once.
    2. an edge between a document node and an entity node if the entity is extracted from the document.
    3. an edge between a candidate node and an entity node if the entity is a mention of the candidate.
    4. an edge between two entity nodes if they are extracted from the same document.
    5. an edge between two entity nodes if they are mentions of the same candidate or query subject and they are extracted from different documents.
    6. all candidate nodes connect with each other.
    7. entity nodes that do not meet previous conditions are connected.
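The seven rules can be sketched as a small edge builder. This is my own toy version, not the authors' code: the input layout (`docs`, `candidates`, `entities` with a `refers_to` field), the case-insensitive substring match for rule 1, and the toy data are assumptions. In particular, rule 7 is read here as "every remaining entity pair gets a fallback edge", which is only one possible interpretation.

```python
from itertools import combinations

def build_edges(docs, candidates, entities):
    """Return {rule_number: set of (node, node)} following the seven rules above.

    docs: dict doc_id -> text; candidates: list of strings;
    entities: list of dicts with "id", "doc", and "refers_to"
    (a candidate string or the marker "QUERY_SUBJECT").
    """
    edges = {rule: set() for rule in range(1, 8)}
    for d, text in docs.items():
        for c in candidates:
            if c.lower() in text.lower():            # rule 1: candidate appears in doc
                edges[1].add((("doc", d), ("cand", c)))
    for e in entities:
        edges[2].add((("doc", e["doc"]), ("ent", e["id"])))   # rule 2
        if e["refers_to"] in candidates:                       # rule 3
            edges[3].add((("cand", e["refers_to"]), ("ent", e["id"])))
    for a, b in combinations(entities, 2):
        pair = (("ent", a["id"]), ("ent", b["id"]))
        if a["doc"] == b["doc"]:
            edges[4].add(pair)                       # rule 4: same document
        elif a["refers_to"] == b["refers_to"]:
            edges[5].add(pair)                       # rule 5: same referent, different docs
        else:
            edges[7].add(pair)                       # rule 7 (assumed): all remaining pairs
    for a, b in combinations(candidates, 2):
        edges[6].add((("cand", a), ("cand", b)))     # rule 6: candidates fully connected
    return edges

docs = {"d1": "Paris is the capital of France.",
        "d2": "France borders Spain; Paris hosts the Louvre."}
candidates = ["Paris", "Spain"]
entities = [{"id": "e1", "doc": "d1", "refers_to": "Paris"},
            {"id": "e2", "doc": "d1", "refers_to": "QUERY_SUBJECT"},
            {"id": "e3", "doc": "d2", "refers_to": "Paris"},
            {"id": "e4", "doc": "d2", "refers_to": "Spain"}]

edges = build_edges(docs, candidates, entities)
print({rule: len(pairs) for rule, pairs in edges.items()})
# {1: 3, 2: 4, 3: 3, 4: 2, 5: 1, 6: 1, 7: 3}
```

Keeping one edge set per rule matters because the RGCN uses a separate transformation per edge type.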
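  • Also, the GCN here has a gate mechanism.

    I have long wanted to figure this out: how does this kind of explicit gate mechanism differ from an implicit residual connection? What are the pros and cons of each?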
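To put the two side by side, here is a minimal NumPy sketch of one relational message-passing step followed by either a gated update or a plain residual update. It is written under my own assumptions (mean aggregation per edge type, a single linear gate over the concatenation `[U; H]`, random toy weights, hypothetical function names) and is not claimed to match the paper's layer exactly.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def aggregate(H, adj_by_type, W_rel, W_self):
    """Self transform plus mean-aggregated neighbour messages per edge type."""
    U = H @ W_self
    for r, A in adj_by_type.items():
        deg = np.maximum(A.sum(axis=1, keepdims=True), 1.0)  # avoid division by zero
        U = U + (A / deg) @ H @ W_rel[r]
    return U

def gated_update(H, U, W_gate):
    """Explicit gate: a learned, per-dimension interpolation between update and old state."""
    G = sigmoid(np.concatenate([U, H], axis=1) @ W_gate)     # (n, h), values in (0, 1)
    return G * np.tanh(U) + (1.0 - G) * H

def residual_update(H, U):
    """Residual connection: the old state is always added with a fixed weight of 1."""
    return H + np.tanh(U)

rng = np.random.default_rng(0)
n, h = 4, 3
H = rng.normal(size=(n, h))
adj_by_type = {"doc-ent": np.array([[0, 1, 0, 0], [1, 0, 0, 0],
                                    [0, 0, 0, 1], [0, 0, 1, 0]], dtype=float),
               "ent-ent": np.array([[0, 0, 1, 0], [0, 0, 0, 0],
                                    [1, 0, 0, 0], [0, 0, 0, 0]], dtype=float)}
W_rel = {r: rng.normal(size=(h, h)) for r in adj_by_type}
W_self = rng.normal(size=(h, h))

U = aggregate(H, adj_by_type, W_rel, W_self)
H_gate = gated_update(H, U, rng.normal(size=(2 * h, h)))
H_res = residual_update(H, U)
print(H_gate.shape, H_res.shape)                              # (4, 3) (4, 3)
```

In this form the gate is input-dependent and can push a dimension almost entirely toward the old state or toward the update, at the cost of extra parameters (`W_gate`). The residual form has no extra parameters and always passes the old state through unchanged.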
