- 🐷 := fundamental theories
- 👨👩👧👦 := a series of papers on the same topic
- 🐶 := a single paper

## 1 **Maths**

### 1.1 Linear Algebra & Functional Analysis

### 1.2 Probability & Statistics

### 1.3 Optimization & Numerical Computation

## 2 **ML**

### 2.1 Fundamentals

### 2.2 Deep Clustering & Subspace Clustering

- 🐶 ICML 2016, DEC, Unsupervised Deep Embedding for Clustering Analysis
- 🐶 CVPR 2016, JULE, Joint Unsupervised Learning of Deep Representations and Image Clusters
- 👨👩👧👦 AAAI 2020, CGCN, Collaborative Graph Convolutional Networks: Unsupervised Learning Meets Semi-Supervised Learning
- including: VAE, VGAE, GMVAE, VaDE, DAEGC, SDCN, AGC, DGG, CGCN

- 👨👩👧👦 Sparse self representation subspace clustering
- including: SSC, SSSC, SSConvSCN

- 🐶 Low rank representation subspace clustering
- math fundamentals: 🐷 low-rank matrix factorization

### 2.3 Spectral Methods & GNN

- 🐷 Spectral Clustering
- 🐷 Spectra of Simple Graphs
- 🐷 Graph Fourier Transformation & convolution
- 🐶 ICLR 2017, GCN, Semi-Supervised Classification with Graph Convolutional Networks
- 🐶 ICML 2019, SGC, Simplifying Graph Convolutional Networks
- 🐶 CVPR 2019, IGCN, Label Efficient Semi-Supervised Learning via Graph Filtering
- 🐶 AAAI 2020, MADGap, Measuring and Relieving the Over-smoothing Problem for Graph Neural Networks from the Topological View
- 🐶 AAAI 2021, FAGCN, Beyond Low-frequency Information in Graph Convolutional Networks
- 🐶 arXiv 2021, Uniting Heterogeneity, Inductiveness, and Efficiency for Graph Representation Learning
- 🐶 arXiv 2021, How Attentive are Graph Attention Networks?

### 2.4 Few-shot Learning

#### 2.4.1 Metric Learning

- 🐶 ICML 2015, Siamese Net, Siamese Neural Networks for One-shot Image Recognition
- 🐶 NIPS 2016, Matching Net, Matching Networks for One Shot Learning
- 🐶 NIPS 2017, Prototypical Net, Prototypical Networks for Few-shot Learning
- 🐶 CVPR 2018, Relation Net, Learning to Compare: Relation Network for Few-Shot Learning

#### 2.4.2 Meta Optimizer Based

- 🐶 ICLR 2017, meta-LSTM, Optimization as a Model for Few-Shot Learning
- 🐶 ICML 2017, MAML, Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks

#### 2.4.3 Improvements

- 👨👩👧👦 Task-specific Few-shot learning
- 🐶 NAACL 2021, How Many Data Points is a Prompt Worth?
- 🐶 arXiv 2021, True Few-Shot Learning with Language Models
- 🐶 arXiv 2021, PTR: Prompt Tuning with Rules for Text Classification

### 2.5 Others

## 3 **NLP**

### 3.1 Fundamentals

- 🐷 Attention
- 👨👩👧👦 Pre-trained Models

### 3.2 Information Extraction

- 👨👩👧👦 Getting started with Information Extraction
- 🐶 KAP 2007, Fact Distribution in Information Extraction

#### 3.2.1 Sequence Labelling

- 🐷 BiLSTM+CRF
- 🐶 EMNLP 2019, BiLSTM-LAN, Hierarchically-Refined Label Attention Network for Sequence Labeling
- 🐶 ACL 2018, Marrying Up Regular Expressions with Neural Networks: A Case Study for Spoken Language Understanding
- 🐶 ACL 2020, L-TapNet + CDT, Few-shot Slot Tagging with Collapsed Dependency Transfer and Label-enhanced Task-adaptive Projection Network
- 🐶 ACL 2020, Low Resource Sequence Tagging using Sentence Reconstruction
- 🐶 arXiv 2021, Larger-Context Tagging: When and Why Does It Work?
- 🐶 ACL 2021, Few-NERD: A Few-shot Named Entity Recognition Dataset
- 🐶 ACL 2021, Locate and Label: A Two-stage Identifier for Nested Named Entity Recognition

#### 3.2.2 Named Entity Recognition

- 🐶 ACL 2020, Simplify the Usage of Lexicon in Chinese NER
- 🐶 EMNLP 2020, StructShot, Simple and Effective Few-Shot Named Entity Recognition with Structured Nearest Neighbor Learning
- 🐶 ACL 2021, SpanNER: Named Entity Re-/Recognition as Span Prediction

#### 3.2.3 Relation Extraction

##### – Few-Shot RE

- 🐶 AAAI 2019, Proto-Hatt, Hybrid Attention-Based Prototypical Networks for Noisy Few-Shot Relation Classification
- 🐶 ACL 2019, MLMAN, Multi-Level Matching and Aggregation Network for Few-Shot Relation Classification
- 🐶 COLING 2020, CTEG, Learning to Decouple Relations: Few-Shot Relation Classification with Entity-Guided Attention and Confusion-Aware Training
- 🐶 COLING 2020, MIML, Meta-Information Guided Meta-Learning for Few-Shot Relation Classification
- 🐶 COLING 2020, IncreProtoNet, A Two-phase Prototypical Network Model for Incremental Few-shot Relation Classification
- 🐶 ICLR 2021, COL, Prototypical Representation Learning for Relation Extraction
- 🐶 ISCOL 2021, Revisiting Few-shot Relation Classification: Evaluation Data and Classification Schemes
- 🐶 ACL 2021, Concept FERE, Entity Concept-enhanced Few-shot Relation Extraction

##### – Document RE

- 👨👩👧👦 Paper List
- 🐶 TACL 2017, Graph LSTM, Cross-Sentence N-ary Relation Extraction with Graph LSTMs
- 🐶 EMNLP 2018, Graph State LSTM, N-ary Relation Extraction using Graph State LSTM
- 🐶 ACL 2019, AGGCN, Attention Guided Graph Convolutional Networks for Relation Extraction
- 🐶 EMNLP 2019, EoG, Connecting the Dots: Document-level Neural Relation Extraction with Edge-oriented Graphs
- 🐶 NAACL 2019, Multiscale, Document-Level N-ary Relation Extraction with Multiscale Representation Learning
- 🐶 ACL 2020, LSR, Reasoning with Latent Structure Refinement for Document-Level Relation Extraction
- 🐶 EMNLP 2020, GAIN, Double Graph Based Reasoning for Document-level Relation Extraction
- 🐶 EMNLP 2020, GLRE, Global-to-Local Neural Networks for Document-Level Relation Extraction
- 🐶 COLING 2020, DHG, Document-level Relation Extraction with Dual-tier Heterogeneous Graph
- 🐶 COLING 2020, GEDA, Graph Enhanced Dual Attention Network for Document-Level Relation Extraction
- 🐶 COLING 2020, GCGCN, Global Context-enhanced Graph Convolutional Networks for Document-level Relation Extraction
- 🐶 ICKG 2020, Improving Document-level Relation Extraction via Contextualizing Mention Representations and Weighting Mention Pairs
- 🐶 arXiv 2020, CFER, Coarse-to-Fine Entity Representations for Document-level Relation Extraction
- 🐶 AAAI 2021, MIUK, Multi-view Inference for Relation Extraction with Uncertain Knowledge
- 🐶 AAAI 2021, ATLOP, Document-Level Relation Extraction with Adaptive Thresholding and Localized Context Pooling
- 🐶 AAAI 2021, DocRE-Rec, Document-Level Relation Extraction with Reconstruction
- 🐶 AAAI 2021, SSAN, Entity Structure Within and Throughout: Modeling Mention Dependencies for Document-Level Relation Extraction
- 🐶 arXiv 2021, MrGCN, Mirror Graph Convolution Network for Relation Extraction with Long-Term Dependencies
- 🐶 ACL 2021, Three Sentences Are All You Need: Local Path Enhanced Document Relation Extraction
- 🐶 ACL 2021 Findings, SIRE: Separate Intra- and Inter-sentential Reasoning for Document-level Relation Extraction
- 🐶 ACL 2021 Findings, Discriminative Reasoning for Document-level Relation Extraction
- 🐶 PAKDD 2021, Densely Connected Graph Attention Network Based on Iterative Path Reasoning for Document-Level Relation Extraction
- 🐶 PAKDD 2021, SaGCN: Structure-Aware Graph Convolution Network for Document-Level Relation Extraction
- 🐶 arXiv 2021, EIDER: Evidence-enhanced Document-level Relation Extraction

##### – Dialog RE

- 🐶 AAAI 2021, GDPNet, Refining Latent Multi-View Graph for Relation Extraction
- 🐶 arXiv 2020, DHGAT, Dialogue Relation Extraction with Document-level Heterogeneous Graph Attention Networks
- 🐶 arXiv 2020, SimpleRE, An Embarrassingly Simple Model for Dialogue Relation Extraction

##### – Others

- 🐶 AAAI 2020, CLC, Integrating Relation Constraints with Neural Relation Extractors
- 🐶 EMNLP 2020, Learning from Context or Names? An Empirical Study on Neural Relation Extraction
- 🐶 COLING 2020, C-GCN-MG, Graph Convolution over Multiple Dependency Sub-graphs for Relation Extraction
- 🐶 ISWC 2021, Unsupervised Relation Extraction Using Sentence Encoding
- 🐶 arXiv 2021, A Question-answering Based Framework for Relation Extraction Validation
- 🐶 ICASSP 2021, Multi-Entity Collaborative Relation Extraction
- 🐶 NAACL 2021, Open Hierarchical Relation Extraction
- 🐶 ACL 2021, Dynamic Knowledge Graph Context Selection for Relation Extraction
- 🐶 arXiv 2021, Do Models Learn the Directionality of Relations? A New Evaluation Task: Relation Direction Recognition

#### 3.2.4 Entity Relation Extraction

- 🐶 ACL 2017, Joint Extraction of Entities and Relations Based on a Novel Tagging Scheme
- 🐶 ACL 2018, Extracting Relational Facts by an End-to-End Neural Model with Copy Mechanism
- 🐶 AAAI 2019, Joint Extraction of Entities and Overlapping Relations Using Position-Attentive Sequence Labeling
- 🐶 ACL 2019, Joint Type Inference on Entities and Relations via Graph Convolutional Networks
- 🐶 ACL 2019, Entity-Relation Extraction as Multi-turn Question Answering
- 🐶 ACL 2020, CasREL, A Novel Cascade Binary Tagging Framework for Relational Triple Extraction
- 🐶 arXiv 2020, A Frustratingly Easy Approach for Joint Entity and Relation Extraction
- 🐶 COLING 2020, MPE, Bridging Text and Knowledge with Multi-Prototype Embedding for Few-Shot Relational Triple Extraction
- 🐶 EMNLP 2020, Two are Better than One: Joint Entity and Relation Extraction with Table-Sequence Encoders
- 🐶 EACL 2021, JEREX, An End-to-end Model for Entity-level Relation Extraction using Multi-instance Learning
- 🐶 EACL 2021, ENPAR: Enhancing Entity and Entity Pair Representations for Joint Entity Relation Extraction
- 🐶 arXiv 2021, RERE, Revisiting the Negative Data of Distantly Supervised Relation Extraction
- 🐶 ACL 2021, Injecting Knowledge Base Information into End-to-End Joint Entity and Relation Extraction and Coreference Resolution
- 🐶 ACL 2021, PRGC: Potential Relation and Global Correspondence Based Joint Relational Triple Extraction

#### 3.2.5 Event Extraction

- 👨👩👧👦 Survey of event extraction
- 🐶 ACL 2015, DMCNN, Event Extraction via Dynamic Multi-Pooling Convolutional Neural Networks
- 🐶 NAACL 2016, JRNN, Joint Event Extraction via Recurrent Neural Networks
- 🐶 NLPCC 2016, CNN + BiLSTM, A Convolution BiLSTM Neural Network Model for Chinese Event Extraction
- 🐶 AAAI 2018, BiLSTM + CRF + ILP + CVT, Scale Up Event Extraction Learning via Automatic Training Data Generation
- 🐶 NAACL 2019, Adv-ED, Adversarial Training for Weakly Supervised Event Detection
- 🐶 arXiv 2020, L-HGAT, Label Enhanced Event Detection with Heterogeneous Graph Attention Networks
- 🐶 ICASSP 2021, Improving Event Detection by Exploiting Label Hierarchy

#### 3.2.6 Intent Detection

- 🐶 EMNLP 2020, DNNC, Discriminative Nearest Neighbor Few-Shot Intent Detection by Transferring Natural Language Inference

### 3.3 Representation Learning in NLP

- 🐶 TACL 2018, Learning Structured Text Representations
- 🐶 NeurIPS 2020, Language Through a Prism: A Spectral Approach for Multiscale Language Representations
- 🐶 arXiv 2021, WhiteningBERT: An Easy Unsupervised Sentence Embedding Approach

### 3.4 Knowledge Graph

- 🐶 NIPS 2013, TransE, Translating Embeddings for Modeling Multi-relational Data
- 🐶 ACL 2020, NMN, Neighborhood Matching Network for Entity Alignment
- 🐶 ICLR 2020, CompGCN, Composition-based Multi-Relational Graph Convolutional Networks

#### 3.4.1 Reasoning on Graph

- 🐶 AAAI 2018, Variational Reasoning for Question Answering with Knowledge Graph
- 🐶 ACL 2019, Multi-hop Reading Comprehension across Multiple Documents by Reasoning over Heterogeneous Graphs
- 🐶 ACL 2019, Cognitive Graph for Multi-Hop Reading Comprehension at Scale
- 🐶 ICML 2020, GraIL, Inductive Relation Prediction by Subgraph Reasoning
- 🐶 arXiv 2021, BERTRL, Inductive Relation Prediction by BERT
- 🐶 arXiv 2021, TACT, Topology-Aware Correlations Between Relations for Inductive Link Prediction in Knowledge Graphs
- 🐶 arXiv 2021, RDAS, Integrating Subgraph-aware Relation and Direction Reasoning for Question Answering
- 🐶 arXiv 2021, DAGN: Discourse-Aware Graph Network for Logical Reasoning
- 🐶 arXiv 2021, Is Multi-Hop Reasoning Really Explainable? Towards Benchmarking Reasoning Interpretability
- 🐶 arXiv 2021, QA-GNN: Reasoning with Language Models and Knowledge Graphs for Question Answering
- 🐶 NAACL 2021, Breadth First Reasoning Graph for Multi-hop Question Answering

### 3.5 Pretraining

- 🐶 arXiv 2021, On the Inductive Bias of Masked Language Modeling: From Statistical to Syntactic Dependencies
- 🐶 arXiv 2021, Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little
- 🐶 NAACL 2021, Lattice-BERT: Leveraging Multi-Granularity Representations in Chinese Pre-trained Language Models
- 🐶 arXiv 2021, Are Pre-trained Convolutions Better than Pre-trained Transformers?

### 3.6 Others

- 🐶 arXiv 2021, Heterogeneous Graph Neural Networks for Multi-label Text Classification
- 🐶 arXiv 2021, You Can Do Better! If You Elaborate the Reason When Making Prediction
- 🐶 arXiv 2021, On Position Embeddings in BERT
- 🐶 arXiv 2021, Discrete Reasoning Templates for Natural Language Understanding
- 🐶 arXiv 2021, What Will it Take to Fix Benchmarking in Natural Language Understanding?
- 🐶 EACL 2021, Is the Understanding of Explicit Discourse Relations Required in Machine Reading Comprehension?
- 🐶 arXiv 2021, Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity
- 🐶 NAACL 2021, Incorporating Syntax and Semantics in Coreference Resolution with Heterogeneous Graph Attention Network
- 🐶 NAACL 2021, Progressive Generation of Long Text with Pretrained Language Models