Paper List
ACL, EMNLP, COLING, SemEval Paper List
Natural Language Processing
Language Model
A Primer in BERTology: What We Know About How BERT Works (Anna Rogers, TACL 2020)
Pre-trained Models for Natural Language Processing: A Survey (Xipeng Qiu, 2020)
Discrete Word Vector
Class-Based n-gram Models of Natural Language (Peter F. Brown, CL 1992)
Continuous Word Vector
A Neural Probabilistic Language Model (Yoshua Bengio, JMLR 2003, note)
A Unified Architecture for Natural Language Processing: Deep Neural Networks with Multitask Learning (Ronan Collobert, ICML 2008)
A Scalable Hierarchical Distributed Language Model (Andriy Mnih, NeurIPS 2008)
Natural Language Processing (Almost) from Scratch (Ronan Collobert, JMLR 2011)
Efficient Estimation of Word Representations in Vector Space (Tomas Mikolov, 2013, code, note)
Don’t Count, Predict! A Systematic Comparison of Context-Counting vs. Context-Predicting Semantic Vectors (Marco Baroni, ACL 2014)
GloVe: Global Vectors for Word Representation (Jeffrey Pennington, EMNLP 2014, code, note)
Character-Aware Neural Language Models (Yoon Kim, AAAI 2016, note)
Bag of Tricks for Efficient Text Classification (Armand Joulin, 2016, code, note)
Enriching Word Vectors with Subword Information (Piotr Bojanowski, 2016, note)
Advances in Pre-Training Distributed Word Representations (Tomas Mikolov, 2017)
Learning Word Vectors for 157 Languages (Edouard Grave, LREC 2018)
Learning Chinese Word Embeddings from Stroke, Structure and Pinyin of Characters (Yun Zhang, CIKM 2019)
Glyce: Glyph-vectors for Chinese Character Representations (Yuxian Meng, NeurIPS 2019, code, note)
Obtaining Better Static Word Embeddings Using Contextual Embedding Models (Prakhar Gupta, ACL 2021)
Learning Zero-Shot Multifaceted Visually Grounded Word Embeddings via Multi-Task Training (Hassan Shahmohammadi, 2021)
Sentence/Paragraph/Document Embedding
Distributed Representations of Sentences and Documents (Quoc Le, ICML 2014, code, note)
Skip-Thought Vectors (Ryan Kiros, NeurIPS 2015)
Deep Sentence Embedding Using Long Short-Term Memory Networks: Analysis and Application to Information Retrieval (Hamid Palangi, TASLP 2017, code)
A Structured Self-attentive Sentence Embedding (Zhouhan Lin, ICLR 2017, code, note)
Universal Sentence Encoder (Daniel Cer, 2018, code, note)
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks (Nils Reimers, EMNLP 2019)
On the Sentence Embeddings from Pre-trained Language Models (Bohan Li, EMNLP 2020)
An Unsupervised Sentence Embedding Method by Mutual Information Maximization (Yan Zhang, EMNLP 2020)
Sentence-T5: Scalable Sentence Encoders from Pre-trained Text-to-Text Models (Jianmo Ni, 2021)
Contextual Language Model
Semi-supervised Sequence Learning (Andrew M. Dai, NeurIPS 2015, note)
Semi-supervised Sequence Tagging with Bidirectional Language Models (Matthew E. Peters, ACL 2017, note)
Learned in Translation: Contextualized Word Vectors (Bryan McCann, NeurIPS 2017, code, note)
Deep Contextualized Word Representations (Matthew E. Peters, NAACL 2018, code, note)
Universal Language Model Fine-tuning for Text Classification (Jeremy Howard, ACL 2018, code, note)
Improving Language Understanding by Generative Pre-Training (Alec Radford, 2018, code, note)
SpanBERT: Improving Pre-training by Representing and Predicting Spans (Mandar Joshi, TACL 2019, note)
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (Jacob Devlin, NAACL 2019, code, note)
What Kind of Language Is Hard to Language-Model? (Sabrina J. Mielke, ACL 2019)
Cloze-driven Pretraining of Self-attention Networks (Alexei Baevski, EMNLP 2019)
Revealing the Dark Secrets of BERT (Olga Kovaleva, EMNLP 2019, note)
Unified Language Model Pre-training for Natural Language Understanding and Generation (Li Dong, NeurIPS 2019, note)
XLNet: Generalized Autoregressive Pretraining for Language Understanding (Zhilin Yang, NeurIPS 2019, code, note)
Pre-Training with Whole Word Masking for Chinese BERT (Yiming Cui, 2019, code, note)
RoBERTa: A Robustly Optimized BERT Pretraining Approach (Yinhan Liu, 2019, note)
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism (Mohammad Shoeybi, 2019, code, note)
Language Models are Unsupervised Multitask Learners (Alec Radford, 2019, code, note)
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (Colin Raffel, 2019, code, note)
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations (Zhenzhong Lan, ICLR 2020, note)
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators (Kevin Clark, ICLR 2020, code, note)
Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space (Chunyuan Li, EMNLP 2020, code, note)
Pre-Training Transformers as Energy-Based Cloze Models (Kevin Clark, EMNLP 2020)
Language Models are Few-Shot Learners (Tom B. Brown, 2020, code, note)
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity (William Fedus, 2021)
Are Pre-trained Convolutions Better than Pre-trained Transformers? (Yi Tay, ACL 2021)
True Few-Shot Learning with Language Models (Ethan Perez, 2021)
Multimodal Few-Shot Learning with Frozen Language Models (Maria Tsimpoukelli, 2021)
Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing (Pengfei Liu, 2021, code, note)
Prefix-Tuning: Optimizing Continuous Prompts for Generation (Xiang Lisa Li, 2021)
ByT5: Towards a Token-Free Future with Pre-trained Byte-to-Byte Models (Linting Xue, TACL 2022)
Knowledge-Enriched Language Model
ERNIE: Enhanced Language Representation with Informative Entities (Zhengyan Zhang, ACL 2019, code, note)
ERNIE 2.0: A Continual Pre-training Framework for Language Understanding (Yu Sun, AAAI 2020, note)
ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training for Language Understanding and Generation (Yu Sun, 2021)
Semantics-aware BERT for Language Understanding (Zhuosheng Zhang, AAAI 2020, note)
K-Adapter: Infusing Knowledge into Pre-Trained Models with Adapters (Ruize Wang, 2020)
Compressed Language Model
Fine-tune BERT with Sparse Self-Attention Mechanism (Baiyun Cui, EMNLP 2019)
DistilBERT, a Distilled Version of BERT: Smaller, Faster, Cheaper and Lighter (Victor Sanh, 2019, code, note)
AdaBERT: Task-Adaptive BERT Compression with Differentiable Neural Architecture Search (Daoyuan Chen, IJCAI 2020, note)
FastBERT: a Self-distilling BERT with Adaptive Inference Time (Weijie Liu, ACL 2020, note)
Text Classification
One-Class SVMs for Document Classification (Larry M. Manevitz, JMLR 2001)
Convolutional Neural Networks for Sentence Classification (Yoon Kim, EMNLP 2014, code, note)
Recurrent Convolutional Neural Networks for Text Classification (Siwei Lai, AAAI 2015, code, note)
Effective Use of Word Order for Text Categorization with Convolutional Neural Networks (Rie Johnson, NAACL 2015, code)
Deep Unordered Composition Rivals Syntactic Methods for Text Classification (Mohit Iyyer, ACL 2015, code1, code2)
Discriminative Neural Sentence Modeling by Tree-Based Convolution (Lili Mou, EMNLP 2015)
Character-level Convolutional Networks for Text Classification (Xiang Zhang, NeurIPS 2015, code, note)
Semi-supervised Convolutional Neural Networks for Text Categorization via Region Embedding (Rie Johnson, NeurIPS 2015, code)
Hierarchical Attention Networks for Document Classification (Zichao Yang, NAACL 2016, code, note)
Supervised and Semi-Supervised Text Categorization using LSTM for Region Embeddings (Rie Johnson, ICML 2016, code)
Recurrent Neural Network for Text Classification with Multi-Task Learning (Pengfei Liu, IJCAI 2016, code, note)
Text Classification Improved by Integrating Bidirectional LSTM with Two-dimensional Max Pooling (Peng Zhou, COLING 2016)
Efficient Character-level Document Classification by Combining Convolution and Recurrent Layers (Yijun Xiao, 2016)
A Hybrid CNN-RNN Alignment Model for Phrase-Aware Sentence Classification (Shiou Tian Hsu, EACL 2017, note)
Very Deep Convolutional Networks for Text Classification (Alexis Conneau, EACL 2017, code, note)
Adversarial Multi-task Learning for Text Classification (Pengfei Liu, ACL 2017, code, note)
Deep Pyramid Convolutional Neural Networks for Text Categorization (Rie Johnson, ACL 2017, code, note)
Multi-Task Label Embedding for Text Classification (Honglun Zhang, EMNLP 2017, note)
Learning Structured Representation for Text Classification via Reinforcement Learning (Tianyang Zhang, AAAI 2018, code, note)
Translations as Additional Contexts for Sentence Classification (Reinald Kim Amplayo, IJCAI 2018, code)
Baseline Needs More Love: On Simple Word-Embedding-Based Models and Associated Pooling Mechanisms (Dinghan Shen, ACL 2018, code, note)
Joint Embedding of Words and Labels for Text Classification (Guoyin Wang, ACL 2018, code, note)
Marrying Up Regular Expressions with Neural Networks: A Case Study for Spoken Language Understanding (Bingfeng Luo, ACL 2018, note)
Graph Convolutional Networks for Text Classification (Liang Yao, AAAI 2019, code, note)
Topics to Avoid: Demoting Latent Confounds in Text Classification (Sachin Kumar, EMNLP 2019)
DocBERT: BERT for Document Classification (Ashutosh Adhikari, 2019, code)
Text Classification Using Label Names Only: A Language Model Self-Training Approach (Yu Meng, EMNLP 2020, code)
Inductive Topic Variational Graph Auto-Encoder for Text Classification (Qianqian Xie, NAACL 2021)
Knowledgeable Prompt-tuning: Incorporating Knowledge into Prompt Verbalizer for Text Classification (Shengding Hu, 2021)
Multi-Label Text Classification
Semantic-Unit-Based Dilated Convolution for Multi-Label Text Classification (Junyang Lin, EMNLP 2018, code, note)
SGM: Sequence Generation Model for Multi-Label Classification (Pengcheng Yang, COLING 2018, code, note)
A Deep Reinforced Sequence-to-Set Model for Multi-Label Classification (Pengcheng Yang, ACL 2019, code, note)
AttentionXML: Label Tree-based Attention-Aware Deep Model for High-Performance Extreme Multi-Label Text Classification (Ronghui You, NeurIPS 2019, code, note)
Text Matching
Learning Deep Structured Semantic Models for Web Search using Clickthrough Data (Po-Sen Huang, CIKM 2013, code, note)
Learning Semantic Representations Using Convolutional Neural Networks for Web Search (Yelong Shen, WWW 2014)
A Latent Semantic Model with Convolutional-Pooling Structure for Information Retrieval (Yelong Shen, CIKM 2014, code, note)
Convolutional Neural Network Architectures for Matching Natural Language Sentences (Baotian Hu, NeurIPS 2014, code)
Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks (Aliaksei Severyn, SIGIR 2015, code, note)
A Deep Architecture for Semantic Matching with Multiple Positional Sentence Representations (Shengxian Wan, 2015, code, note)
ABCNN: Attention-Based Convolutional Neural Network for Modeling Sentence Pairs (Wenpeng Yin, TACL 2016, code)
Text Matching as Image Recognition (Liang Pang, AAAI 2016, code, note)
Pairwise Word Interaction Modeling with Deep Neural Networks for Semantic Similarity Measurement (Hua He, NAACL 2016, code)
Improved Representation Learning for Question Answer Matching (Ming Tan, ACL 2016, code, note)
A Deep Relevance Matching Model for Ad-hoc Retrieval (Jiafeng Guo, CIKM 2016, code, note)
aNMM: Ranking Short Answer Texts with Attention-Based Neural Matching Model (Liu Yang, CIKM 2016, code)
A Compare-Aggregate Model for Matching Text Sequences (Shuohang Wang, 2016, code, note)
Learning to Match using Local and Distributed Representations of Text for Web Search (Bhaskar Mitra, WWW 2017, note)
End-to-End Neural Ad-hoc Ranking with Kernel Pooling (Chenyan Xiong, SIGIR 2017, code, note)
Bilateral Multi-Perspective Matching for Natural Language Sentences (Zhiguo Wang, IJCAI 2017, code, note)
Sentence Similarity Learning by Lexical Decomposition and Composition (Zhiguo Wang, COLING 2016, code, note)
Convolutional Neural Networks for Soft-Matching N-Grams in Ad-hoc Search (Zhuyun Dai, WSDM 2018, code, note)
Deep Relevance Ranking Using Enhanced Document-Query Interactions (Ryan McDonald, EMNLP 2018, code, note)
Semantic Sentence Matching with Densely-connected Recurrent and Co-attentive Information (Seonhoon Kim, AAAI 2019, note)
Rationalizing Text Matching: Learning Sparse Alignments via Optimal Transport (Kyle Swanson, ACL 2020)
Natural Language Inference
A Large Annotated Corpus for Learning Natural Language Inference (Samuel R. Bowman, EMNLP 2015, code, note)
Natural Language Inference by Tree-Based Convolution and Heuristic Matching (Lili Mou, ACL 2016, note)
A Decomposable Attention Model for Natural Language Inference (Ankur P. Parikh, EMNLP 2016, code, note)
Enhanced LSTM for Natural Language Inference (Qian Chen, ACL 2017, code, note)
Supervised Learning of Universal Sentence Representations from Natural Language Inference Data (Alexis Conneau, EMNLP 2017, code, note)
Natural Language Inference over Interaction Space (Yichen Gong, ICLR 2018, code, note)
Discourse Marker Augmented Network with Reinforcement Learning for Natural Language Inference (Boyuan Pan, ACL 2018, code, note)
Neural Natural Language Inference Models Enhanced with External Knowledge (Qian Chen, ACL 2018, code)
Improving Natural Language Inference Using External Knowledge in the Science Questions Domain (Xiaoyan Wang, 2018, note)
Gaussian Transformer: A Lightweight Approach for Natural Language Inference (Maosheng Guo, AAAI 2019, code, note)
Are Natural Language Inference Models IMPPRESsive? Learning IMPlicature and PRESupposition (Paloma Jeretic, ACL 2020)
Do Neural Models Learn Systematicity of Monotonicity Inference in Natural Language? (Hitomi Yanaka, ACL 2020)
Uncertain Natural Language Inference (Tongfei Chen, ACL 2020)
Text Summarization
SEQ3: Differentiable Sequence-to-Sequence-to-Sequence Autoencoder for Unsupervised Abstractive Sentence Compression (Christos Baziotis, NAACL 2019)
Answers Unite! Unsupervised Metrics for Reinforced Summarization Models (Thomas Scialom, EMNLP 2019, code)
Better Rewards Yield Better Summaries: Learning to Summarise Without References (Florian Böhm, EMNLP 2019, code)
Neural Text Summarization: A Critical Evaluation (Wojciech Kryscinski, EMNLP 2019)
Text Summarization with Pretrained Encoders (Yang Liu, EMNLP 2019, code, note)
What Have We Achieved on Text Summarization? (Dandan Huang, EMNLP 2020)
Re-evaluating Evaluation in Text Summarization (Manik Bhandari, EMNLP 2020, code)
Unsupervised Reference-Free Summary Quality Evaluation via Contrastive Learning (Hanlu Wu, EMNLP 2020, code)
The Style-Content Duality of Attractiveness: Learning to Write Eye-Catching Headlines via Disentanglement (Mingzhe Li, AAAI 2021)
Extractive Summarization
The Automatic Creation of Literature Abstracts (H. P. Luhn, 1958, code)
New Methods in Automatic Extracting (H. P. Edmundson, 1969, code)
TextRank: Bringing Order into Texts (Rada Mihalcea, EMNLP 2004, code, note)
Using Latent Semantic Analysis in Text Summarization and Summary Evaluation (Josef Steinberger, 2004, code)
LexRank: Graph-based Lexical Centrality as Salience in Text Summarization (Gunes Erkan, 2004, code)
Beyond SumBasic: Task-Focused Summarization with Sentence Simplification and Lexical Expansion (Lucy Vanderwende, 2007, code)
SummaRuNNer: A Recurrent Neural Network Based Sequence Model for Extractive Summarization of Documents (Ramesh Nallapati, AAAI 2017)
Ranking Sentences for Extractive Summarization with Reinforcement Learning (Shashi Narayan, NAACL 2018, code)
Newsroom: A Dataset of 1.3 Million Summaries with Diverse Extractive Strategies (Max Grusky, NAACL 2018, code)
BanditSum: Extractive Summarization as a Contextual Bandit (Yue Dong, EMNLP 2018, code)
Neural Latent Extractive Document Summarization (Xingxing Zhang, EMNLP 2018)
Guiding Extractive Summarization with Question-Answering Rewards (Kristjan Arumae, NAACL 2019, code)
Single Document Summarization as Tree Induction (Yang Liu, NAACL 2019)
HIBERT: Document Level Pre-training of Hierarchical Bidirectional Transformers for Document Summarization (Xingxing Zhang, ACL 2019, note)
Searching for Effective Neural Extractive Summarization: What Works and What’s Next (Ming Zhong, ACL 2019, code)
Neural Extractive Text Summarization with Syntactic Compression (Jiacheng Xu, EMNLP 2019, code)
Extractive Summarization as Text Matching (Ming Zhong, ACL 2020, code, note)
Discourse-Aware Neural Extractive Text Summarization (Jiacheng Xu, ACL 2020, code, note)
Heterogeneous Graph Neural Networks for Extractive Document Summarization (Danqing Wang, ACL 2020, code, note)
Abstractive Summarization
A Neural Attention Model for Abstractive Sentence Summarization (Alexander M. Rush, EMNLP 2015, code)
Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond (Ramesh Nallapati, CoNLL 2016, note)
Abstractive Document Summarization with a Graph-Based Attentional Neural Model (Jiwei Tan, ACL 2017)
Get To The Point: Summarization with Pointer-Generator Networks (Abigail See, ACL 2017)
A Deep Reinforced Model for Abstractive Summarization (Romain Paulus, 2017, code, note)
Deep Communicating Agents for Abstractive Summarization (Asli Celikyilmaz, NAACL 2018, code)
A Reinforced Topic-Aware Convolutional Sequence-to-Sequence Model for Abstractive Text Summarization (Li Wang, IJCAI 2018, note)
Controllable Abstractive Summarization (Angela Fan, ACL 2018)
Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting (Yen-Chun Chen, ACL 2018, code)
Bottom-Up Abstractive Summarization (Sebastian Gehrmann, EMNLP 2018, code)
Don’t Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization (Shashi Narayan, EMNLP 2018, code, note)
Abstractive Summarization: A Survey of the State of the Art (Hui Lin, AAAI 2019)
Abstractive Summarization of Reddit Posts with Multi-level Memory Networks (Byeongchang Kim, NAACL 2019, code, note)
Scoring Sentence Singletons and Pairs for Abstractive Summarization (Logan Lebanoff, ACL 2019, code, note)
How to Write Summaries with Patterns? Learning towards Abstractive Summarization through Prototype Editing (Shen Gao, EMNLP 2019, code, note)
Controlling the Amount of Verbatim Copying in Abstractive Summarization (Kaiqiang Song, AAAI 2020)
Joint Parsing and Generation for Abstractive Summarization (Kaiqiang Song, AAAI 2020)
Keywords-Guided Abstractive Sentence Summarization (Haoran Li, AAAI 2020)
PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization (Jingqing Zhang, ICML 2020, code, note)
Discriminative Adversarial Search for Abstractive Summarization (Thomas Scialom, ICML 2020)
Fact-based Content Weighting for Evaluating Abstractive Summarisation (Xinnuo Xu, ACL 2020)
On Faithfulness and Factuality in Abstractive Summarization (Joshua Maynez, ACL 2020, code)
Optimizing the Factual Correctness of a Summary: A Study of Summarizing Radiology Reports (Yuhao Zhang, ACL 2020, note)
Self-Attention Guided Copy Mechanism for Abstractive Summarization (Song Xu, ACL 2020)
FEQA: A Question Answering Evaluation Framework for Faithfulness Assessment in Abstractive Summarization (Esin Durmus, ACL 2020)
Asking and Answering Questions to Evaluate the Factual Consistency of Summaries (Alex Wang, ACL 2020, note)
The Summary Loop: Learning to Write Abstractive Summaries Without Examples (Philippe Laban, ACL 2020)
Evaluating the Factual Consistency of Abstractive Text Summarization (Wojciech Kryscinski, EMNLP 2020)
Reducing Quantity Hallucinations in Abstractive Summarization (Zheng Zhao, EMNLP 2020 Findings)
Learning to Summarize from Human Feedback (Nisan Stiennon, NeurIPS 2020)
Multi-Document Summarization
Exploring Content Models for Multi-Document Summarization (Aria Haghighi, NAACL 2009, code)
Adapting the Neural Encoder-Decoder Framework from Single to Multi-Document Summarization (Logan Lebanoff, EMNLP 2018, code)
Abstractive Multi-Document Summarization Based on Semantic Link Network (Wei Li, TKDE 2019)
Hierarchical Transformers for Multi-Document Summarization (Yang Liu, ACL 2019)
Improving the Similarity Measure of Determinantal Point Processes for Extractive Multi-Document Summarization (Sangwoo Cho, ACL 2019, code, note)
Multi-News: a Large-Scale Multi-Document Summarization Dataset and Abstractive Hierarchical Model (Alexander R. Fabbri, ACL 2019, code, note)
Leveraging Graph to Improve Abstractive Multi-Document Summarization (Wei Li, ACL 2020, code)
Multi-document Summarization with Maximal Marginal Relevance-guided Reinforcement Learning (Yuning Mao, EMNLP 2020, code)
Opinion Summarization
Unsupervised Opinion Summarization with Noising and Denoising (Reinald Kim Amplayo, ACL 2020)
Unsupervised Opinion Summarization with Content Planning (Reinald Kim Amplayo, AAAI 2021)
Cross-Lingual Summarization
Attend, Translate and Summarize: An Efficient Method for Neural Cross-Lingual Summarization (Junnan Zhu, ACL 2020)
Text Style Transfer
Style Transfer from Non-Parallel Text by Cross-Alignment (Tianxiao Shen, NeurIPS 2017, code)
Style Transfer in Text: Exploration and Evaluation (Zhenxin Fu, AAAI 2018, code, note)
Delete, Retrieve, Generate: A Simple Approach to Sentiment and Style Transfer (Juncen Li, NAACL 2018, code, note)
Style Transfer Through Back-Translation (Shrimai Prabhumoye, ACL 2018, code)
Unsupervised Text Style Transfer using Language Models as Discriminators (Zichao Yang, NeurIPS 2018, code)
Reinforcement Learning Based Text Style Transfer without Parallel Training Corpus (Hongyu Gong, NAACL 2019, code)
A Dual Reinforcement Learning Framework for Unsupervised Text Style Transfer (Fuli Luo, IJCAI 2019, code)
Mask and Infill: Applying Masked Language Model for Sentiment Transfer (Xing Wu, IJCAI 2019, code)
Disentangled Representation Learning for Non-Parallel Text Style Transfer (Vineet John, ACL 2019, code, note)
A Hierarchical Reinforced Sequence Operation Method for Unsupervised Text Style Transfer (Chen Wu, ACL 2019, code, note)
Style Transformer: Unpaired Text Style Transfer without Disentangled Latent Representation (Ning Dai, ACL 2019, code, note)
Semi-supervised Text Style Transfer: Cross Projection in Latent Space (Mingyue Shang, EMNLP 2019)
Multiple-Attribute Text Style Transfer (Guillaume Lample, ICLR 2019)
Topic Modeling
Unsupervised Topic Modeling
Indexing by Latent Semantic Analysis (Scott Deerwester, 1990)
An Introduction to Latent Semantic Analysis (Thomas K. Landauer, 1998, code, note)
Probabilistic Latent Semantic Analysis (Thomas Hofmann, 1999, code, note)
Latent Dirichlet Allocation (David M. Blei, JMLR 2003, code, note)
Correlated Topic Models (David M. Blei, NeurIPS 2005)
A Neural Autoregressive Topic Model (Hugo Larochelle, NeurIPS 2012, code)
LightLDA: Big Topic Models on Modest Computer Clusters (Jinhui Yuan, WWW 2015, code, note)
Short and Sparse Text Topic Modeling via Self-Aggregation (Xiaojun Quan, IJCAI 2015, code)
Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec (Christopher Moody, 2016, code, note)
Topic Modeling of Short Texts: A Pseudo-Document View (Yuan Zuo, KDD 2016)
A Word Embeddings Informed Focused Topic Model (He Zhao, ACML 2017)
Incorporating Knowledge Graph Embeddings into Topic Modeling (Liang Yao, AAAI 2017, note)
ASTM: An Attentional Segmentation Based Topic Model for Short Texts (Jiamiao Wang, ICDM 2018, code)
Short-Text Topic Modeling via Non-negative Matrix Factorization Enriched with Local Word-Context Correlations (Tian Shi, WWW 2018, code)
Improving Topic Quality by Promoting Named Entities in Topic Modeling (Katsiaryna Krasnashchok, ACL 2018)
Inter and Intra Topic Structure Learning with Word Embeddings (He Zhao, ICML 2018, code)
Document Informed Neural Autoregressive Topic Models with Distributional Prior (Pankaj Gupta, AAAI 2019, code)
textTOvec: Deep Contextualized Neural Autoregressive Topic Models of Language with Distributed Compositional Prior (Pankaj Gupta, ICLR 2019, code)
CluWords: Exploiting Semantic Word Clustering Representation for Enhanced Topic Modeling (Felipe Viegas, WSDM 2019)
The Dynamic Embedded Topic Model (Adji B. Dieng, 2019, code)
Topic Modeling in Embedding Spaces (Adji B. Dieng, 2019, code, note)
Neural Mixed Counting Models for Dispersed Topic Discovery (Jiemin Wu, ACL 2020)
Graph Attention Topic Modeling Network (Liang Yang, WWW 2020)
Supervised Topic Modeling
Supervised Topic Models (David M. Blei, NeurIPS 2008)
Labeled LDA: A Supervised Topic Model for Credit Attribution in Multi-Labeled Corpora (Daniel Ramage, EMNLP 2009, code, note)
DiscLDA: Discriminative Learning for Dimensionality Reduction and Classification (Simon Lacoste-Julien, NeurIPS 2009, code)
Replicated Softmax: an Undirected Topic Model (Ruslan Salakhutdinov, NeurIPS 2009)
Partially Labeled Topic Models for Interpretable Text Mining (Daniel Ramage, KDD 2011)
MedLDA: Maximum Margin Supervised Topic Models (Jun Zhu, JMLR 2013)
A Biterm Topic Model for Short Texts (Xiaohui Yan, WWW 2013, code, note)
BTM: Topic Modeling over Short Texts (Xueqi Cheng, TKDE 2014)
Efficient Methods for Incorporating Knowledge into Topic Models (Yi Yang, EMNLP 2015)
Improving Topic Models with Latent Feature Word Representations (Dat Quoc Nguyen, TACL 2015, code)
Topic Modeling for Short Texts with Auxiliary Word Embeddings (Chenliang Li, SIGIR 2016, code)
Efficient Correlated Topic Modeling with Topic Embedding (Junxian He, KDD 2017)
Adapting Topic Models using Lexical Associations with Tree Priors (Weiwei Yang, EMNLP 2017)
MetaLDA: A Topic Model that Efficiently Incorporates Meta Information (He Zhao, ICDM 2017, code)
Anchored Correlation Explanation: Topic Modeling with Minimal Domain Knowledge (Ryan J. Gallagher, TACL 2017, code)
PhraseCTM: Correlated Topic Modeling on Phrases within Markov Random Fields (Weijie Huang, ACL 2018)
Dirichlet Belief Networks for Topic Structure Learning (He Zhao, NeurIPS 2018)
Discriminative Topic Mining via Category-Name Guided Text Embedding (Yu Meng, WWW 2020, code)
Keyphrase Extraction
TextRank: Bringing Order into Texts (Rada Mihalcea, EMNLP 2004, code, note)
Deep Keyphrase Generation (Rui Meng, ACL 2017, code, note)
Semi-Supervised Learning for Neural Keyphrase Generation (Hai Ye, EMNLP 2018)
Keyphrase Generation with Correlation Constraints (Jun Chen, EMNLP 2018)
Title-Guided Encoding for Keyphrase Generation (Wang Chen, AAAI 2019)
An Integrated Approach for Keyphrase Generation via Exploring the Power of Retrieval and Extraction (Wang Chen, NAACL 2019, code, note)
Glocal: Incorporating Global Information in Local Convolution for Keyphrase Extraction (Animesh Prasad, NAACL 2019)
Keyphrase Generation: A Text Summarization Struggle (Erion Cano, NAACL 2019)
Incorporating Linguistic Constraints into Keyphrase Generation (Jing Zhao, ACL 2019)
Topic-Aware Neural Keyphrase Generation for Social Media Language (Yue Wang, ACL 2019, code)
Neural Keyphrase Generation via Reinforcement Learning with Adaptive Rewards (Hou Pong Chan, ACL 2019, code, note)
Using Human Attention to Extract Keyphrase from Microblog Post (Yingyi Zhang, ACL 2019, note)
Open Domain Web Keyphrase Extraction Beyond Language Modeling (Lee Xiong, EMNLP 2019)
Word Segmentation
Adversarial Multi-Criteria Learning for Chinese Word Segmentation (Xinchi Chen, ACL 2017)
State-of-the-art Chinese Word Segmentation with Bi-LSTMs (Ji Ma, 2018, code)
Improving Chinese Word Segmentation with Wordhood Memory Networks (Yuanhe Tian, ACL 2020, code)
A Concise Model for Multi-Criteria Chinese Word Segmentation with Transformer Encoder (Xipeng Qiu, EMNLP 2020, code)
Towards Fast and Accurate Neural Chinese Word Segmentation with Multi-Criteria Learning (Weipeng Huang, COLING 2020)
Spelling Correction
A Spelling Correction Program Based on a Noisy Channel Model (Mark D. Kernighan, COLING 1990)
Context Based Spelling Correction (Eric Mays, 1991)
Structured Prediction
Structured Prediction as Translation Between Augmented Natural Languages (Giovanni Paolini, ICLR 2021)
Sequence Labeling
End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF (Xuezhe Ma, ACL 2016, code, note)
Transfer Learning for Sequence Tagging with Hierarchical Recurrent Networks (Zhilin Yang, ICLR 2017, code, note)
Semi-supervised Multitask Learning for Sequence Labeling (Marek Rei, ACL 2017, note)
Semi-supervised Sequence Tagging with Bidirectional Language Models (Matthew E. Peters, ACL 2017, note)
Empower Sequence Labeling with Task-Aware Neural Language Model (Liyuan Liu, AAAI 2018, code, note)
Contextual String Embeddings for Sequence Labeling (Alan Akbik, COLING 2018, code, note)
Hierarchically-Refined Label Attention Network for Sequence Labeling (Leyang Cui, EMNLP 2019, code, note)
Part-Of-Speech Tagging
Feature-Rich Part-of-Speech Tagging with a Cyclic Dependency Network (Kristina Toutanova, NAACL 2003)
Part-of-Speech Tagging from 97% to 100%: Is It Time for Some Linguistics? (Christopher D. Manning, 2011)
Learning Character-level Representations for Part-of-Speech Tagging (Cícero Nogueira dos Santos, ICML 2014)
Semantic Role Labeling
The Berkeley FrameNet Project (Collin F. Baker, ACL 1998)
Introduction to the CoNLL-2005 Shared Task: Semantic Role Labeling (Xavier Carreras, 2005)
End-to-end Learning of Semantic Role Labeling Using Recurrent Neural Networks (Jie Zhou, ACL 2015, code)
Deep Semantic Role Labeling: What Works and What’s Next (Luheng He, ACL 2017, code, note)
Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling (Diego Marcheggiani, EMNLP 2017, code, note)
A Simple and Accurate Syntax-Agnostic Neural Model for Dependency-based Semantic Role Labeling (Diego Marcheggiani, 2017, code)
Deep Semantic Role Labeling with Self-Attention (Zhixing Tan, AAAI 2018, code, note)
Linguistically-Informed Self-Attention for Semantic Role Labeling (Emma Strubell, EMNLP 2018, code)
A Span Selection Model for Semantic Role Labeling (Hiroki Ouchi, EMNLP 2018, code, note)
Jointly Predicting Predicates and Arguments in Neural Semantic Role Labeling (Luheng He, ACL 2018, code, note)
Dependency or Span, End-to-End Uniform Semantic Role Labeling (Zuchao Li, AAAI 2019, code, note)
Semantic Role Labeling with Associated Memory Network (Chaoyu Guan, NAACL 2019, code)
Entity and Relation Extraction
Entity Extraction
Chinese NER Using Lattice LSTM (Yue Zhang, ACL 2018)
Leverage Lexical Knowledge for Chinese Named Entity Recognition via Collaborative Graph Network (Dianbo Sui, EMNLP 2019)
A Survey on Deep Learning for Named Entity Recognition (Jing Li, TKDE 2020)
A Unified MRC Framework for Named Entity Recognition (Xiaoya Li, ACL 2020, code)
SpanNER: Named Entity Re-/Recognition as Span Prediction (Jinlan Fu, ACL 2021)
Discontinuous Named Entity Recognition as Maximal Clique Discovery (Yucheng Wang, ACL 2021)
Relation Extraction
Kernel Methods for Relation Extraction (Dmitry Zelenko, JMLR 2003)
A Rich Feature Vector for Protein-Protein Interaction Extraction from Multiple Corpora (Makoto Miwa, EMNLP 2009)
Exploiting Syntactico-Semantic Structures for Relation Extraction (Yee Seng Chan, ACL 2011)
Relation Classification via Convolutional Deep Neural Network (Daojian Zeng, COLING 2014)
Attention-Based Bidirectional Long Short-Term Memory Networks for Relation Classification (Peng Zhou, ACL 2016)
End-to-End Relation Extraction using LSTMs on Sequences and Tree Structures (Makoto Miwa, ACL 2016)
Relation Classification via Multi-Level Attention CNNs (Linlin Wang, ACL 2016)
Neural Relation Extraction with Selective Attention over Instances (Yankai Lin, ACL 2016)
Reinforcement Learning for Relation Classification From Noisy Data (Jun Feng, AAAI 2018)
Graph Convolution over Pruned Dependency Trees Improves Relation Extraction (Yuhao Zhang, EMNLP 2018)
A Hierarchical Framework for Relation Extraction with Reinforcement Learning (Ryuichi Takanobu, AAAI 2019)
Attention Guided Graph Convolutional Networks for Relation Extraction (Zhijiang Guo, ACL 2019)
Reasoning with Latent Structure Refinement for Document-Level Relation Extraction (Guoshun Nan, ACL 2020)
Double Graph Based Reasoning for Document-level Relation Extraction (Shuang Zeng, EMNLP 2020)
Document-level Relation Extraction as Semantic Segmentation (Ningyu Zhang, IJCAI 2021)
KnowPrompt: Knowledge-aware Prompt-tuning with Synergistic Optimization for Relation Extraction (Xiang Chen, WWW 2022)
Joint Entity and Relation Extraction
Incremental Joint Extraction of Entity Mentions and Relations (Qi Li, ACL 2014)
Modeling Joint Entity and Relation Extraction with Table Representation (Makoto Miwa, EMNLP 2014)
Joint Extraction of Entities and Relations Based on a Novel Tagging Scheme (Suncong Zheng, 2017)
GraphRel: Modeling Text as Relational Graphs for Joint Entity and Relation Extraction (Tsu-Jui Fu, ACL 2019)
Entity-Relation Extraction as Multi-turn Question Answering (Xiaoya Li, ACL 2019)
Effective Modeling of Encoder-Decoder Architecture for Joint Entity and Relation Extraction (Tapas Nayak, AAAI 2020)
Span-based Joint Entity and Relation Extraction with Transformer Pre-training (Markus Eberts, ECAI 2020)
Two are Better than One: Joint Entity and Relation Extraction with Table-Sequence Encoders (Jue Wang, EMNLP 2020)
TPLinker: Single-stage Joint Extraction of Entities and Relations Through Token Pair Linking (Yucheng Wang, COLING 2020)
A Frustratingly Easy Approach for Entity and Relation Extraction (Zexuan Zhong, NAACL 2021)
Dependency Parsing
Statistical Dependency Analysis with Support Vector Machines (Hiroyasu Yamada, 2003, code)
A Dynamic Oracle for Arc-Eager Dependency Parsing (Yoav Goldberg, COLING 2012, code)
Training Deterministic Parsers with Non-Deterministic Oracles (Yoav Goldberg, TACL 2013, code)
A Fast and Accurate Dependency Parser using Neural Networks (Danqi Chen, EMNLP 2014, code, note)
An Improved Non-monotonic Transition System for Dependency Parsing (Matthew Honnibal, EMNLP 2015)
Simple and Accurate Dependency Parsing Using Bidirectional LSTM Feature Representations (Eliyahu Kiperwasser, TACL 2016, code, note)
Deep Biaffine Attention for Neural Dependency Parsing (Timothy Dozat, ICLR 2017, code, note)
Deep Multitask Learning for Semantic Dependency Parsing (Hao Peng, ACL 2017, code, note)
Simpler but More Accurate Semantic Dependency Parsing (Timothy Dozat, ACL 2018, code)
Multi-Task Semantic Dependency Parsing with Policy Gradient for Learning Easy-First Strategies (Shuhei Kurita, ACL 2019)
Sentiment Analysis
Overview
Opinion Mining and Sentiment Analysis (Bo Pang, 2008)
Sentiment Analysis and Opinion Mining (Bing Liu, 2012)
Dataset
SemEval-2014 Task 4: Aspect Based Sentiment Analysis (Maria Pontiki, SemEval 2014, code, note)
SemEval-2015 Task 12: Aspect Based Sentiment Analysis (Maria Pontiki, SemEval 2015, code, note)
SemEval-2016 Task 5: Aspect Based Sentiment Analysis (Maria Pontiki, SemEval 2016, code)
Sentiment Lexicon
Building Large-Scale Twitter-Specific Sentiment Lexicon: A Representation Learning Approach (Duyu Tang, COLING 2014)
Inducing Domain-Specific Sentiment Lexicons from Unlabeled Corpora (William L. Hamilton, EMNLP 2016)
Sentiment Embedding
Learning Sentiment-Specific Word Embedding for Twitter Sentiment Classification (Duyu Tang, ACL 2014, note)
Sentiment Embeddings with Applications to Sentiment Analysis (Duyu Tang, TKDE 2015)
Refining Word Embeddings for Sentiment Analysis (Liang-Chih Yu, EMNLP 2017)
SenticNet 5: Discovering Conceptual Primitives for Sentiment Analysis by Means of Context Embeddings (Erik Cambria, AAAI 2018, code1, code2)
Emo2Vec: Learning Generalized Emotion Representation by Multi-task Training (Peng Xu, EMNLP 2018 Workshop)
Learning Emotion-enriched Word Representations (Ameeta Agrawal, COLING 2018)
Distributed Representations of Emotion Categories in Emotion Space (Xiangyu Wang, ACL 2021)
Sentiment Classification
Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank (Richard Socher, EMNLP 2013)
Deep Convolutional Neural Networks for Sentiment Analysis of Short Texts (Cícero Nogueira dos Santos, COLING 2014)
Document Modeling with Gated Recurrent Neural Network for Sentiment Classification (Duyu Tang, EMNLP 2015)
Cached Long Short-Term Memory Neural Networks for Document-Level Sentiment Classification (Jiacheng Xu, EMNLP 2016)
Neural Sentiment Classification with User and Product Attention (Huimin Chen, EMNLP 2016)
Combination of Convolutional and Recurrent Neural Network for Sentiment Analysis of Short Texts (Xingyou Wang, COLING 2016)
A Cognition Based Attention Model for Sentiment Analysis (Yunfei Long, EMNLP 2017)
Improving Review Representations with User Attention and Product Attention for Sentiment Classification (Zhen Wu, AAAI 2018)
SNNN: Promoting Word Sentiment and Negation in Neural Sentiment Classification (Qinmin Hu, AAAI 2018)
A Helping Hand: Transfer Learning for Deep Sentiment Analysis (Xin Dong, ACL 2018)
Cold-Start Aware User and Product Attention for Sentiment Classification (Reinald Kim Amplayo, ACL 2018)
A Lexicon-Based Supervised Attention Model for Neural Sentiment Analysis (Yicheng Zou, COLING 2018)
Neural Review Rating Prediction with User and Product Memory (Zhigang Yuan, CIKM 2019)
Sentiment Lexicon Enhanced Neural Sentiment Classification (Chuhan Wu, CIKM 2019)
Opinion Target Extraction
Mining and Summarizing Customer Reviews (Minqing Hu, KDD 2004, note)
Extracting Product Features and Opinions from Reviews (Ana-Maria Popescu, EMNLP 2005)
Modeling Online Reviews with Multi-grain Topic Models (Ivan Titov, WWW 2008, code)
Phrase Dependency Parsing for Opinion Mining (Yuanbin Wu, EMNLP 2009)
A Novel Lexicalized HMM-based Learning Framework for Web Opinion Mining (Wei Jin, ICML 2009)
Structure-Aware Review Mining and Summarization (Fangtao Li, COLING 2010, note)
Opinion Target Extraction in Chinese News Comments (Tengfei Ma, COLING 2010)
Extracting Opinion Targets in a Single- and Cross-Domain Setting (Niklas Jakob, EMNLP 2010)
Opinion Word Expansion and Target Extraction through Double Propagation (Guang Qiu, CL 2011)
Opinion Target Extraction Using Word-Based Translation Model (Kang Liu, EMNLP 2012)
Opinion Target Extraction Using Partially-Supervised Word Alignment Model (Kang Liu, IJCAI 2013)
Exploiting Domain Knowledge in Aspect Extraction (Zhiyuan Chen, EMNLP 2013)
Recursive Neural Conditional Random Fields for Aspect-based Sentiment Analysis (Wenya Wang, EMNLP 2016, code, note)
Improving Opinion Aspect Extraction Using Semantic Similarity and Aspect Associations (Qian Liu, AAAI 2016, note)
Unsupervised Word and Dependency Path Embeddings for Aspect Term Extraction (Yichun Yin, IJCAI 2016, note)
Coupled Multi-Layer Attentions for Co-Extraction of Aspect and Opinion Terms (Wenya Wang, AAAI 2017, code, note)
Recurrent Neural Networks with Auxiliary Labels for Cross-Domain Opinion Target Extraction (Ying Ding, AAAI 2017)
Multi-task Memory Networks for Category-specific Aspect and Opinion Terms Co-extraction (Wenya Wang, 2017)
An Unsupervised Neural Attention Model for Aspect Extraction (Ruidan He, ACL 2017, code, note)
Lifelong Learning CRF for Supervised Aspect Extraction (Lei Shu, ACL 2017)
Deep Multi-Task Learning for Aspect Term Extraction with Memory Interaction (Xin Li, EMNLP 2017, note)
Aspect Term Extraction with History Attention and Selective Transformation (Xin Li, IJCAI 2018, code, note)
Double Embeddings and CNN-based Sequence Labeling for Aspect Extraction (Hu Xu, ACL 2018, code, note)
ExtRA: Extracting Prominent Review Aspects from Customer Feedback (Zhiyi Luo, EMNLP 2018, note)
Target-oriented Opinion Words Extraction with Target-fused Neural Sequence Labeling (Zhifang Fan, NAACL 2019, code, note)
Aspect-Based Sentiment Classification
Target-dependent Twitter Sentiment Classification (Long Jiang, ACL 2011)
Adaptive Recursive Neural Network for Target-dependent Twitter Sentiment Classification (Li Dong, ACL 2014, note)
Target-dependent Twitter Sentiment Classification with Rich Automatic Features (Duy-Tin Vo, IJCAI 2015, code)
Effective LSTMs for Target-Dependent Sentiment Classification (Duyu Tang, COLING 2016, code, note)
Attention-based LSTM for Aspect-level Sentiment Classification (Yequan Wang, EMNLP 2016, code, note)
Aspect Level Sentiment Classification with Deep Memory Network (Duyu Tang, EMNLP 2016, code, note)
A Hierarchical Model of Reviews for Aspect-based Sentiment Analysis (Sebastian Ruder, EMNLP 2016, note)
Interactive Attention Networks for Aspect-Level Sentiment Classification (Dehong Ma, IJCAI 2017, code, note)
Recurrent Attention Network on Memory for Aspect Sentiment Analysis (Peng Chen, EMNLP 2017, code, note)
Attention Modeling for Targeted Sentiment (Jiangming Liu, EACL 2017, code)
Targeted Aspect-Based Sentiment Analysis via Embedding Commonsense Knowledge into an Attentive LSTM (Yukun Ma, AAAI 2018, note)
Modeling Inter-Aspect Dependencies for Aspect-Based Sentiment Analysis (Devamanyu Hazarika, NAACL 2018, code, note)
IARM: Inter-Aspect Relation Modeling with Memory Networks in Aspect-Based Sentiment Analysis (Navonil Majumder, EMNLP 2018, code)
Content Attention Model for Aspect Based Sentiment Analysis (Qiao Liu, WWW 2018, code1, code2, note)
Convolution-based Memory Network for Aspect-based Sentiment Analysis (Chuang Fan, SIGIR 2018)
Aspect Based Sentiment Analysis with Gated Convolutional Networks (Wei Xue, ACL 2018, code, note)
Exploiting Document Knowledge for Aspect-level Sentiment Classification (Ruidan He, ACL 2018, code, note)
Transformation Networks for Target-Oriented Sentiment Classification (Xin Li, ACL 2018, code, note)
Multi-grained Attention Network for Aspect-Level Sentiment Classification (Feifan Fan, EMNLP 2018, note)
A Position-aware Bidirectional Attention Network for Aspect-level Sentiment Analysis (Shuqin Gu, COLING 2018, code, note)
Attentional Encoder Network for Targeted Sentiment Classification (Youwei Song, 2019, code, note)
A Human-Like Semantic Cognition Network for Aspect-Level Sentiment Classification (Zeyang Lei, AAAI 2019, code)
Adapting BERT for Target-Oriented Multimodal Sentiment Classification (Jianfei Yu, IJCAI 2019, code, note)
Deep Mask Memory Networks with Semantic Dependency and Context Moment for Aspect-based Sentiment Analysis (Peiqin Lin, IJCAI 2019, code, note)
BERT Post-Training for Review Reading Comprehension and Aspect-based Sentiment Analysis (Hu Xu, NAACL 2019, code, note)
Utilizing BERT for Aspect-Based Sentiment Analysis via Constructing Auxiliary Sentence (Chi Sun, NAACL 2019, code, note)
Replicate, Walk, and Stop on Syntax: an Effective Neural Network Model for Aspect-Level Sentiment Classification (Yaowei Zheng, AAAI 2020, code)
Inducing Target-Specific Latent Structures for Aspect Sentiment Classification (Chenhua Chen, EMNLP 2020, note)
Tasty Burgers, Soggy Fries: Probing Aspect Robustness in Aspect-Based Sentiment Analysis (Xiaoyu Xing, EMNLP 2020)
Interventional Aspect-Based Sentiment Analysis (Zhen Bi, 2021)
Aspect-Based Sentiment Analysis
A Joint Model of Text and Aspect Ratings for Sentiment Summarization (Ivan Titov, ACL 2008)
Bidirectional Inter-dependencies of Subjective Expressions and Targets and their Value for a Joint Model (Roman Klinger, ACL 2013)
Joint Inference for Fine-grained Opinion Extraction (Bishan Yang, ACL 2013)
Open Domain Targeted Sentiment (Margaret Mitchell, EMNLP 2013)
Joint Modeling of Opinion Expression Extraction and Attribute Classification (Bishan Yang, TACL 2014)
Neural Networks for Open Domain Targeted Sentiment (Meishan Zhang, EMNLP 2015, code)
Learning Latent Sentiment Scopes for Entity-Level Sentiment Analysis (Hao Li, AAAI 2017)
Joint Learning for Targeted Sentiment Analysis (Dehong Ma, EMNLP 2018)
A Unified Model for Opinion Target Extraction and Target Sentiment Prediction (Xin Li, AAAI 2019, code, note)
A Span-based Joint Model for Opinion Target Extraction and Target Sentiment Classification (Yan Zhou, IJCAI 2019)
An Interactive Multi-Task Learning Network for End-to-End Aspect-Based Sentiment Analysis (Ruidan He, ACL 2019, code, note)
DOER: Dual Cross-Shared RNN for Aspect Term-Polarity Co-Extraction (Huaishao Luo, ACL 2019, code, note)
Open-Domain Targeted Sentiment Analysis via Span-Based Extraction and Classification (Minghao Hu, ACL 2019, code, note)
A Shared-Private Representation Model with Coarse-to-Fine Extraction for Target Sentiment Analysis (Peiqin Lin, EMNLP 2020 Findings, code, note)
Understanding Pre-trained BERT for Aspect-based Sentiment Analysis (Hu Xu, COLING 2020, code)
A Unified Generative Framework for Aspect-Based Sentiment Analysis (Hang Yan, ACL 2021)
Emotion Cause Detection
Emotion Cause Events: Corpus Construction and Analysis (Sophia Yat Mei Lee, LREC 2010)
A Text-driven Rule-based System for Emotion Cause Detection (Sophia Yat Mei Lee, NAACL 2010)
Emotion Cause Detection with Linguistic Constructions (Ying Chen, COLING 2010)
EMOCause: An Easy-adaptable Approach to Emotion Cause Contexts (Irene Russo, 2011)
Text-based Emotion Classification Using Emotion Cause Extraction (Weiyuan Li, 2013)
Event-Driven Emotion Cause Extraction with Corpus Construction (Lin Gui, EMNLP 2016)
A Question Answering Approach for Emotion Cause Extraction (Lin Gui, EMNLP 2017)
A Co-Attention Neural Network Model for Emotion Cause Analysis with Emotional Context Awareness (Xiangju Li, EMNLP 2018)
Who Feels What and Why? Annotation of a Literature Corpus with Semantic Roles of Emotions (Evgeny Kim, COLING 2018)
Context-Aware Emotion Cause Analysis with Multi-Attention-Based Neural Network (Xiangju Li, KBS 2019)
Multiple Level Hierarchical Network-Based Clause Selection for Emotion Cause Extraction (Xinyi Yu, IEEE Access 2019)
From Independent Prediction to Reordered Prediction: Integrating Relative Position and Global Label Information to Emotion Cause Identification (Zixiang Ding, AAAI 2019, code, note)
RTHN: A RNN-Transformer Hierarchical Network for Emotion Cause Extraction (Rui Xia, IJCAI 2019, code)
A Knowledge Regularized Hierarchical Approach for Emotion Cause Analysis (Chuang Fan, EMNLP 2019)
Position Bias Mitigation: A Knowledge-Aware Graph Model for Emotion Cause Extraction (Hanqi Yan, ACL 2021, code)
Emotion Cause Analysis
Joint Learning for Emotion Classification and Emotion Cause Detection (Ying Chen, EMNLP 2018)
Emotion-Cause Pair Extraction: A New Task to Emotion Analysis in Texts (Rui Xia, ACL 2019, code, note)
ECPE-2D: Emotion-Cause Pair Extraction based on Joint Two-Dimensional Representation, Interaction and Prediction (Zixiang Ding, ACL 2020, code)
Dialogue System
A Survey on Dialogue Systems: Recent Advances and New Frontiers (Hongshen Chen, 2017, note)
AliMe Chat: A Sequence to Sequence and Rerank based Chatbot Engine (Minghui Qiu, ACL 2017, note)
Neural Approaches to Conversational AI (Jianfeng Gao, SIGIR 2018)
The Design and Implementation of XiaoIce, an Empathetic Social Chatbot (Li Zhou, CL 2020)
Challenges in Building Intelligent Open-domain Dialog Systems (Minlie Huang, TOIS 2020)
Towards a Human-like Open-Domain Chatbot (Daniel Adiwardana, 2020, code)
Dataset
MultiWOZ - A Large-Scale Multi-Domain Wizard-of-Oz Dataset for Task-Oriented Dialogue Modelling (Paweł Budzianowski, EMNLP 2018, code)
Towards Exploiting Background Knowledge for Building Conversation Systems (Nikita Moghe, EMNLP 2018, code)
Training Millions of Personalized Dialogue Agents (Pierre-Emmanuel Mazare, EMNLP 2018)
MultiWOZ 2.1: Multi-Domain Dialogue State Corrections and State Tracking Baselines (Mihail Eric, 2019, code)
Wizard of Wikipedia: Knowledge-Powered Conversational Agents (Emily Dinan, ICLR 2019)
MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversations (Soujanya Poria, ACL 2019, code)
How to Build User Simulators to Train RL-based Dialog Systems (Weiyan Shi, EMNLP 2019, code)
Towards Scalable Multi-domain Conversational Agents: The Schema-Guided Dialogue Dataset (Abhinav Rastogi, AAAI 2020, code, note)
CrossWOZ: A Large-Scale Chinese Cross-Domain Task-Oriented Dialogue Dataset (Qi Zhu, TACL 2020, code, note)
A Large-Scale Chinese Short-Text Conversation Dataset (Yida Wang, NLPCC 2020, code)
Dialogue State Tracking
The Second Dialog State Tracking Challenge (Matthew Henderson, SIGDIAL 2014, code)
Word-Based Dialog State Tracking with Recurrent Neural Networks (Matthew Henderson, SIGDIAL 2014)
Machine Learning for Dialog State Tracking: A Review (Matthew Henderson, 2015)
The Dialog State Tracking Challenge Series: A Review (Jason D. Williams, 2016)
A Network-based End-to-End Trainable Task-oriented Dialogue System (Tsung-Hsien Wen, EACL 2017, code, note)
Neural Belief Tracker: Data-Driven Dialogue State Tracking (Nikola Mrksic, ACL 2017, code, note)
Fully Statistical Neural Belief Tracking (Nikola Mrksic, ACL 2018, code)
Global-Locally Self-Attentive Dialogue State Tracker (Victor Zhong, ACL 2018, code)
Towards Universal Dialogue State Tracking (Liliang Ren, EMNLP 2018, code)
Dialog State Tracking: A Neural Reading Comprehension Approach (Shuyang Gao, SIGDIAL 2019)
Transferable Multi-Domain State Generator for Task-Oriented Dialogue Systems (Chien-Sheng Wu, ACL 2019, code, note)
SUMBT: Slot-Utterance Matching for Universal and Scalable Belief Tracking (Hwaran Lee, ACL 2019)
Scalable and Accurate Dialogue State Tracking via Hierarchical Sequence Generation (Liliang Ren, EMNLP 2019)
HyST: A Hybrid Approach for Flexible and Accurate Dialogue State Tracking (Rahul Goel, Interspeech 2019)
Efficient Dialogue State Tracking by Selectively Overwriting Memory (Sungdong Kim, ACL 2020)
Parallel Interactive Networks for Multi-Domain Dialogue State Generation (Junfan Chen, EMNLP 2020)
Efficient Context and Schema Fusion Networks for Multi-Domain Dialogue State Tracking (Su Zhu, EMNLP 2020 Findings)
GCDST: A Graph-based and Copy-augmented Multi-domain Dialogue State Tracking (Peng Wu, EMNLP 2020 Findings)
Non-Autoregressive Dialog State Tracking (Hung Le, ICLR 2020, code)
Dialogue Act Recognition
Dialogue Act Modeling for Automatic Tagging and Recognition of Conversational Speech (Andreas Stolcke, CL 2000)
Dialogue Act Classification in Domain-Independent Conversations Using a Deep Recurrent Neural Network (Hamed Khanpour, COLING 2016)
Multi-level Gated Recurrent Neural Network for Dialog Act Classification (Wei Li, COLING 2016)
Neural-based Context Representation Learning for Dialog Act Classification (Daniel Ortega, SIGDIAL 2017)
Using Context Information for Dialog Act Classification in DNN Framework (Yang Liu, EMNLP 2017)
A Hierarchical Neural Model for Learning Sequences of Dialogue Acts (Quan Hung Tran, EACL 2017)
Dialogue Act Recognition via CRF-Attentive Structured Network (Zheqian Chen, SIGIR 2018)
Dialogue Act Sequence Labeling using Hierarchical encoder with CRF (Harshit Kumar, AAAI 2018, code)
A Context-based Approach for Dialogue Act Recognition using Simple Recurrent Neural Networks (Chandrakant Bothe, LREC 2018)
Conversational Analysis using Utterance-level Attention-based Bidirectional Recurrent Neural Networks (Chandrakant Bothe, Interspeech 2018)
A Dual-Attention Hierarchical Recurrent Neural Network for Dialogue Act Classification (Ruizhe Li, CoNLL 2019)
Dialogue Act Classification with Context-Aware Self-Attention (Vipul Raheja, NAACL 2019)
Modeling Long-Range Context for Concurrent Dialogue Acts Recognition (Yue Yu, CIKM 2019)
Towards Emotion-aided Multi-modal Dialogue Act Classification (Tulika Saha, ACL 2020)
Integrating User History into Heterogeneous Graph for Dialogue Act Recognition (Dong Wang, COLING 2020)
Dialogue Emotion Recognition
Toward Detecting Emotions in Spoken Dialogs (Chul Min Lee, 2005)
Real-Life Emotions Detection with Lexical and Paralinguistic Cues on Human-Human Call Center Dialogs (Laurence Devillers, 2006)
Conversational Memory Network for Emotion Recognition in Dyadic Dialogue Videos (Devamanyu Hazarika, NAACL 2018)
ICON: Interactive Conversational Memory Network for Multimodal Emotion Detection (Devamanyu Hazarika, EMNLP 2018, note)
DialogueRNN: An Attentive RNN for Emotion Detection in Conversations (Navonil Majumder, AAAI 2019, note)
HiGRU: Hierarchical Gated Recurrent Units for Utterance-level Emotion Recognition (Wenxiang Jiao, NAACL 2019)
Modeling both Context- and Speaker-Sensitive Dependence for Emotion Detection in Multi-speaker Conversations (Dong Zhang, IJCAI 2019)
DialogueGCN: A Graph Convolutional Neural Network for Emotion Recognition in Conversation (Deepanway Ghosal, EMNLP 2019)
Knowledge-Enriched Transformer for Emotion Detection in Textual Conversations (Peixiang Zhong, EMNLP 2019, code, note)
Dialogue Summarization
Unsupervised Abstractive Meeting Summarization with Multi-Sentence Compression and Budgeted Submodular Maximization (Guokan Shang, ACL 2018)
Abstractive Dialogue Summarization with Sentence-Gated Modeling Optimized by Dialogue Acts (Chih-Wen Goo, SLT 2018)
Abstractive Meeting Summarization via Hierarchical Adaptive Segmental Network Learning (Zhou Zhao, WWW 2019)
Automatic Dialogue Summary Generation for Customer Service (Chunyi Liu, KDD 2019)
Topic-aware Pointer-Generator Networks for Summarizing Spoken Conversations (Zhengyuan Liu, ASRU 2019)
Keep Meeting Summaries on Topic: Abstractive Multi-Modal Meeting Summarization (Manling Li, ACL 2019)
A Hierarchical Network for Abstractive Meeting Summarization with Cross-Domain Pretraining (Chenguang Zhu, EMNLP 2020, code)
How Domain Terminology Affects Meeting Summarization Performance (Jia Jin Koay, COLING 2020)
Task-Oriented Dialogue System
Mem2Seq: Effectively Incorporating Knowledge Bases into End-to-End Task-Oriented Dialog Systems (Andrea Madotto, ACL 2018)
Sequicity: Simplifying Task-oriented Dialogue Systems with Single Sequence-to-Sequence Architectures (Wenqiang Lei, ACL 2018, code, note)
Multi-level Memory for Task Oriented Dialogs (Revanth Reddy, NAACL 2019, note)
A Working Memory Model for Task-oriented Dialog Response Generation (Xiuyi Chen, ACL 2019, note)
Global-to-local Memory Pointer Networks for Task-Oriented Dialogue (Chien-Sheng Wu, ICLR 2019, code, note)
Entity-Consistent End-to-end Task-Oriented Dialogue System with KB Retriever (Libo Qin, EMNLP 2019, code)
Hello, It’s GPT-2 – How Can I Help You? Towards the Use of Pretrained Language Models for Task-Oriented Dialogue Systems (Paweł Budzianowski, 2019)
TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue (Chien-Sheng Wu, EMNLP 2020)
MinTL: Minimalist Transfer Learning for Task-Oriented Dialogue Systems (Zhaojiang Lin, EMNLP 2020)
Learning Knowledge Bases with Parameters for Task-Oriented Dialogue Systems (Andrea Madotto, EMNLP 2020 Findings)
Dialogue Modeling and Generation
Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems (Tsung-Hsien Wen, EMNLP 2015)
Building End-to-End Dialogue Systems Using Generative Hierarchical Neural Network Models (Iulian V. Serban, AAAI 2016)
A Diversity-Promoting Objective Function for Neural Conversation Models (Jiwei Li, NAACL 2016, note)
How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation (Chia-Wei Liu, EMNLP 2016)
Deep Reinforcement Learning for Dialogue Generation (Jiwei Li, 2016, note)
A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues (Iulian Serban, AAAI 2017, code)
Mechanism-Aware Neural Machine for Dialogue Response Generation (Ganbin Zhou, AAAI 2017)
A Conditional Variational Framework for Dialog Generation (Xiaoyu Shen, ACL 2017)
Learning Discourse-level Diversity for Neural Dialog Models using Conditional Variational Autoencoders (Tiancheng Zhao, ACL 2017, note)
Generating High-Quality and Informative Conversation Responses with Sequence-to-Sequence Models (Louis Shao, EMNLP 2017)
Improving Variational Encoder-Decoders in Dialogue Generation (Xiaoyu Shen, AAAI 2018)
RUBER: An Unsupervised Method for Automatic Evaluation of Open-Domain Dialog Systems (Chongyang Tao, AAAI 2018)
Hierarchical Variational Memory Network for Dialogue Generation (Hongshen Chen, WWW 2018, code, note)
Variational Autoregressive Decoder for Neural Response Generation (Jiachen Du, EMNLP 2018)
Explicit State Tracking with Semi-Supervision for Neural Dialogue Generation (Xisen Jin, CIKM 2018, code, note)
Generating Informative and Diverse Conversational Responses via Adversarial Information Maximization (Yizhe Zhang, NeurIPS 2018)
Jointly Optimizing Diversity and Relevance in Neural Response Generation (Xiang Gao, NAACL 2019, code)
Domain Adaptive Dialog Generation via Meta Learning (Kun Qian, ACL 2019, code, note)
Pretraining Methods for Dialog Context Representation Learning (Shikib Mehri, ACL 2019, note)
Incremental Transformer with Deliberation Decoder for Document Grounded Conversations (Zekang Li, ACL 2019)
Improving Neural Conversational Models with Entropy-Based Data Filtering (Richard Csaky, ACL 2019, code)
ReCoSa: Detecting the Relevant Contexts with Self-Attention for Multi-turn Dialogue Generation (Hainan Zhang, ACL 2019, code, note)
Semantically Conditioned Dialog Response Generation via Hierarchical Disentangled Self-Attention (Wenhu Chen, ACL 2019, code, note)
Hierarchical Prediction and Adversarial Learning For Conditional Response Generation (Yanran Li, TKDE 2020)
Hierarchical Reinforcement Learning for Open-Domain Dialog (Abdelrhman Saleh, AAAI 2020)
DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation (Yizhe Zhang, ACL 2020, code)
PLATO: Pre-trained Dialogue Generation Model with Discrete Latent Variable (Siqi Bao, ACL 2020)
Learning to Customize Model Structures for Few-shot Dialogue Generation Tasks (Yiping Song, ACL 2020, code)
Conversational Graph Grounded Policy Learning for Open-Domain Conversation Generation (Jun Xu, ACL 2020)
Group-wise Contrastive Learning for Neural Dialogue Generation (Hengyi Cai, EMNLP 2020)
Plug-and-Play Conversational Models (Andrea Madotto, EMNLP 2020 Findings)
An Empirical Investigation of Pre-Trained Transformer Language Models for Open-Domain Dialogue Generation (Piji Li, 2020)
The Adapter-Bot: All-In-One Controllable Conversational Model (Andrea Madotto, 2020)
Variational Transformers for Diverse Response Generation (Zhaojiang Lin, 2020)
Diversifying Dialog Generation via Adaptive Label Smoothing (Yida Wang, ACL 2021)
Stylized Response Generation
Polite Dialogue Generation Without Parallel Data (Tong Niu, TACL 2018)
Structuring Latent Spaces for Stylized Response Generation (Xiang Gao, EMNLP 2019, code)
Stylized Dialogue Response Generation Using Stylized Unpaired Texts (Yinhe Zheng, AAAI 2021)
Empathetic Dialogue Generation
Predicting and Eliciting Addressee’s Emotion in Online Dialogue (Takayuki Hasegawa, ACL 2013)
Large-scale Analysis of Counseling Conversations: An Application of Natural Language Processing to Mental Health (Tim Althoff, TACL 2016, code)
Emotional Chatting Machine: Emotional Conversation Generation with Internal and External Memory (Hao Zhou, AAAI 2018, code)
Eliciting Positive Emotion through Affect-Sensitive Dialogue Response Generation: A Neural Network Approach (Nurul Lubis, AAAI 2018)
Automatic Dialogue Generation with Expressed Emotions (Chenyang Huang, NAACL 2018)
A Syntactically Constrained Bidirectional-Asynchronous Approach for Emotional Conversation Generation (Jingyuan Li, EMNLP 2018)
MOJITALK: Generating Emotional Responses at Scale (Xianda Zhou, EMNLP 2018)
Affective Neural Response Generation (Nabiha Asghar, ECIR 2018)
Topic-Enhanced Emotional Conversation Generation with Attention Mechanism (Yehong Peng, KBS 2019)
Positive Emotion Elicitation in Chat-Based Dialogue Systems (Nurul Lubis, TASLP 2019)
An Affect-Rich Neural Conversational Model with Biased Attention and Weighted Cross-Entropy Loss (Peixiang Zhong, AAAI 2019, code)
Affect-Driven Dialog Generation (Pierre Colombo, NAACL 2019)
Generating Responses with a Specific Emotion in Dialog (Zhenqiao Song, ACL 2019, note)
Towards Empathetic Open-domain Conversation Models: a New Benchmark and Dataset (Hannah Rashkin, ACL 2019)
MoEL: Mixture of Empathetic Listeners (Zhaojiang Lin, EMNLP 2019, code)
CAiRE: An End-to-End Empathetic Chatbot (Zhaojiang Lin, AAAI 2020)
What If Bots Feel Moods? (Lisong Qiu, SIGIR 2020)
EmoElicitor: An Open Domain Response Generation Model with User Emotional Reaction Awareness (Shifeng Li, IJCAI 2020)
CDL: Curriculum Dual Learning for Emotion-Controllable Response Generation (Lei Shen, ACL 2020)
Balancing Objectives in Counseling Conversations: Advancing Forwards or Looking Backwards (Justine Zhang, ACL 2020)
A Computational Approach to Understanding Empathy Expressed in Text-Based Mental Health Support (Ashish Sharma, EMNLP 2020)
MIME: MIMicking Emotions for Empathetic Response Generation (Navonil Majumder, EMNLP 2020)
Towards Empathetic Dialogue Generation over Multi-type Knowledge (Qintong Li, 2020)
EmpDG: Multiresolution Interactive Empathetic Dialogue Generation (Qintong Li, COLING 2020)
Empathetic Response Generation through Graph-based Multi-hop Reasoning on Emotional Causality (Jiashuo Wang, KBS 2021)
Dual-View Conditional Variational Auto-Encoder for Emotional Dialogue Generation (Mei Li, TALLIP 2021)
Towards Facilitating Empathic Conversations in Online Mental Health Support: A Reinforcement Learning Approach (Ashish Sharma, WWW 2021)
Towards Emotional Support Dialog Systems (Siyang Liu, ACL 2021, code, note)
CoMAE: A Multi-factor Hierarchical Framework for Empathetic Response Generation (Chujie Zheng, ACL 2021 Findings)
Perspective-taking and Pragmatics for Generating Empathetic Responses Focused on Emotion Causes (Hyunwoo Kim, EMNLP 2021)
Constructing Emotion Consensus and Utilizing Unpaired Data for Empathetic Dialogue Generation (Lei Shen, EMNLP 2021 Findings)
Improving Empathetic Response Generation by Recognizing Emotion Cause in Conversations (Jun Gao, EMNLP 2021 Findings)
Emotion Eliciting Machine: Emotion Eliciting Conversation Generation based on Dual Generator (Hao Jiang, 2021)
EmpBot: A T5-based Empathetic Chatbot focusing on Sentiments (Emmanouil Zaranis, 2021)
CEM: Commonsense-aware Empathetic Response Generation (Sahand Sabour, AAAI 2022)
Persona-Based Dialogue System
A Persona-Based Neural Conversation Model (Jiwei Li, ACL 2016, code)
Personalizing Dialogue Agents: I have a dog, do you have pets too? (Saizheng Zhang, ACL 2018)
Exploiting Persona Information for Diverse Generation of Conversational Responses (Haoyu Song, IJCAI 2019, code)
Personalizing Dialogue Agents via Meta-Learning (Andrea Madotto, ACL 2019)
Generating Persona Consistent Dialogues by Exploiting Natural Language Inference (Haoyu Song, AAAI 2020)
A Neural Topical Expansion Framework for Unstructured Persona-oriented Dialogue Generation (Minghong Xu, ECAI 2020)
Generate, Delete and Rewrite: A Three-Stage Framework for Improving Persona Consistency of Dialogue Generation (Haoyu Song, ACL 2020)
You Impress Me: Dialogue Generation via Mutual Persona Perception (Qian Liu, ACL 2020, code)
Like hiking? You probably enjoy nature: Persona-grounded Dialog with Commonsense Expansions (Bodhisattwa Prasad Majumder, EMNLP 2020)
Towards Persona-Based Empathetic Conversational Model (Peixiang Zhong, EMNLP 2020, code)
Knowledge-Grounded Dialogue System
Incorporating Loose-Structured Knowledge into LSTM with Recall Gate for Conversation Modeling (IJCNN 2017, note)
A Knowledge-Grounded Neural Conversation Model (Marjan Ghazvininejad, AAAI 2018, code)
Augmenting End-to-End Dialogue Systems With Commonsense Knowledge (Tom Young, AAAI 2018)
Commonsense Knowledge Aware Conversation Generation with Graph Attention (Hao Zhou, IJCAI 2018)
Knowledge Diffusion for Neural Dialogue Generation (Shuman Liu, ACL 2018, note)
Learning to Select Knowledge for Response Generation in Dialog Systems (Rongzhong Lian, IJCAI 2019)
Enhancing Conversational Dialogue Models with Grounded Knowledge (Wen Zheng, CIKM 2019)
Knowledge Aware Conversation Generation with Explainable Reasoning over Augmented Graphs (Zhibin Liu, EMNLP 2019, code)
Thinking Globally, Acting Locally: Distantly Supervised Global-to-Local Knowledge Selection for Background Based Conversation (Pengjie Ren, AAAI 2020)
RefNet: A Reference-Aware Network for Background Based Conversation (Chuan Meng, AAAI 2020)
Low-Resource Knowledge-Grounded Dialogue Generation (Xueliang Zhao, ICLR 2020)
Sequential Latent Knowledge Selection for Knowledge-Grounded Dialogue (Byeongchang Kim, ICLR 2020, code)
TopicKA: Generating Commonsense Knowledge-Aware Dialogue Responses Towards the Recommended Topic Fact (Sixing Wu, IJCAI 2020)
Diverse and Informative Dialogue Generation with Context-Specific Commonsense Knowledge Awareness (Sixing Wu, ACL 2020)
Knowledge-Grounded Dialogue Generation with Pre-trained Language Models (Xueliang Zhao, EMNLP 2020)
Bridging the Gap between Prior and Posterior Knowledge Selection for Knowledge-Grounded Dialogue Generation (Xiuyi Chen, EMNLP 2020)
Retrieval-Free Knowledge-Grounded Dialogue Response Generation with Adapters (Yan Xu, 2021)
Conversational Recommender System
Towards Conversational Recommender Systems (Konstantina Christakopoulou, KDD 2016)
Conversational Recommender System (Yueming Sun, SIGIR 2018)
Towards Deep Conversational Recommendations (Raymond Li, NeurIPS 2018, note)
Towards Knowledge-Based Recommender Dialog System (Qibin Chen, EMNLP 2019, code, note)
Recommendation as a Communication Game: Self-Supervised Bot-Play for Goal-oriented Dialogue (Dongyeop Kang, EMNLP 2019)
Leveraging Historical Interaction Data for Improving Conversational Recommender System (Kun Zhou, CIKM 2020)
Improving Conversational Recommender Systems via Knowledge Graph based Semantic Fusion (Kun Zhou, KDD 2020, note)
Towards Conversational Recommendation over Multi-Type Dialogs (Zeming Liu, ACL 2020, note)
Question Answering and Machine Reading Comprehension
Dataset
MCTest: A Challenge Dataset for the Open-Domain Machine Comprehension of Text (Matthew Richardson, EMNLP 2013)
WikiQA: A Challenge Dataset for Open-Domain Question Answering (Yi Yang, EMNLP 2015)
SQuAD: 100,000+ Questions for Machine Comprehension of Text (Pranav Rajpurkar, 2016, code)
MS MARCO: A Human Generated MAchine Reading COmprehension Dataset (Payal Bajaj, 2016, code, note)
DuReader: a Chinese Machine Reading Comprehension Dataset from Real-world Applications (Wei He, 2017, code)
TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension (Mandar Joshi, ACL 2017, code)
RACE: Large-scale ReAding Comprehension Dataset From Examinations (Guokun Lai, 2017, code)
The NarrativeQA Reading Comprehension Challenge (Tomas Kocisky, TACL 2018, code, note)
Know What You Don’t Know: Unanswerable Questions for SQuAD (Pranav Rajpurkar, ACL 2018, code, note)
CoQA: A Conversational Question Answering Challenge (Siva Reddy, 2018, code)
Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge (Peter Clark, 2018, code)
QuAC: Question Answering in Context (Eunsol Choi, EMNLP 2018, code, note)
A Dataset and Baselines for Sequential Open-Domain Question Answering (Ahmed Elgohary, EMNLP 2018, code)
Interpretation of Natural Language Rules in Conversational Machine Reading (Marzieh Saeidi, EMNLP 2018)
Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering (Todor Mihaylov, EMNLP 2018)
DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs (Dheeru Dua, NAACL 2019, code)
CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge (Alon Talmor, NAACL 2019, note)
Quoref: A Reading Comprehension Dataset with Questions Requiring Coreferential Reasoning (Pradeep Dasigi, EMNLP 2019, code)
COSMOS QA: Machine Reading Comprehension with Contextual Commonsense Reasoning (Lifu Huang, EMNLP 2019)
SocialIQA: Commonsense Reasoning about Social Interactions (Maarten Sap, EMNLP 2019)
PIQA: Reasoning about Physical Commonsense in Natural Language (Yonatan Bisk, AAAI 2020)
CommonsenseQA 2.0: Exposing the Limits of AI through Gamification (Alon Talmor, 2021)
Machine Reading Comprehension
Teaching Machines to Read and Comprehend (Karl Moritz Hermann, NeurIPS 2015, code, note)
Text Understanding with the Attention Sum Reader Network (Rudolf Kadlec, ACL 2016, code, note)
ReasoNet: Learning to Stop Reading in Machine Comprehension (Yelong Shen, KDD 2017, note)
Machine Comprehension Using Match-LSTM and Answer Pointer (Shuohang Wang, ICLR 2017, code, note)
Bidirectional Attention Flow for Machine Comprehension (Minjoon Seo, ICLR 2017, code, note)
Attention-over-Attention Neural Networks for Reading Comprehension (Yiming Cui, ACL 2017, code, note)
Simple and Effective Multi-Paragraph Reading Comprehension (Christopher Clark, 2017, code, note)
Gated Self-Matching Networks for Reading Comprehension and Question Answering (Wenhui Wang, ACL 2017, code, note)
R-Net: Machine Reading Comprehension with Self-Matching Networks (Natural Language Computing Group, 2017, code, note)
QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension (Adams Wei Yu, ICLR 2018, code, note)
Multi-Granularity Hierarchical Attention Fusion Networks for Reading Comprehension and Question Answering (Wei Wang, ACL 2018, code, note)
Multi-Passage Machine Reading Comprehension with Cross-Passage Answer Verification (Yizhong Wang, ACL 2018, note)
Joint Training of Candidate Extraction and Answer Selection for Reading Comprehension (Zhen Wang, ACL 2018, note)
Knowledgeable Reader: Enhancing Cloze-Style Reading Comprehension with External Commonsense Knowledge (Todor Mihaylov, ACL 2018)
Improving Machine Reading Comprehension with General Reading Strategies (Kai Sun, NAACL 2019, code, note)
SG-Net: Syntax-Guided Machine Reading Comprehension (Zhuosheng Zhang, AAAI 2020, code, note)
Retrospective Reader for Machine Reading Comprehension (Zhuosheng Zhang, 2020, note)
Answer Selection
LSTM-based Deep Learning Models for Non-factoid Answer Selection (Ming Tan, ICLR 2016, note)
Hierarchical Attention Flow for Multiple-Choice Reading Comprehension (Haichao Zhu, AAAI 2018)
A Co-Matching Model for Multi-choice Reading Comprehension (Shuohang Wang, ACL 2018)
Option Comparison Network for Multiple-choice Reading Comprehension (Qiu Ran, 2019, note)
Knowledge Based Question Answering
Information Extraction over Structured Data: Question Answering with Freebase (Xuchen Yao, ACL 2014, note)
Question Answering over Freebase with Multi-Column Convolutional Neural Networks (Li Dong, ACL 2015, code, note)
Question Answering on Freebase via Relation Extraction and Textual Evidence (Kun Xu, ACL 2016, note)
Question Answering on Knowledge Bases and Text using Universal Schema and Memory Networks (Rajarshi Das, ACL 2017, note)
Question Answering over Freebase via Attentive RNN with Similarity Matrix based CNN (Yingqi Qu, ISMC 2018, code, note)
Variational Reasoning for Question Answering with Knowledge Graph (Yuyu Zhang, AAAI 2018, note)
Retrieve, Program, Repeat: Complex Knowledge Base Question Answering via Alternate Meta-learning (Yuncheng Hua, IJCAI 2020)
Conversational Question Answering
SDNet: Contextualized Attention-based Deep Network for Conversational Question Answering (Chenguang Zhu, 2018, code, note)
Dialog-to-Action: Conversational Question Answering Over a Large-Scale Knowledge Base (Daya Guo, NeurIPS 2018)
FlowQA: Grasping Flow in History for Conversational Machine Comprehension (Hsin-Yuan Huang, ICLR 2019, code, note)
BERT with History Answer Embedding for Conversational Question Answering (Chen Qu, SIGIR 2019, code)
Look before you Hop: Conversational Question Answering over Knowledge Graphs Using Judicious Context Expansion (Philipp Christmann, CIKM 2019)
Attentive History Selection for Conversational Question Answering (Chen Qu, CIKM 2019, code, note)
Multi-Task Learning for Conversational Question Answering over a Large-Scale Knowledge Base (Tao Shen, EMNLP 2019)
Visual Question Answering
VQA: Visual Question Answering (Aishwarya Agrawal, ICCV 2015)
Hierarchical Question-Image Co-Attention for Visual Question Answering (Jiasen Lu, NeurIPS 2016)
Explicit Knowledge-based Reasoning for Visual Question Answering (Peng Wang, IJCAI 2017)
FVQA: Fact-based Visual Question Answering (Peng Wang, TPAMI 2018, note)
Straight to the Facts: Learning Knowledge Base Retrieval for Factual Visual Question Answering (Medhini Narasimhan, ECCV 2018)
Out of the Box: Reasoning with Graph Convolution Nets for Factual Visual Question Answering (Medhini Narasimhan, NeurIPS 2018)
OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge (Kenneth Marino, CVPR 2019, code, note)
KnowIT VQA: Answering Knowledge-Based Questions about Videos (Noa Garcia, AAAI 2020)
BERT Representations for Video Question Answering (Zekun Yang, WACV 2020)
Knowledge Representation and Reasoning
Knowledge Base
DBpedia: A Nucleus for a Web of Open Data (Soren Auer, 2007, code)
Freebase: A Collaboratively Created Graph Database For Structuring Human Knowledge (Kurt Bollacker, 2008, code)
CN-DBpedia: A Never-Ending Chinese Knowledge Extraction System (Bo Xu, IEA-AIE 2017, code, note)
ConceptNet 5.5: An Open Multilingual Graph of General Knowledge (Robyn Speer, AAAI 2017, code)
ATOMIC: An Atlas of Machine Commonsense for If-Then Reasoning (Maarten Sap, AAAI 2019)
GenericsKB: A Knowledge Base of Generic Statements (Sumithra Bhakthavatsalam, 2020, code)
Knowledge Base Construction
COMET: Commonsense Transformers for Automatic Knowledge Graph Construction (Antoine Bosselut, ACL 2019)
Knowledge Graph Embedding and Completion
Translating Embeddings for Modeling Multi-relational Data (Antoine Bordes, NeurIPS 2013, code, note)
Knowledge Graph Embedding by Translating on Hyperplanes (Zhen Wang, AAAI 2014, code, note)
Learning Entity and Relation Embeddings for Knowledge Graph Completion (Yankai Lin, AAAI 2015, code, note)
Knowledge Graph Embedding via Dynamic Mapping Matrix (Guoliang Ji, ACL 2015, code, note)
TransA: An Adaptive Approach for Knowledge Graph Embedding (Han Xiao, 2015, note)
Modeling Relation Paths for Representation Learning of Knowledge Bases (Yankai Lin, EMNLP 2015, code, note)
TransG: A Generative Model for Knowledge Graph Embedding (Han Xiao, ACL 2016, code, note)
Knowledge Graph Completion with Adaptive Sparse Transfer Matrix (Guoliang Ji, AAAI 2016, code, note)
Knowledge Graph Embedding: A Survey of Approaches and Applications (Quan Wang, TKDE 2017, note)
Convolutional 2D Knowledge Graph Embeddings (Tim Dettmers, AAAI 2018, code, note)
Open-World Knowledge Graph Completion (Baoxu Shi, AAAI 2018, code, note)
One-Shot Relational Learning for Knowledge Graphs (Wenhan Xiong, EMNLP 2018, code, note)
Entity Discovery and Linking
A Generative Entity-Mention Model for Linking Entities with Knowledge Base (Xianpei Han, ACL 2011, note)
Overview of TAC-KBP2014 Entity Discovery and Linking Tasks (Heng Ji, TAC 2014)
An Attentive Neural Architecture for Fine-grained Entity Type Classification (Sonse Shimaoka, 2016, note)
Neural Architectures for Fine-grained Entity Type Classification (Sonse Shimaoka, EACL 2017, code)
Fine-Grained Entity Type Classification by Jointly Learning Representations and Label Embeddings (Abhishek Abhishek, EACL 2017, code)
Neural Fine-Grained Entity Type Classification with Hierarchy-Aware Loss (Peng Xu, NAACL 2018, code)
Ultra-Fine Entity Typing (Eunsol Choi, ACL 2018, note)
Entity Set Expansion
Web-Scale Distributional Similarity and Entity Set Expansion (Patrick Pantel, EMNLP 2009)
EgoSet: Exploiting Word Ego-networks and User-generated Ontology for Multifaceted Set Expansion (Xin Rong, WSDM 2016)
SetExpan: Corpus-Based Set Expansion via Context Feature Selection and Rank Ensemble (Jiaming Shen, ECML PKDD 2017)
HiExpan: Task-Guided Taxonomy Construction by Hierarchical Tree Expansion (Jiaming Shen, KDD 2018)
TaxoExpan: Self-supervised Taxonomy Expansion with Position-Enhanced Graph Neural Network (Jiaming Shen, WWW 2020)
Empower Entity Set Expansion via Language Model Probing (Yunyi Zhang, ACL 2020)
Causal Knowledge
Automatic Extraction of Causal Relations from Natural Language Texts: A Comprehensive Survey (Nabiha Asghar, 2016)
Guided Generation of Cause and Effect (Zhongyang Li, IJCAI 2020)
Knowledge Graph Application
Learning beyond Datasets: Knowledge Graph Augmented Neural Networks for Natural Language Processing (Annervaz K M, NAACL 2018, note)
Entity-Duet Neural Ranking: Understanding the Role of Knowledge Graph Semantics in Neural Information Retrieval (Zhenghao Liu, ACL 2018, code, note)
Coreference Resolution
Deep Reinforcement Learning for Mention-Ranking Coreference Models (Kevin Clark, EMNLP 2016, code)
Improving Coreference Resolution by Learning Entity-Level Distributed Representations (Kevin Clark, ACL 2016, code, note)
Higher-order Coreference Resolution with Coarse-to-fine Inference (Kenton Lee, NAACL 2018, code, note)
Learning Word Representations with Cross-Sentence Dependency for End-to-End Co-reference Resolution (Hongyin Luo, EMNLP 2018)
BERT for Coreference Resolution: Baselines and Analysis (Mandar Joshi, EMNLP 2019, code, note)
GECOR: An End-to-End Generative Ellipsis and Co-reference Resolution Model for Task-Oriented Dialogue (Jun Quan, 2019, note)
Incorporating Structural Information for Better Coreference Resolution (Fang Kong, IJCAI 2019)
End-to-end Deep Reinforcement Learning Based Coreference Resolution (Hongliang Fei, ACL 2019)
The Referential Reader: A Recurrent Entity Network for Anaphora Resolution (Fei Liu, ACL 2019)
Pronoun Resolution
Commonsense Knowledge Enhanced Embeddings for Solving Pronoun Disambiguation Problems in Winograd Schema Challenge (Quan Liu, 2016, note)
WikiCREM: A Large Unsupervised Corpus for Coreference Resolution (Vid Kocijan, 2019)
Look Again at the Syntax: Relational Graph Convolutional Network for Gendered Ambiguous Pronoun Resolution (Yinchuan Xu, 2019, code)
Incorporating Context and External Knowledge for Pronoun Coreference Resolution (Hongming Zhang, NAACL 2019, code, note)
Knowledge-aware Pronoun Coreference Resolution (Hongming Zhang, ACL 2019, code, note)
What You See is What You Get: Visual Pronoun Coreference Resolution in Dialogues (Xintong Yu, EMNLP 2019, note)
Zero Pronoun Resolution
Identification and Resolution of Chinese Zero Pronouns: A Machine Learning Approach (Shanheng Zhao and Hwee Tou Ng, EMNLP 2007)
Chinese Zero Pronoun Resolution: A Joint Unsupervised Discourse-Aware Model Rivaling State-of-the-Art Resolvers (Chen Chen and Vincent Ng, ACL 2015)
Chinese Zero Pronoun Resolution with Deep Neural Networks (Chen Chen and Vincent Ng, ACL 2016)
Chinese Zero Pronoun Resolution with Deep Memory Network (Qingyu Yin, EMNLP 2017)
A Deep Neural Network for Chinese Zero Pronoun Resolution (Qingyu Yin, IJCAI 2017)
Generating and Exploiting Large-Scale Pseudo Training Data for Zero Pronoun Resolution (Ting Liu, ACL 2017, note)
Deep Reinforcement Learning for Chinese Zero Pronoun Resolution (Qingyu Yin, ACL 2018, code, note)
Zero Pronoun Resolution with Attention-based Neural Network (Qingyu Yin, COLING 2018, code, note)
Hierarchical Attention Network with Pairwise Loss for Chinese Zero Pronoun Resolution (Peiqin Lin, AAAI 2020, code, note)
Natural Language Processing for Programming Language
code2seq: Generating sequences from structured representations of code (Uri Alon, ICLR 2019, code)
Evaluating Large Language Models Trained on Code (Mark Chen, 2021)
Code Comment Generation
Towards automatically generating summary comments for Java methods (Giriprasad Sridhara, ASE 2010)
On automatically generating commit messages via summarization of source code changes (Luis Fernando Cortés-Coy, SCAM 2014)
Source code analysis extractive approach to generate textual summary (Kareem Abbas Dawood, 2017)
Code Retrieval
Deep code search (Xiaodong Gu, ICSE 2018)
Natural Language Generation
Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks (Samy Bengio, NeurIPS 2015)
Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation (Albert Gatt, 2017)
Neural Text Generation: A Practical Guide (Ziang Xie, 2017)
A Hybrid Convolutional Variational Autoencoder for Text Generation (Stanislau Semeniuta, 2017)
Natural Language Generation by Hierarchical Decoding with Linguistic Patterns (Shang-Yu Su, NAACL 2018)
Topic-Guided Variational Autoencoders for Text Generation (Wenlin Wang, NAACL 2019)
Generating Long and Informative Reviews with Aspect-Aware Coarse-to-Fine Decoding (Junyi Li, ACL 2019)
Syntax-Infused Variational Autoencoder for Text Generation (Xinyuan Zhang, ACL 2019)
Towards Generating Long and Coherent Text with Multi-Level Latent Variable Models (Dinghan Shen, ACL 2019)
Keeping Notes: Conditional Natural Language Generation with a Scratchpad Mechanism (Ryan Y. Benmalek, ACL 2019)
Scheduled Sampling for Transformers (Tsvetomila Mihaylova, ACL 2019)
Long and Diverse Text Generation with Planning-based Hierarchical Variational Model (Zhihong Shao, EMNLP 2019)
Neural Text Generation With Unlikelihood Training (Sean Welleck, 2019, note)
Best-First Beam Search (Clara Meister, TACL 2020, code, note)
The Curious Case of Neural Text Degeneration (Ari Holtzman, ICLR 2020, note)
Distilling Knowledge Learned in BERT for Text Generation (Yen-Chun Chen, ACL 2020)
Contrastive Learning with Adversarial Perturbations for Conditional Text Generation (Seanie Lee, ICLR 2021, code)
The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics (Sebastian Gehrmann, 2021)
Neural Text Generation with Part-of-Speech Guided Softmax (Zhixian Yang, 2021)
Automatic Metric
Binary Codes Capable of Correcting Deletions, Insertions and Reversals (VI Levenshtein, 1966)
ROUGE: A Package for Automatic Evaluation of Summaries (Chin-Yew Lin, 2004)
∆BLEU: A Discriminative Metric for Generation Tasks with Intrinsically Diverse Targets (Michel Galley, ACL 2015)
Sentence Mover’s Similarity: Automatic Evaluation for Multi-Sentence Texts (Elizabeth Clark, ACL 2019)
MoverScore: Text Generation Evaluating with Contextualized Embeddings and Earth Mover Distance (Wei Zhao, EMNLP 2019)
BERTScore: Evaluating Text Generation with BERT (Tianyi Zhang, ICLR 2020)
BLEURT: Learning Robust Metrics for Text Generation (Thibault Sellam, ACL 2020)
Evaluation of Text Generation: A Survey (Asli Celikyilmaz, 2020)
Lower Perplexity is Not Always Human-Like (Tatsuki Kuribayashi, ACL 2021)
Sequence to Sequence
Sequence to Sequence Learning with Neural Networks (Ilya Sutskever, NeurIPS 2014)
Convolutional Sequence to Sequence Learning (Jonas Gehring, ICML 2017, code, note)
Deliberation Networks: Sequence Generation Beyond One-Pass Decoding (Yingce Xia, NeurIPS 2017)
Deep Reinforcement Learning For Sequence to Sequence Models (Yaser Keneshloo, 2018, code, note)
Von Mises-Fisher Loss for Training Sequence to Sequence Models with Continuous Outputs (Sachin Kumar, ICLR 2019, code)
MASS: Masked Sequence to Sequence Pre-training for Language Generation (Kaitao Song, 2019, code, note)
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension (Mike Lewis, ACL 2020)
Graph to Sequence
Text Generation from Knowledge Graphs with Graph Transformers (Rik Koncel-Kedziorski, NAACL 2019, code, note)
Controlled Text Generation
Toward Controlled Generation of Text (Zhiting Hu, ICML 2017)
Improved Variational Autoencoders for Text Modeling using Dilated Convolutions (Zichao Yang, ICML 2017)
Adversarially Regularized Autoencoders (Jake Zhao, 2017)
T-CVAE: Transformer-Based Conditioned Variational Autoencoder for Story Completion (Tianming Wang, IJCAI 2019, note)
Complementary Auxiliary Classifiers for Label-Conditional Text Generation (Yuan Li, AAAI 2020)
Plug and Play Language Models: a Simple Approach to Controlled Text Generation (Sumanth Dathathri, ICLR 2020)
DExperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts (Alisa Liu, ACL 2021)
AMR-to-Text Generation
Modeling Graph Structure in Transformer for Better AMR-to-Text Generation (Jie Zhu, EMNLP 2019)
Enhancing AMR-to-Text Generation with Dual Graph Representations (Leonardo F. R. Ribeiro, EMNLP 2019, code, note)
Line Graph Enhanced AMR-to-Text Generation with Mix-Order Graph Attention Networks (Yanbin Zhao, ACL 2020)
GPT-too: A language-model-first approach for AMR-to-text generation (Manuel Mager, ACL 2020, code)
Data-to-Text Generation
Data-to-Text Generation with Content Selection and Planning (Ratish Puduppully, AAAI 2019, code, note)
Data-to-text Generation with Entity Modeling (Ratish Puduppully, ACL 2019, code, note)
Table-to-Text Generation with Effective Hierarchical Encoder on Three Dimensions (Row, Column and Time) (Heng Gong, EMNLP 2019)
Paraphrase Generation
Submodular Optimization-based Diverse Paraphrasing and its Effectiveness in Data Augmentation (Ashutosh Kumar, NAACL 2019, code)
A Multi-Task Approach for Disentangling Syntax and Semantics in Sentence Representations (Mingda Chen, NAACL 2019, code)
Paraphrase Generation with Latent Bag of Words (Yao Fu, NeurIPS 2019)
Storytelling
Content Learning with Structure-Aware Writing: A Graph-Infused Dual Conditional Variational Autoencoder for Automatic Storytelling (Meng-Hsuan Yu, AAAI 2021)
Machine Translation
A Statistical Approach To Machine Translation (Peter F. Brown, CL 1990)
Statistical Phrase-Based Translation (Philipp Koehn, NAACL 2003)
Minimum Error Rate Training in Statistical Machine Translation (Franz Josef Och, ACL 2003)
Hierarchical Phrase-Based Translation (David Chiang, CL 2007)
Moses: Open Source Toolkit for Statistical Machine Translation (Philipp Koehn, ACL 2007)
Statistical Machine Translation (Philipp Koehn, 2010)
Learning Phrase Representations Using RNN Encoder-Decoder for Statistical Machine Translation (Kyunghyun Cho, EMNLP 2014, note)
On the Properties of Neural Machine Translation: Encoder–Decoder Approaches (Kyunghyun Cho, 2014, note)
Improving Neural Machine Translation Models with Monolingual Data (Rico Sennrich, ACL 2016)
Modeling Coverage for Neural Machine Translation (Zhaopeng Tu, ACL 2016, note)
Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation (Yonghui Wu, 2016, note)
Neural Machine Translation with Reconstruction (Zhaopeng Tu, AAAI 2017)
Neural Machine Translation and Sequence-to-sequence Models: A Tutorial (Graham Neubig, 2017, note)
Six Challenges for Neural Machine Translation (Philipp Koehn, 2017)
Multi-channel Encoder for Neural Machine Translation (Hao Xiong, AAAI 2018, note)
Translating Pro-Drop Languages with Reconstruction Models (Longyue Wang, AAAI 2018)
Learning to Jointly Translate and Predict Dropped Pronouns with a Shared Reconstruction Mechanism (Longyue Wang, EMNLP 2018)
Mixed Multi-Head Self-Attention for Neural Machine Translation (Hongyi Cui, EMNLP 2019)
Simple, Scalable Adaptation for Neural Machine Translation (Ankur Bapna, EMNLP 2019, note)
Dynamically Composing Domain-Data Selection with Clean-Data Selection by “Co-Curricular Learning” for Neural Machine Translation (Wei Wang, ACL 2019)
Reducing Word Omission Errors in Neural Machine Translation: A Contrastive Learning Approach (Zonghan Yang, ACL 2019)
Bridging the Gap between Training and Inference for Neural Machine Translation (Wen Zhang, ACL 2019)
Neural Machine Translation: A Review and Survey (Felix Stahlberg, 2019)
Towards Making the Most of BERT in Neural Machine Translation (Jiacheng Yang, AAAI 2020)
Mirror-Generative Neural Machine Translation (Zaixiang Zheng, ICLR 2020)
Incorporating BERT into Neural Machine Translation (Jinhua Zhu, ICLR 2020)
Exploring Supervised and Unsupervised Rewards in Machine Translation (Julia Ive, EACL 2020)
Self-Induced Curriculum Learning in Self-Supervised Neural Machine Translation (Dana Ruiter, EMNLP 2020)
Counterfactual Data Augmentation for Neural Machine Translation (Qi Liu, NAACL 2021)
Can Latent Alignments Improve Autoregressive Machine Translation? (Adi Haviv, NAACL 2021)
Mixed Cross Entropy Loss for Neural Machine Translation (Haoran Li, ICML 2021)
Self-Training Sampling with Monolingual Data Uncertainty for Neural Machine Translation (Wenxiang Jiao, ACL 2021)
Neural Machine Translation with Monolingual Translation Memory (Deng Cai, ACL 2021)
On the Language Coverage Bias for Neural Machine Translation (Shuo Wang, ACL 2021 Findings)
Machine Translation Decoding beyond Beam Search (Remi Leblond, 2021)
Phrase-level Active Learning for Neural Machine Translation (Junjie Hu, 2021)
Dataset
Europarl: A Parallel Corpus for Statistical Machine Translation (Philipp Koehn, 2005, code)
The JRC-Acquis: A Multilingual Aligned Parallel Corpus with 20+ Languages (Ralf Steinberger, 2006, code)
Creating a Massively Parallel Bible Corpus (Thomas Mayer, 2014)
A Massively Parallel Corpus: the Bible in 100 Languages (Christos Christodouloupoulos, 2015)
The United Nations Parallel Corpus v1.0 (Michał Ziemski, LREC 2016, code)
JW300: A Wide-Coverage Parallel Corpus for Low-Resource Languages (Zeljko Agic, ACL 2019, code)
WikiMatrix: Mining 135M Parallel Sentences in 1620 Language Pairs from Wikipedia (Holger Schwenk, 2019, code)
Ancient–Modern Chinese Translation with a New Large Training Dataset (Dayiheng Liu, 2020)
Automatic Metric
A New Quantitative Quality Measure for Machine Translation Systems (Keh-Yih Su, COLING 1992)
An Evaluation Tool for Machine Translation: Fast Evaluation for MT Research (Sonja Nießen, LREC 2000)
Using Multiple Edit Distances to Automatically Rank Machine Translation Output (Yasuhiro Akiba, 2001)
BLEU: a Method for Automatic Evaluation of Machine Translation (Kishore Papineni, ACL 2002, note)
Automatic Evaluation of Machine Translation Quality using N-gram Co-Occurrence Statistics (George R Doddington, 2002)
A Novel String-to-String Distance Measure with Applications to Machine Translation Evaluation (Gregor Leusch, 2003)
METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments (Satanjeev Banerjee, ACL 2005 Workshop)
Scientific Credibility of Machine Translation Research: A Meta-Evaluation of 769 Papers (Benjamin Marie, ACL 2021)
To Ship or Not to Ship: An Extensive Evaluation of Automatic Metrics for Machine Translation (Tom Kocmi, 2021)
Vocabulary Coverage
Learning Translation Models from Monolingual Continuous Representations (Kai Zhao, NAACL 2015)
Addressing the Rare Word Problem in Neural Machine Translation (Minh-Thang Luong, ACL 2015)
Neural Machine Translation of Rare Words with Subword Units (Rico Sennrich, ACL 2016, code, note)
Vocabulary Learning via Optimal Transport for Machine Translation (Jingjing Xu, ACL 2021, note)
Word Alignment
The Mathematics of Statistical Machine Translation: Parameter Estimation (Peter F. Brown, CL 1993, code)
HMM-Based Word Alignment in Statistical Translation (Stephan Vogel, COLING 1996)
Models of Translational Equivalence among Words (I. Dan Melamed, CL 2000)
Improved Statistical Alignment Models (Franz Josef Och, ACL 2000)
A Systematic Comparison of Various Statistical Alignment Models (Franz Josef Och, CL 2003)
Log-Linear Models for Word Alignment (Yang Liu, ACL 2005)
NeurAlign: Combining Word Alignments Using Neural Networks (Necip Fazil Ayan, 2005)
Alignment by Agreement (Percy Liang, NAACL 2006)
Measuring Word Alignment Quality for Statistical Machine Translation (Alexander Fraser, CL 2007)
Building a golden collection of parallel Multi-Language Word Alignment (João de Almeida Varelas Graça, LREC 2008, code)
Parallel Implementations of Word Alignment Tool (Qin Gao, 2008)
Unsupervised Word Alignment with Arbitrary Features (Chris Dyer, ACL 2011)
Bayesian Word Alignment for Statistical Machine Translation (Coşkun Mermer, ACL 2011)
Improving the IBM Alignment Models Using Variational Bayes (Darcey Riley, ACL 2012)
Improving Statistical Machine Translation Using Bayesian Word Alignment and Gibbs Sampling (Coşkun Mermer, TASLP 2013)
A Simple, Fast, and Effective Reparameterization of IBM Model 2 (Chris Dyer, NAACL 2013)
A Systematic Bayesian Treatment of the IBM Alignment Models (Yarin Gal, NAACL 2013)
Word Alignment Modeling with Context Dependent Deep Neural Network (Nan Yang, ACL 2013)
Recurrent Neural Networks for Word Alignment Model (Akihiro Tamura, ACL 2014)
Contrastive Unsupervised Word Alignment with Non-Local Features (Yang Liu, AAAI 2015)
Efficient Word Alignment with Markov Chain Monte Carlo (Robert Östling, 2016, code)
Neural Network-based Word Alignment through Score Aggregation (Joel Legrand, 2016)
On The Alignment Problem In Multi-Head Attention-Based Neural Machine Translation (Tamer Alkhouli, 2018)
Jointly Learning to Align and Translate with Transformer Models (Sarthak Garg, EMNLP 2019)
End-to-end Neural Word Alignment Outperforms GIZA++ (Thomas Zenkel, ACL 2020)
A Supervised Word Alignment Method based on Cross-Language Span Prediction using Multilingual BERT (Masaaki Nagata, EMNLP 2020)
Accurate Word Alignment Induction from Neural Machine Translation (Yun Chen, EMNLP 2020)
SimAlign: High Quality Word Alignments Without Parallel Training Data Using Static and Contextualized Embeddings (Masoud Jalili Sabet, EMNLP 2020 Findings)
Neural Baselines for Word Alignment (Anh Khoa Ngo Ho, 2020)
Word Alignment by Fine-tuning Embeddings on Parallel Corpora (Zi-Yi Dou, EACL 2021)
MASK-ALIGN: Self-Supervised Neural Word Alignment (Chi Chen, ACL 2021, code)
A Bidirectional Transformer Based Alignment Model for Unsupervised Word Alignment (Jingyi Zhang, ACL 2021)
ParCourE: A Parallel Corpus Explorer for a Massively Multilingual Corpus (Ayyoob Imani, ACL 2021 demo)
Graph Algorithms for Multiparallel Word Alignment (Ayyoob Imani, EMNLP 2021)
Embedding-Enhanced GIZA++: Improving Alignment in Low- and High-Resource Scenarios Using Embedding Space Geometry (Kelly Marchisio, 2021)
SLUA: A Super Lightweight Unsupervised Word Alignment Model via Cross-Lingual Contrastive Learning (Di Wu, 2021)
Sentence Alignment
Aligning Sentences in Parallel Corpora (Peter F. Brown, ACL 1991)
A Program for Aligning Sentences in Bilingual Corpora (William A. Gale, CL 1993)
Text-Translation Alignment (Martin Kay, 1993)
Using Cognates to Align Sentences in Bilingual Corpora (Michel Simard, 1993)
Bilingual Sentence Alignment: Balancing Robustness and Accuracy (Michel Simard, 1998)
Fast and Accurate Sentence Alignment of Bilingual Corpora (Robert C. Moore, 2002)
Aligning Parallel Bilingual Corpora Statistically with Punctuation Criteria (Thomas C. Chuang, 2005)
Segmentation and Alignment of Parallel Text for Statistical Machine Translation (Yonggang Deng, 2007)
Improved Sentence Alignment for Movie Subtitles (Jörg Tiedemann, 2007)
Linguistically-Based Sub-Sentential Alignment for Terminology Extraction from a Bilingual Automotive Corpus (Lieve Macken, COLING 2008)
Improved Unsupervised Sentence Alignment for Symmetrical and Asymmetrical Parallel Corpora (Fabienne Braune, COLING 2010)
Fast-Champollion: A Fast and Robust Sentence Alignment Algorithm (Peng Li, COLING 2010)
Iterative, MT-based Sentence Alignment of Parallel Texts (Rico Sennrich, 2011)
Yet Another Fast, Robust and Open Source Sentence Aligner. Time to Reconsider Sentence Alignment? (Fethi Lamraoui, 2013)
A Strong Baseline for Learning Cross-Lingual Word Embeddings from Sentence Alignments (Omer Levy, EACL 2017)
Overview of the Second BUCC Shared Task: Spotting Parallel Sentences in Comparable Corpora (Pierre Zweigenbaum, 2017)
Extracting Parallel Sentences with Bidirectional Recurrent Neural Networks to Improve Machine Translation (Francis Gregoire, COLING 2018)
Margin-based Parallel Corpus Mining with Multilingual Sentence Embeddings (Mikel Artetxe, ACL 2019)
Vecalign: Improved Sentence Alignment in Linear Time and Space (Brian Thompson, EMNLP 2019, code)
Parallel Sentence Mining by Constrained Decoding (Pinzhen Chen, ACL 2020)
SpanAlign: Sentence Alignment Method based on Cross-Language Span Prediction and ILP (Katsuki Chousa, COLING 2020)
Document Alignment
Findings of the WMT 2016 Bilingual Document Alignment Shared Task (Christian Buck, WMT 2016)
Unsupervised Machine Translation
Unsupervised Neural Machine Translation (Mikel Artetxe, ICLR 2018, code, note)
Unsupervised Machine Translation Using Monolingual Corpora Only (Guillaume Lample, ICLR 2018)
Phrase-Based & Neural Unsupervised Machine Translation (Guillaume Lample, EMNLP 2018, code, note)
An Effective Approach to Unsupervised Machine Translation (Mikel Artetxe, ACL 2019)
When and Why is Unsupervised Neural Machine Translation Useless? (Yunsu Kim, EAMT 2020)
When Does Unsupervised Machine Translation Work? (Kelly Marchisio, WMT 2020)
Non-Autoregressive Machine Translation
Non-Autoregressive Neural Machine Translation (Jiatao Gu, ICLR 2018, code, note)
Context-Aware Cross-Attention for Non-Autoregressive Translation (Liang Ding, COLING 2020)
Order-Agnostic Cross Entropy for Non-Autoregressive Machine Translation (Cunxiao Du, ICML 2021)
Rejuvenating Low-Frequency Words: Making the Most of Parallel Data in Non-Autoregressive Translation (Liang Ding, ACL 2021)
Low-Resource Machine Translation
Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation (Melvin Johnson, TACL 2017)
Meta-Learning for Low-Resource Neural Machine Translation (Jiatao Gu, EMNLP 2018)
Rapid Adaptation of Neural Machine Translation to New Languages (Graham Neubig, EMNLP 2018)
Generalized Data Augmentation for Low-Resource Translation (Mengzhou Xia, ACL 2019, code)
Revisiting Low-Resource Neural Machine Translation: A Case Study (Rico Sennrich, ACL 2019)
Handling Syntactic Divergence in Low-resource Machine Translation (Chunting Zhou, EMNLP 2019)
The FLoRes Evaluation Datasets for Low-Resource Machine Translation: Nepali-English and Sinhala-English (Francisco Guzman, EMNLP 2019, code)
Cross-Lingual Pre-Training Based Transfer for Zero-Shot Neural Machine Translation (Baijun Ji, AAAI 2020)
Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation (Biao Zhang, ACL 2020)
Harnessing Multilinguality in Unsupervised Machine Translation for Rare Languages (Xavier Garcia, NAACL 2021)
Multilingual Machine Translation
Multilingual Neural Machine Translation with Knowledge Distillation (Xu Tan, ICLR 2019, code)
Multilingual Neural Machine Translation With Soft Decoupled Encoding (Xinyi Wang, ICLR 2019, code, note)
Massively Multilingual Neural Machine Translation (Roee Aharoni, NAACL 2019)
Target Conditioned Sampling: Optimizing Data Selection for Multilingual Neural Machine Translation (Xinyi Wang, ACL 2019)
Effective Cross-lingual Transfer of Neural Machine Translation Models without Shared Vocabularies (Yunsu Kim, ACL 2019, code)
Multilingual Neural Machine Translation with Language Clustering (Xu Tan, EMNLP 2019)
Pivot-based Transfer Learning for Neural Machine Translation between Non-English Languages (Yunsu Kim, EMNLP 2019)
Massively Multilingual Neural Machine Translation in the Wild: Findings and Challenges (Naveen Arivazhagan, 2019)
Multilingual Denoising Pre-training for Neural Machine Translation (Yinhan Liu, TACL 2020)
Balancing Training for Multilingual Neural Machine Translation (Xinyi Wang, ACL 2020)
Knowledge Distillation for Multilingual Unsupervised Neural Machine Translation (Haipeng Sun, ACL 2020)
Leveraging Monolingual Data with Self-Supervision for Multilingual Neural Machine Translation (Aditya Siddhant, ACL 2020)
Beyond English-Centric Multilingual Machine Translation (Angela Fan, 2020)
Explicit Alignment Objectives for Multilingual Bidirectional Encoders (Junjie Hu, NAACL 2021)
Lightweight Adapter Tuning for Multilingual Speech Translation (Hang Le, ACL 2021)
Contrastive Learning for Many-to-many Multilingual Neural Machine Translation (Xiao Pan, ACL 2021)
Multi-Domain Machine Translation
Multi-Domain Neural Machine Translation with Word-Level Domain Context Discrimination (Jiali Zeng, EMNLP 2018)
A Survey of Domain Adaptation for Neural Machine Translation (Chenhui Chu, COLING 2018)
Improving Domain Adaptation Translation with Domain Invariant and Specific Information (Shuhao Gu, NAACL 2019)
Overcoming Catastrophic Forgetting During Domain Adaptation of Neural Machine Translation (Brian Thompson, NAACL 2019)
Domain Adaptation of Neural Machine Translation by Lexicon Induction (Junjie Hu, ACL 2019)
Iterative Dual Domain Adaptation for Neural Machine Translation (EMNLP 2019)
Go From the General to the Particular: Multi-Domain Translation with Domain Transformation Networks (Yong Wang, AAAI 2020)
Learning a Multi-Domain Curriculum for Neural Machine Translation (Wei Wang, ACL 2020)
Multi-Domain Neural Machine Translation with Word-Level Adaptive Layer-wise Domain Mixing (Haoming Jiang, ACL 2020)
Distilling Multiple Domains for Neural Machine Translation (Anna Currey, EMNLP 2020)
Exploring Discriminative Word-Level Domain Contexts for Multi-Domain Neural Machine Translation (Jinsong Su, TPAMI 2021)
Finding Sparse Structures for Domain Specific Neural Machine Translation (Jianze Liang, AAAI 2021)
Pruning-then-Expanding Model for Domain Adaptation of Neural Machine Translation (Shuhao Gu, NAACL 2021)
Multi-Modal Translation
Distilling Translations with Visual Awareness (Julia Ive, ACL 2019)
Latent Variable Model for Multi-modal Translation (Iacer Calixto, ACL 2019)
Multimodal Transformer for Multimodal Machine Translation (Shaowei Yao, ACL 2020)
Tree-Based Machine Translation
Sequence-to-Dependency Neural Machine Translation (Shuangzhi Wu, ACL 2017, note)
Context-Aware Machine Translation
Neural Machine Translation with Extended Context (Jörg Tiedemann, EMNLP 2017 Workshop)
Context-Aware Neural Machine Translation Learns Anaphora Resolution (Elena Voita, ACL 2018)
When a Good Translation is Wrong in Context: Context-Aware Machine Translation Improves on Deixis, Ellipsis, and Lexical Cohesion (Elena Voita, ACL 2019)
Improving Context-Aware Neural Machine Translation with Source-side Monolingual Documents (Linqing Chen, IJCAI 2021)
Measuring and Increasing Context Usage in Context-Aware Machine Translation (Patrick Fernandes, ACL 2021)
Simultaneous Machine Translation
Monotonic Multihead Attention (Xutai Ma, 2019)
Learning Adaptive Segmentation Policy for Simultaneous Translation (Ruiqing Zhang, EMNLP 2020)
Interpretability
Does String-Based Neural MT Learn Source Syntax? (Xing Shi, EMNLP 2016)
Robustness
Synthetic and Natural Noise Both Break Neural Machine Translation (Yonatan Belinkov, ICLR 2018)
Towards Robust Neural Machine Translation (Yong Cheng, ACL 2018)
Robust Neural Machine Translation with Doubly Adversarial Inputs (Yong Cheng, ACL 2019)
Evaluating Robustness to Input Perturbations for Neural Machine Translation (Xing Liu, ACL 2020)
Multilinguality
Language Identification
Word Level Language Identification in Online Multilingual Communication (Dong Nguyen, EMNLP 2013)
Automatic Language Identification in Texts: A Survey (Tommi Jauhiainen, JAIR 2019)
Code Switching
Challenges of Computational Processing of Code-Switching (Özlem Çetinoglu, 2016)
Code-switched Language Models Using Dual RNNs and Same-Source Pretraining (Saurabh Garg, EMNLP 2018)
Language Modeling for Code-Switching: Evaluation, Integration of Monolingual Data, and Discriminative Training (Hila Gonen, EMNLP 2019)
Modeling Code-Switch Languages Using Bilingual Parallel Corpus (Grandee Lee, ACL 2020)
A Survey of Code-switched Speech and Language Processing (Sunayana Sitaram, 2020)
Language Diversity and Evolution
A Statistical Model for Lost Language Decipherment (Benjamin Snyder, ACL 2010)
Comparing Language Similarity across Genetic and Typologically-Based Groupings (Ryan Georgi, COLING 2010)
Unsupervised Transcription of Historical Documents (Taylor Berg-Kirkpatrick, ACL 2013)
The World Atlas of Language Structures Online (Matthew S. Dryer, 2013, code)
Unsupervised Code-Switching for Multilingual Historical Document Transcription (Dan Garrette, NAACL 2015)
An Unsupervised Model of Orthographic Variation for Historical Document Transcription (Dan Garrette, NAACL 2016)
Diachronic Word Embeddings Reveal Statistical Laws of Semantic Change (William L. Hamilton, ACL 2016, code)
Dynamic Word Embeddings (Robert Bamler, ICML 2017)
Learning Language Representations for Typology Prediction (Chaitanya Malaviya, EMNLP 2017)
The CLIN27 Shared Task: Translating Historical Text to Contemporary Language for Improving Automatic Linguistic Annotation (Erik Tjong Kim Sang, 2017)
Cultural Shift or Linguistic Drift? Comparing Two Computational Measures of Semantic Change (William L. Hamilton, 2017)
Diachronic Word Embeddings and Semantic Shifts: A Survey (Andrey Kutuzov, COLING 2018)
Word Embeddings Quantify 100 Years of Gender and Ethnic Stereotypes (Nikhil Garg, 2018)
Modeling Language Variation and Universals: A Survey on Typological Linguistics for Natural Language Processing (Edoardo Maria Ponti, CL 2019)
A Large-Scale Comparison of Historical Text Normalization Systems (Marcel Bollmann, NAACL 2019)
Neural Decipherment via Minimum-Cost Flow: from Ugaritic to Linear B (Jiaming Luo, ACL 2019)
Restoring Ancient Text Using Deep Learning: A Case Study on Greek Epigraphy (Yannis Assael, EMNLP 2019)
The State and Fate of Linguistic Diversity and Inclusion in the NLP World (Pratik Joshi, ACL 2020)
Natural Language Processing for Similar Languages, Varieties, and Dialects: A Survey (Marcos Zampieri, 2020)
Learning to Recognize Dialect Features (Dorottya Demszky, NAACL 2021)
The Classical Language Toolkit: An NLP Framework for Pre-Modern Languages (Kyle P. Johnson, ACL 2021 Demo)
Multilingual Training/Cross-lingual Transfer
Exploiting Similarities among Languages for Machine Translation (Tomas Mikolov, 2013)
Learning Continuous Phrase Representations for Translation Modeling (Jianfeng Gao, ACL 2014)
Normalized Word Embedding and Orthogonal Transform for Bilingual Word Translation (Chao Xing, NAACL 2015)
Word Translation Without Parallel Data (Alexis Conneau, ICLR 2018)
On the Limitations of Unsupervised Bilingual Dictionary Induction (Anders Søgaard, ACL 2018)
Loss in Translation: Learning Bilingual Word Mapping with a Retrieval Criterion (Armand Joulin, EMNLP 2018)
Non-Adversarial Unsupervised Word Translation (Yedid Hoshen, EMNLP 2018, code)
A Survey of Cross-lingual Word Embedding Models (Sebastian Ruder, JAIR 2019)
How to (Properly) Evaluate Cross-Lingual Word Embeddings: On Strong Baselines, Comparative Analyses, and Some Misconceptions (Goran Glavaš, ACL 2019)
Choosing Transfer Languages for Cross-Lingual Learning (Yu-Hsiang Lin, ACL 2019, code)
Cross-lingual Language Model Pretraining (Alexis Conneau, NeurIPS 2019, code)
XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalisation (Junjie Hu, ICML 2020)
Unsupervised Cross-lingual Representation Learning at Scale (Alexis Conneau, ACL 2020, code)
MAD-X: An Adapter-Based Framework for Multi-Task Cross-Lingual Transfer (Jonas Pfeiffer, EMNLP 2020)
Combining Word Embeddings with Bilingual Orthography Embeddings for Bilingual Dictionary Induction (COLING 2020)
Variational Information Bottleneck for Effective Low-Resource Fine-Tuning (Rabeeh Karimi Mahabadi, ICLR 2021)
How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models (Phillip Rust, ACL 2021)
A Closer Look at Few-Shot Crosslingual Transfer: The Choice of Shots Matters (Mengjie Zhao, ACL 2021)
MuRIL: Multilingual Representations for Indian Languages (Simran Khanuja, 2021)
XLM-E: Cross-lingual Language Model Pre-training via ELECTRA (Zewen Chi, 2021)
When Word Embeddings Become Endangered (Khalid Alnajjar, 2021)
Multilingual/Cross-lingual Application
Named Entity Transliteration Generation Leveraging Statistical Machine Translation Technology (Pradeep Dasigi, 2011)
Monolingual and Cross-Lingual Information Retrieval Models Based on (Bilingual) Word Embeddings (Ivan Vulić, SIGIR 2015)
Creating a Translation Matrix of the Bible’s Names Across 591 Languages (Winston Wu, LREC 2018)
UniTrans: Unifying Model Transfer and Data Transfer for Cross-Lingual Named Entity Recognition with Unlabeled Data (Qianhui Wu, IJCAI 2020)
Cross-Lingual Named Entity Recognition Using Parallel Corpus: A New Approach Using XLM-RoBERTa Alignment (Bing Li, 2021)
Interpretability in Natural Language Processing
What you can cram into a single vector: Probing sentence embeddings for linguistic properties (Alexis Conneau, ACL 2018)
Linguistic Knowledge and Transferability of Contextual Representations (Nelson F. Liu, NAACL 2019)
What Does BERT Look At? An Analysis of BERT’s Attention (Kevin Clark, ACL 2019 Workshop)
Revealing the Dark Secrets of BERT (Olga Kovaleva, EMNLP 2019)
How Can We Know What Language Models Know? (Zhengbao Jiang, TACL 2020)
Generating Hierarchical Explanations on Text Classification via Feature Interaction Detection (Hanjie Chen, ACL 2020)
AUTOPROMPT: Eliciting Knowledge from Language Models with Automatically Generated Prompts (Taylor Shin, 2020)
When Do You Need Billions of Words of Pretraining Data? (Yian Zhang, 2020)
SparseBERT: Rethinking the Importance Analysis in Self-attention (Han Shi, ICML 2021)
Superbizarre Is Not Superb: Derivational Morphology Improves BERT’s Interpretation of Complex Words (Valentin Hofmann, ACL 2021)
Fairness in Natural Language Processing
Learning Gender-Neutral Word Embeddings (Jieyu Zhao, EMNLP 2018)
StereoSet: Measuring Stereotypical Bias in Pretrained Language Models (Moin Nadeem, 2020)
Computer Vision
Feature Detector and Descriptor
Towards Automatic Visual Obstacle Avoidance (Hans P. Moravec, 1977)
A Computational Approach to Edge Detection (John Canny, TPAMI 1986)
A Combined Corner and Edge Detector (Chris Harris, 1988)
Scale-Space and Edge Detection Using Anisotropic Diffusion (Pietro Perona, TPAMI 1990)
SUSAN—A New Approach to Low Level Image Processing (Stephen M. Smith, 1997)
Object Recognition from Local Scale-Invariant Features (David G. Lowe, ICCV 1999)
Multiresolution Gray Scale and Rotation Invariant Texture Classification with Local Binary Patterns (Timo Ojala, TPAMI 2002)
Robust Wide Baseline Stereo from Maximally Stable Extremal Regions (Jiri Matas, BMVC 2002)
Image Registration Methods: A Survey (Barbara Zitova, 2003)
Distinctive Image Features from Scale-Invariant Keypoints (David G. Lowe, IJCV 2004)
A Comparison of Affine Region Detectors (Krystian Mikolajczyk, IJCV 2004)
A Performance Evaluation of Local Descriptors (Krystian Mikolajczyk, TPAMI 2005)
Histograms of Oriented Gradients for Human Detection (Navneet Dalal, CVPR 2005)
Machine Learning for High-Speed Corner Detection (Edward Rosten, ECCV 2006)
SURF: Speeded Up Robust Features (Herbert Bay, ECCV 2006)
CenSurE: Center Surround Extremas for Realtime Feature Detection and Matching (Motilal Agrawal, ECCV 2008)
Local Invariant Feature Detectors: A Survey (Tinne Tuytelaars, 2008)
Discriminative Learning of Local Image Descriptors (Matthew Brown, TPAMI 2010)
DAISY: An Efficient Dense Descriptor Applied to Wide-Baseline Stereo (Engin Tola, TPAMI 2010)
ORB: An efficient alternative to SIFT or SURF (Ethan Rublee, ICCV 2011)
LIFT: Learned Invariant Feature Transform (Kwang Moo Yi, ECCV 2016)
Vision Representation
BEiT: BERT Pre-Training of Image Transformers (Hangbo Bao, 2021)
Learning Transferable Visual Models From Natural Language Supervision (Alec Radford, 2021)
Object Detection
Rapid object detection using a boosted cascade of simple features (Paul Viola, CVPR 2001)
Learning To Detect Unseen Object Classes by Between-Class Attribute Transfer (Christoph H. Lampert, CVPR 2009, code, note)
DeViSE: A Deep Visual-Semantic Embedding Model (Andrea Frome, NeurIPS 2013, note)
Rich feature hierarchies for accurate object detection and semantic segmentation (Ross Girshick, CVPR 2014, code, note)
Fast Region-based Convolutional Networks for object detection (Ross Girshick, ICCV 2015, code, note)
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks (Shaoqing Ren, NeurIPS 2015, code, note)
R-FCN: Object Detection via Region-based Fully Convolutional Networks (Jifeng Dai, NeurIPS 2016, code, note)
You Only Look Once: Unified, Real-Time Object Detection (Joseph Redmon, CVPR 2016, code, note)
SSD: Single Shot MultiBox Detector (Wei Liu, ECCV 2016, code, note)
YOLO9000: Better, Faster, Stronger (Joseph Redmon, CVPR 2017, code, note)
Mask R-CNN (Kaiming He, ICCV 2017, code, note)
YOLOv3: An Incremental Improvement (Joseph Redmon, 2018, code, note)
TensorMask: A Foundation for Dense Object Segmentation (Xinlei Chen, 2019, note)
YOLOv4: Optimal Speed and Accuracy of Object Detection (Alexey Bochkovskiy, 2020, code)
RelationNet++: Bridging Visual Representations for Object Detection via Transformer Decoder (Cheng Chi, NeurIPS 2020)
EfficientDet: Scalable and Efficient Object Detection (Mingxing Tan, CVPR 2020)
Semantic Segmentation
Fully Convolutional Networks for Semantic Segmentation (Jonathan Long, CVPR 2015, code, note)
Dual Attention Network for Scene Segmentation (Jun Fu, CVPR 2019)
Image Super-Resolution
Image Super-Resolution Using Deep Convolutional Networks (Chao Dong, TPAMI 2015)
Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network (Wenzhe Shi, CVPR 2016, code, note)
Accurate Image Super-Resolution Using Very Deep Convolutional Networks (Jiwon Kim, CVPR 2016, code, note)
Accelerating the Super-Resolution Convolutional Neural Network (Chao Dong, ECCV 2016, code, note)
Perceptual Losses for Real-Time Style Transfer and Super-Resolution (Justin Johnson, ECCV 2016, code, note)
Image Restoration Using Convolutional Auto-encoders with Symmetric Skip Connections (Xiao-Jiao Mao, 2016, code, note)
Image Super-Resolution Using Dense Skip Connections (Tong Tong, ICCV 2017, code, note)
Image Super-Resolution via Deep Recursive Residual Network (Ying Tai, CVPR 2017, code, note)
Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network (Christian Ledig, CVPR 2017, code, note)
Pixel Recursive Super Resolution (Ryan Dahl, 2017, code)
Deep Back-Projection Networks for Super-Resolution (Muhammad Haris, CVPR 2018, code, note)
Person Re-identification
Joint Discriminative and Generative Learning for Person Re-identification (Zhedong Zheng, CVPR 2019, code)
Machine Learning
Decision Tree
Classification and Regression Trees (L. Breiman, 1984, code1, code2, note)
Induction of Decision Trees (J. Ross Quinlan, 1986, code, note)
C4.5: Programs for Machine Learning (J. Ross Quinlan, 1993, code, note)
Support Vector Machine
A Training Algorithm for Optimal Margin Classifiers (Bernhard E Boser, 1992, code)
Support-Vector Networks (Corinna Cortes, 1995, code)
Estimating the Support of a High-Dimensional Distribution (Bernhard Schölkopf, 1999, code)
New Support Vector Algorithms (Bernhard Schölkopf, 2000, code)
Conditional Random Field
An Introduction to Conditional Random Fields (Charles Sutton, 2010, code, note)
Expectation Maximization
Maximum likelihood from incomplete data via the EM algorithm (Arthur Dempster, 1977, note)
The EM Algorithm and Extensions (Geoff McLachlan, 1997)
Ensemble Method
Greedy function approximation: a gradient boosting machine (Jerome H. Friedman, 2001, code, note)
XGBoost: A Scalable Tree Boosting System (Tianqi Chen, SIGKDD 2016, code, note)
Learning Theory
On the Uniform Convergence of Relative Frequencies of Events to their Probabilities (Vladimir Vapnik, 1971)
A Theory of the Learnable (Leslie Valiant, 1984)
Occam’s Razor (Anselm Blumer, 1987, note)
Imbalanced Data
SMOTE: Synthetic Minority Over-sampling Technique (Nitesh V. Chawla, 2002, code, note)
kNN approach to unbalanced data distributions: A case study involving information extraction (Jianping Zhang, 2003, code)
Balancing Training Data for Automated Annotation of Keywords: a Case Study (Gustavo E. A. P. A. Batista, 2003, code)
A Study of the Behavior of Several Methods for Balancing Machine Learning Training Data (Ronaldo C. Prati, 2004, code)
Borderline-SMOTE: A New Over-Sampling Method in Imbalanced Data Sets Learning (Hui Han, 2005, note)
ADASYN: Adaptive Synthetic Sampling Approach for Imbalanced Learning (Haibo He, 2008, code, note)
Safe-Level-SMOTE: Safe-Level-Synthetic Minority Over-Sampling Technique for Handling the Class Imbalanced Problem (Chumphol Bunkhumpornpat, 2009)
Learning from Imbalanced Data (Haibo He, 2009, note)
Multi-Task Learning
Multiple Kernel Learning, Conic Duality, and the SMO Algorithm (Francis R. Bach, ICML 2004)
Large Scale Multiple Kernel Learning (Sören Sonnenburg, JMLR 2006)
Factorized Latent Spaces with Structured Sparsity (Yangqing Jia, NeurIPS 2010)
Factorized Orthogonal Latent Spaces (Mathieu Salzmann, 2010)
Domain Separation Networks (Konstantinos Bousmalis, NeurIPS 2016, note)
Multi-Task Deep Neural Networks for Natural Language Understanding (Xiaodong Liu, ACL 2019, code, note)
Deep Learning
Artificial Neural Network
A Logical Calculus of the Ideas Immanent in Nervous Activity (Warren McCulloch, 1943)
The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain (Frank Rosenblatt, 1958)
Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms (Frank Rosenblatt, 1961)
Phoneme Recognition Using Time-Delay Neural Networks (Alexander Waibel, 1989)
Convolutional Neural Network
Receptive Fields, Binocular Interaction and Functional Architecture in the Cat’s Visual Cortex (David Hunter Hubel, 1962)
Backpropagation Applied to Handwritten Zip Code Recognition (Yann LeCun, 1989, note)
Gradient-Based Learning Applied to Document Recognition (Yann LeCun, 1998, note)
Notes on Convolutional Neural Networks (Jake Bouvrie, 2006, note)
ImageNet Classification with Deep Convolutional Neural Networks (Alex Krizhevsky, NeurIPS 2012, note)
Simplifying ConvNets for Fast Learning (Franck Mamalet, 2012)
Visualizing and Understanding Convolutional Networks (Matthew D. Zeiler, ECCV 2014, note)
Rigid-Motion Scattering for Texture Classification (Laurent Sifre, 2014)
Going Deeper with Convolutions (Christian Szegedy, CVPR 2015, note)
Very Deep Convolutional Networks for Large-Scale Image Recognition (Karen Simonyan, ICLR 2015, note)
Highway Networks (Rupesh Kumar Srivastava, 2015, note)
Multi-Scale Context Aggregation by Dilated Convolutions (Fisher Yu, ICLR 2016, code, note)
Deep Residual Learning for Image Recognition (Kaiming He, CVPR 2016, note)
Rethinking the Inception Architecture for Computer Vision (Christian Szegedy, CVPR 2016, code, note)
Deep Networks with Stochastic Depth (Gao Huang, ECCV 2016)
Resnet in Resnet: Generalizing Residual Architectures (Sasha Targ, ICLR 2016 Workshop)
Wide Residual Networks (Sergey Zagoruyko, BMVC 2016)
Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning (Christian Szegedy, AAAI 2017)
Densely Connected Convolutional Networks (Gao Huang, CVPR 2017, code)
Aggregated Residual Transformations for Deep Neural Networks (Saining Xie, CVPR 2017)
Xception: Deep Learning with Depthwise Separable Convolutions (Francois Chollet, CVPR 2017)
Deep Roots: Improving CNN Efficiency with Hierarchical Filter Groups (Yani Ioannou, CVPR 2017)
Factorized Convolutional Neural Networks (Min Wang, ICCV 2017)
Deformable Convolutional Networks (Jifeng Dai, ICCV 2017)
Convolution with Logarithmic Filter Groups for Efficient Shallow CNN (Tae Kwan Lee, 2017)
Squeeze-and-Excitation Networks (Jie Hu, CVPR 2018)
Tree-CNN: A Deep Convolutional Neural Network for Lifelong Learning (Deboleena Roy, 2018, code, note)
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks (Mingxing Tan, ICML 2019, code, note)
HBONet: Harmonious Bottleneck on Two Orthogonal Dimensions (Duo Li, ICCV 2019, code)
EfficientNetV2: Smaller Models and Faster Training (Mingxing Tan, 2021)
Recurrent Neural Network
Long Short-Term Memory (Sepp Hochreiter, 1997)
Recurrent Neural Network Regularization (Wojciech Zaremba, ICLR 2015)
A Critical Review of Recurrent Neural Networks for Sequence Learning (Zachary C. Lipton, 2015, note)
Adaptive Computation Time for Recurrent Neural Networks (Alex Graves, 2016, note)
Regularizing and Optimizing LSTM Language Models (Stephen Merity, 2017, note)
Simple Recurrent Units for Highly Parallelizable Recurrence (Tao Lei, EMNLP 2018, code)
Skip RNN: Learning to Skip State Updates in Recurrent Neural Networks (Victor Campos, ICLR 2018, note)
Generative Adversarial Networks
Generative Adversarial Nets (Ian Goodfellow, 2014)
Conditional Generative Adversarial Nets (Mehdi Mirza, 2014, note)
Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks (Alec Radford, ICLR 2016, note)
NeurIPS 2016 Tutorial: Generative Adversarial Networks (Ian Goodfellow, NeurIPS 2016, note)
Conditional Image Synthesis with Auxiliary Classifier GANs (Augustus Odena, 2016, note)
Improved Techniques for Training GANs (Tim Salimans, 2016, note)
Generative Adversarial Networks: An Overview (Antonia Creswell, 2017)
How Generative Adversarial Nets and its variants Work: An Overview of GAN (Yongjun Hong, 2017)
Image-to-Image Translation with Conditional Adversarial Networks (Phillip Isola, CVPR 2017, code, note)
StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks (Han Zhang, ICCV 2017, code, note)
Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks (Jun-Yan Zhu, ICCV 2017, code, note)
High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs (Ting-Chun Wang, CVPR 2018, code, note)
Learning from Simulated and Unsupervised Images through Adversarial Training (Ashish Shrivastava, CVPR 2017, code, note)
Wasserstein GAN (Martin Arjovsky, 2017, code, note)
Progressive Growing of GANs for Improved Quality, Stability, and Variation (Tero Karras, ICLR 2018, code, note)
Transferring GANs: Generating Images from Limited Data (Yaxing Wang, ECCV 2018, code, note)
Self-Supervised GANs via Auxiliary Rotation Loss (Ting Chen, CVPR 2019, code, note)
A Style-Based Generator Architecture for Generative Adversarial Networks (Tero Karras, CVPR 2019, code, note)
Large Scale GAN Training for High Fidelity Natural Image Synthesis (Andrew Brock, ICLR 2019, note)
Self-Attention Generative Adversarial Networks (Han Zhang, ICML 2019, code, note)
Analyzing and Improving the Image Quality of StyleGAN (Tero Karras, CVPR 2020, note)
Generative Adversarial Transformers (Drew A. Hudson, ICML 2021)
Autoencoder
Auto-Association by Multilayer Perceptrons and Singular Value Decomposition (Herve Bourlard, 1988)
Reducing the Dimensionality of Data with Neural Networks (Geoffrey Hinton, 2006)
Extracting and Composing Robust Features with Denoising Autoencoders (Pascal Vincent, ICML 2008)
Contractive Auto-Encoders: Explicit Invariance During Feature Extraction (Salah Rifai, ICML 2011)
Sparse Autoencoder (Andrew Ng, 2011)
Variational Autoencoder
Auto-Encoding Variational Bayes (Diederik P. Kingma, ICLR 2014)
Variational Inference with Normalizing Flows (Danilo Jimenez Rezende, ICML 2015, note)
Learning Structured Output Representation using Deep Conditional Generative Models (Kihyuk Sohn, NeurIPS 2015, note)
Improved Variational Inference with Inverse Autoregressive Flow (Diederik P. Kingma, NeurIPS 2016)
Variational Graph AutoEncoders (Thomas N. Kipf, NeurIPS 2016 Workshop)
beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework (Irina Higgins, ICLR 2017)
Deep Variational Information Bottleneck (Alexander A. Alemi, ICLR 2017)
Variational Lossy Autoencoder (Xi Chen, ICLR 2017)
Neural Discrete Representation Learning (Aaron van den Oord, NeurIPS 2017)
Adversarially Regularized Autoencoders (Junbo Zhao, ICML 2018)
Disentangling by Factorising (Hyunjik Kim, ICML 2018)
Adversarially Regularized Graph Autoencoder for Graph Embedding (Shirui Pan, IJCAI 2018)
Isolating Sources of Disentanglement in VAEs (Ricky T. Q. Chen, NeurIPS 2018)
Learning Disentangled Joint Continuous and Discrete Representations (Emilien Dupont, NeurIPS 2018)
GraphVAE: Towards Generation of Small Graphs Using Variational Autoencoders (Martin Simonovsky, 2018)
Structured Disentangled Representations (Babak Esmaeili, AISTATS 2019)
Generating Diverse High-Fidelity Images with VQ-VAE-2 (Ali Razavi, NeurIPS 2019)
Variational Autoencoders and Nonlinear ICA: A Unifying Framework (Ilyes Khemakhem, AISTATS 2020)
From Variational to Deterministic Autoencoders (Partha Ghosh, ICLR 2020)
Graph Neural Network
The Graph Neural Network Model (Franco Scarselli, 2009, note)
Semi-Supervised Classification with Graph Convolutional Networks (Thomas N. Kipf, ICLR 2017, code, note)
Graph Attention Networks (Petar Veličković, ICLR 2018, code, note)
A Comprehensive Survey on Graph Neural Networks (Zonghan Wu, TNNLS 2019)
Graph Neural Networks for Natural Language Processing: A Survey (Lingfei Wu, 2021)
Capsule Network
Dynamic Routing Between Capsules (Sara Sabour, NeurIPS 2017, code, note)
Matrix Capsules with EM Routing (Geoffrey Hinton, ICLR 2018)
Attention Mechanism
Neural Machine Translation by Jointly Learning to Align and Translate (Dzmitry Bahdanau, ICLR 2015, note)
DiSAN: Directional Self-Attention Network for RNN/CNN-Free Language Understanding (Tao Shen, 2017, code, note)
Learning What’s Easy: Fully Differentiable Neural Easy-First Taggers (Andre F. T. Martins, EMNLP 2017)
Structured Attention Networks (Yoon Kim, ICLR 2017)
You May Not Need Attention (Ofir Press, 2018, code, note)
An Introductory Survey on Attention Mechanisms in NLP Problems (Dichao Hu, 2018, note)
Memory Network
Neural Turing Machines (Alex Graves, 2014, code, note)
Memory Networks (Jason Weston, ICLR 2015, note)
End-To-End Memory Networks (Sainbayar Sukhbaatar, 2015, note)
Ask Me Anything: Dynamic Memory Networks for Natural Language Processing (Ankit Kumar, 2015, note)
Dynamic Memory Networks for Visual and Textual Question Answering (Caiming Xiong, 2016, code, note)
Gated End-to-End Memory Networks (Fei Liu, 2016, note)
Transformer
Attention Is All You Need (Ashish Vaswani, 2017, code1, code2, code3, note)
Self-Attention with Relative Position Representations (Peter Shaw, 2018, note)
Input Combination Strategies for Multi-Source Transformer Decoder (Jindrich Libovicky, WMT 2018)
Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned (Elena Voita, ACL 2019)
Universal Transformers (Mostafa Dehghani, ICLR 2019, code, note)
Adaptive Attention Span in Transformers (Sainbayar Sukhbaatar, ACL 2019)
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context (Zihang Dai, ACL 2019, note)
Adaptively Sparse Transformers (Goncalo M. Correia, EMNLP 2019)
Self-Attention with Structural Position Representations (Xing Wang, EMNLP 2019)
Tree Transformer: Integrating Tree Structures into Self-Attention (Yau-Shian Wang, EMNLP 2019)
Star-Transformer (Qipeng Guo, 2019, note)
Reformer: The Efficient Transformer (Nikita Kitaev, ICLR 2020)
How Does Selective Mechanism Improve Self-Attention Networks? (Xinwei Geng, ACL 2020)
The Unstoppable Rise of Computational Linguistics in Deep Learning (James Henderson, ACL 2020)
Self-Attention with Cross-Lingual Position Representation (Liang Ding, ACL 2020)
O(n) Connections are Expressive Enough: Universal Approximability of Sparse Transformers (Chulhee Yun, NeurIPS 2020)
ETC: Encoding Long and Structured Data in Transformers (Joshua Ainslie, EMNLP 2020)
On the Sub-layer Functionalities of Transformer Decoder (Yilin Yang, EMNLP 2020 Findings)
Longformer: The Long-Document Transformer (Iz Beltagy, 2020, note)
Linformer: Self-Attention with Linear Complexity (Sinong Wang, 2020)
Multi-Head Attention: Collaborate Instead of Concatenate (Jean-Baptiste Cordonnier, 2020, code, note)
Efficient Content-Based Sparse Attention with Routing Transformers (Aurko Roy, TACL 2021)
Synthesizer: Rethinking Self-Attention for Transformer Models (Yi Tay, ICML 2021, note)
Relative Positional Encoding for Transformers with Linear Complexity (Antoine Liutkus, ICML 2021)
The Case for Translation-Invariant Self-Attention in Transformer-Based Language Models (Ulme Wennberg, ACL 2021)
Not All Attention Is All You Need (Hongqiu Wu, 2021)
Going Beyond Linear Transformers with Recurrent Fast Weight Programmers (Kazuki Irie, 2021)
Sparse Attention
From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification (Andre F. T. Martins, ICML 2016, code, note)
A Regularized Framework for Sparse and Structured Neural Attention (Vlad Niculae, NeurIPS 2017)
Sparse and Constrained Attention for Neural Machine Translation (Chaitanya Malaviya, ACL 2018)
On Controllable Sparse Alternatives to Softmax (Anirban Laha, NeurIPS 2018)
Sparse Sequence-to-Sequence Models (Ben Peters, ACL 2019, note)
Is Sparse Attention more Interpretable? (Clara Meister, ACL 2021)
Sparse Attention with Linear Units (Biao Zhang, 2021)
Optimization
Learning representations by back-propagating errors (David E. Rumelhart, 1986)
On the Momentum Term in Gradient Descent Learning Algorithms (Ning Qian, 1999)
Adaptive Subgradient Methods for Online Learning and Stochastic Optimization (John Duchi, JMLR 2011)
Sequential Model-Based Optimization for General Algorithm Configuration (Frank Hutter, LION 2011)
ADADELTA: An Adaptive Learning Rate Method (Matthew D. Zeiler, 2012, note)
Adam: A Method for Stochastic Optimization (Diederik P. Kingma, 2015, note)
An overview of gradient descent optimization algorithms (Sebastian Ruder, 2017, note)
Weight Initialization
Understanding the difficulty of training deep feedforward neural networks (Xavier Glorot, AISTATS 2010, note)
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification (Kaiming He, ICCV 2015, note)
Loss Function
FaceNet: A Unified Embedding for Face Recognition and Clustering (Florian Schroff, CVPR 2015, code, note)
A Discriminative Feature Learning Approach for Deep Face Recognition (Yandong Wen, ECCV 2016, code, note)
An exploration of softmax alternatives belonging to the spherical loss family (Alexandre de Brebisson, ICLR 2016)
Large-Margin Softmax Loss for Convolutional Neural Networks (Weiyang Liu, ICML 2016, code, note)
Focal Loss for Dense Object Detection (Tsung-Yi Lin, ICCV 2017, code, note)
SphereFace: Deep Hypersphere Embedding for Face Recognition (Weiyang Liu, CVPR 2017, code, note)
CosFace: Large Margin Cosine Loss for Deep Face Recognition (Hao Wang, 2018, code, note)
ArcFace: Additive Angular Margin Loss for Deep Face Recognition (Jiankang Deng, 2018, code, note)
Additive Margin Softmax for Face Verification (Feng Wang, 2018, code, note)
DropMax: Adaptive Variational Softmax (Hae Beom Lee, NeurIPS 2018)
Activation Function
Rectified Linear Units Improve Restricted Boltzmann Machines (Vinod Nair, ICML 2010)
Empirical Evaluation of Rectified Activations in Convolutional Network (Bing Xu, 2015)
Understanding and Improving Convolutional Neural Networks via Concatenated Rectified Linear Units (Wenling Shang, ICML 2016, note)
Gaussian Error Linear Units (GELUs) (Dan Hendrycks, 2016, code, note)
Categorical Reparameterization with Gumbel-Softmax (Eric Jang, ICLR 2017, note)
Normalization
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift (Sergey Ioffe, ICML 2015, note)
Layer Normalization (Jimmy Lei Ba, 2016, note)
Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks (Tim Salimans, 2016)
Instance Normalization: The Missing Ingredient for Fast Stylization (Dmitry Ulyanov, 2017)
Group Normalization (Yuxin Wu, 2018, note)
Regularization
Just Train Twice: Improving Group Robustness without Training Group Information (Evan Zheran Liu, ICML 2021)
R-Drop: Regularized Dropout for Neural Networks (Xiaobo Liang, 2021)
Visualization
Visualizing Data using t-SNE (Laurens van der Maaten, JMLR 2008)
Reinforcement Learning
A Brief Survey of Deep Reinforcement Learning (Kai Arulkumaran, 2017, note)
Deep Reinforcement Learning: An Overview (Yuxi Li, 2017, note)
Reinforcement Learning Application
Playing Atari with Deep Reinforcement Learning (Volodymyr Mnih, 2013, note)
Mastering the Game of Go with Deep Neural Networks and Tree Search (David Silver, Nature 2016)
Mastering the Game of Go without Human Knowledge (David Silver, Nature 2017, code, note)
Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm (David Silver, 2017)
DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning (Daochen Zha, ICML 2021)
Self-Supervised Learning
Dimensionality Reduction by Learning an Invariant Mapping (Raia Hadsell, CVPR 2006)
Pixel Recurrent Neural Networks (Aaron van den Oord, ICML 2016)
Conditional Image Generation with PixelCNN Decoders (Aaron van den Oord, NeurIPS 2016)
Improved Deep Metric Learning with Multi-class N-pair Loss Objective (Kihyuk Sohn, NeurIPS 2016)
Representation Learning with Contrastive Predictive Coding (Aaron van den Oord, 2018)
Learning Deep Representations by Mutual Information Estimation and Maximization (R Devon Hjelm, ICLR 2019)
Learning Representations by Maximizing Mutual Information Across Views (Philip Bachman, 2019, code)
Contrastive Multiview Coding (Yonglong Tian, 2019)
Momentum Contrast for Unsupervised Visual Representation Learning (Kaiming He, CVPR 2020)
A Simple Framework for Contrastive Learning of Visual Representations (Ting Chen, ICML 2020)
Unsupervised Learning of Visual Features by Contrasting Cluster Assignments (Mathilde Caron, NeurIPS 2020)
Improved Baselines with Momentum Contrastive Learning (Xinlei Chen, 2020)
Exploring Simple Siamese Representation Learning (Xinlei Chen, 2020, note)
Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning (Jean-Bastien Grill, 2020, code)
Self-supervised Learning: Generative or Contrastive (Xiao Liu, TKDE 2021)
Barlow Twins: Self-Supervised Learning via Redundancy Reduction (Jure Zbontar, ICML 2021, code)
Understanding Self-Supervised Learning Dynamics without Contrastive Pairs (Yuandong Tian, ICML 2021)
An Empirical Study of Training Self-Supervised Vision Transformers (Xinlei Chen, 2021)
Incremental/Continual/Lifelong Learning
iCaRL: Incremental Classifier and Representation Learning (Sylvestre-Alvise Rebuffi, CVPR 2017)
Gradient Episodic Memory for Continual Learning (David Lopez-Paz, NeurIPS 2017)
Continual Learning with Deep Generative Replay (Hanul Shin, NeurIPS 2017)
End-to-End Incremental Learning (Francisco M. Castro, ECCV 2018)
Lifelong Machine Learning (Zhiyuan Chen, 2018)
Large Scale Incremental Learning (Yue Wu, CVPR 2019)
Learning a Unified Classifier Incrementally via Rebalancing (Saihui Hou, CVPR 2019)
Continual Lifelong Learning with Neural Networks: A Review (German I. Parisi, 2019)
Mnemonics Training: Multi-Class Incremental Learning without Forgetting (Yaoyao Liu, CVPR 2020)
Zero-Shot Learning
Zero-Shot Learning with Semantic Output Codes (Mark Palatucci, NeurIPS 2009, note)
Label-Embedding for Attribute-Based Classification (Zeynep Akata, CVPR 2013, note)
Zero-Shot Recognition using Dual Visual-Semantic Mapping Paths (Yanan Li, CVPR 2017, note)
Few-Shot Learning
Learning from one example through shared densities on transforms (Erik G. Miller, CVPR 2000)
One-Shot Learning of Object Categories (Li Fei-Fei, PAMI 2006, note)
Siamese neural networks for one-shot image recognition (Gregory Koch, ICML 2015 Workshop, note)
Matching Networks for One Shot Learning (Oriol Vinyals, NeurIPS 2016, note)
Meta Learning
Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks (Chelsea Finn, ICML 2017, code, note)
Meta-Learning: A Survey (Joaquin Vanschoren, 2018)
Meta-Learning Representations for Continual Learning (Khurram Javed, NeurIPS 2019, code)
Learning to Continually Learn (Shawn Beaulieu, ECAI 2020, code)
Curriculum Learning
Curriculum Learning (Yoshua Bengio, ICML 2009, note)
Self-Paced Curriculum Learning (Lu Jiang, AAAI 2015)
Automated Curriculum Learning for Neural Networks (Alex Graves, ICML 2017)
Federated Learning
Federated Machine Learning: Concept and Applications (Qiang Yang, 2019)
Advances and Open Problems in Federated Learning (Peter Kairouz, 2019)
Towards Federated Learning at Scale: System Design (Keith Bonawitz, 2019)
SecureBoost: A Lossless Federated Learning Framework (Kewei Cheng, 2019)
Personalized Federated Learning with Theoretical Guarantees: A Model-Agnostic Meta-Learning Approach (Alireza Fallah, NeurIPS 2020)
Cluster-driven Graph Federated Learning over Multiple Domains (Debora Caldarola, CVPR 2021)
Model Compression and Acceleration
A Survey of Model Compression and Acceleration for Deep Neural Networks (Yu Cheng, 2017)
Parameter Pruning and Quantization
Rethinking the Value of Network Pruning (Zhuang Liu, ICLR 2019, code, note)
Human-Designed Model
MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications (Andrew Howard, 2017)
MobileNetV2: Inverted Residuals and Linear Bottlenecks (Mark Sandler, CVPR 2018, code)
ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices (Xiangyu Zhang, CVPR 2018)
ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design (Ningning Ma, ECCV 2018)
GhostNet: More Features from Cheap Operations (Kai Han, CVPR 2020)
Knowledge Distillation
Distilling the Knowledge in a Neural Network (Geoffrey Hinton, NeurIPS 2014 Workshop, note)
Designing Network Design Spaces (Ilija Radosavovic, CVPR 2020)
Knowledge Distillation: A Survey (Jianping Gou, IJCV 2021)
Does Knowledge Distillation Really Work? (Samuel Stanton, 2021)
Neural Architecture Search
Neural Architecture Search with Reinforcement Learning (Barret Zoph, ICLR 2017, note)
Learning Transferable Architectures for Scalable Image Recognition (Barret Zoph, CVPR 2018, note)
Progressive Neural Architecture Search (Chenxi Liu, ECCV 2018, note)
NetAdapt: Platform-Aware Neural Network Adaptation for Mobile Applications (Tien-Ju Yang, ECCV 2018, code, note)
Efficient Neural Architecture Search via Parameter Sharing (Hieu Pham, ICML 2018, note)
Regularized Evolution for Image Classifier Architecture Search (Esteban Real, AAAI 2019, note)
DARTS: Differentiable Architecture Search (Hanxiao Liu, ICLR 2019, code, note)
ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware (Han Cai, ICLR 2019, code, note)
MnasNet: Platform-Aware Neural Architecture Search for Mobile (Mingxing Tan, CVPR 2019, code, note)
Searching for MobileNetV3 (Andrew Howard, ICCV 2019, note)
Neural Architecture Search: A Survey (Thomas Elsken, JMLR 2019, note)
Neural Architecture Design for GPU-Efficient Networks (Ming Lin, 2020, code, note)
Interpretability
A Survey on Neural Network Interpretability (Yu Zhang, 2020)
Evidential Deep Learning and Uncertainty
Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning (Yarin Gal, ICML 2016, note)
A Theoretically Grounded Application of Dropout in Recurrent Neural Networks (Yarin Gal, NeurIPS 2016)
What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision? (Alex Kendall, NeurIPS 2017, note)
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles (Balaji Lakshminarayanan, NeurIPS 2017)
Evidential Deep Learning to Quantify Classification Uncertainty (Murat Sensoy, NeurIPS 2018)
Can You Trust Your Model’s Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift (Yaniv Ovadia, NeurIPS 2019)
Recommendation System
Deep Learning based Recommender System: A Survey and New Perspectives (Shuai Zhang, 2017, code, note)
A review on deep learning for recommender systems: challenges and remedies (Zeynep Batmaz, 2018, note)
News Recommendation
Google News Personalization: Scalable Online Collaborative Filtering (Abhinandan Das, WWW 2007, note)
Personalized Recommendation on Dynamic Content Using Predictive Bilinear Models (Wei Chu, WWW 2009, note)
Personalized News Recommendation Based on Click Behavior (Jiahui Liu, 2010, note)
A Contextual-Bandit Approach to Personalized News Article Recommendation (Lihong Li, WWW 2010, note)
A Multi-View Deep Learning Approach for Cross Domain User Modeling in Recommendation Systems (Ali Elkahky, WWW 2015, note)
Click-Through Rate
Factorization Machines (Steffen Rendle, ICDM 2010, note)
Higher-Order Factorization Machines (Mathieu Blondel, NeurIPS 2016)
Field-aware Factorization Machines for CTR Prediction (Yu-Chin Juan, RecSys 2016, note)
DeepFM: A Factorization-Machine based Neural Network for CTR Prediction (Huifeng Guo, IJCAI 2017, note)
Deep Interest Network for Click-Through Rate Prediction (Guorui Zhou, KDD 2018, note)
Deep Interest Evolution Network for Click-Through Rate Prediction (Guorui Zhou, AAAI 2019, code, note)
Representation Learning-Assisted Click-Through Rate Prediction (Wentao Ouyang, IJCAI 2019, note)
Deep Spatio-Temporal Neural Networks for Click-Through Rate Prediction (Wentao Ouyang, KDD 2019, code, note)
Feature Generation by Convolutional Neural Network for Click-Through Rate Prediction (Bin Liu, WWW 2019, note)
Interpretable Click-Through Rate Prediction through Hierarchical Attention (Zeyu Li, WSDM 2020)
User Behavior Retrieval for Click-Through Rate Prediction (Jiarui Qin, SIGIR 2020)
Deep Time-Stream Framework for Click-through Rate Prediction by Tracking Interest Evolution (Shu-Ting Shi, AAAI 2020)
Deep Match to Rank Model for Personalized Click-Through Rate Prediction (Zequn Lyu, AAAI 2020, code, note)
Deep Multi-Interest Network for Click-through Rate Prediction (Zhibo Xiao, CIKM 2020)
Big Data
The Google File System (Sanjay Ghemawat, SOSP 2003, note)
MapReduce: Simplified Data Processing on Large Clusters (Jeffrey Dean, OSDI 2004, note)
Bigtable: A Distributed Storage System for Structured Data (Fay Chang, OSDI 2006, note)
Dynamo: Amazon’s Highly Available Key-Value Store (Giuseppe DeCandia, SIGOPS 2007)
Cassandra: A Decentralized Structured Storage System (Avinash Lakshman, SIGOPS 2010)
Tool
FudanNLP: A Toolkit for Chinese Natural Language Processing (Xipeng Qiu, ACL 2013, code)
LIBSVM: A library for support vector machines (Chih-Chung Chang, 2011, code, note)
HemI: A Toolkit for Illustrating Heatmaps (Wankun Deng, 2014, code)