    2018-07-16 21:46:41

    Deep Learning Papers

    0. The "Bible" of Deep Learning

    When it comes to introductory books, one cannot go without mentioning Deep Learning, co-authored by Yoshua Bengio, Ian J. Goodfellow, and Aaron Courville.

    "This textbook on deep learning is intended to help students and practitioners enter the field of machine learning, with a focus on deep learning." Notably, this MIT Press "book" has been updated and refined online in real time for several years, continually incorporating new research results and references; it is also open to public comments and revision suggestions. Its popularity has even earned it the title of the "bible" of deep learning. The book can currently be pre-ordered on Amazon and should reach you by the end of the year.

    Read Deep Learning online at: http://www.deeplearningbook.org/

    1. Survey

    Yann LeCun, Yoshua Bengio, and Geoffrey Hinton are hailed as the three kings of deep learning. Their Nature paper "Deep Learning" contains an extensive survey of the field. Five stars; well worth a read!

    [1] http://www.cs.toronto.edu/~hinton/absps/NatureDeepReview.pdf

    2. Building a Deep Learning Knowledge Base

    As a leading figure in AI, Geoffrey Hinton currently works at Google, and his seminal paper with Simon Osindero and Yee-Whye Teh, "A fast learning algorithm for deep belief nets," is treated as gospel; take a look.

    [2] http://www.cs.toronto.edu/~hinton/absps/ncfast.pdf

    He also first-authored "Reducing the dimensionality of data with neural networks," which can fairly be called a milestone of deep learning.

    [3] http://www.cs.toronto.edu/~hinton/science.pdf

    3. The ImageNet Revolution

    Once you have read the papers above, you should have a rough picture of deep learning. So where was the breakthrough? In 2012, Krizhevsky's "Imagenet classification with deep convolutional neural networks" marked a breakthrough in neural network research. Get on board before it's too late; five-star recommendation.

    [4] http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf

    How much does depth matter to a network? "Very deep convolutional networks for large-scale image recognition," written in 2014 by Karen Simonyan and Andrew Zisserman of Oxford's Visual Geometry Group (VGG), examines the importance of depth and builds a 19-layer deep network with excellent results. The paper placed first in localization and second in classification at ILSVRC.

    [5] https://arxiv.org/pdf/1409.1556.pdf

    To understand how neural network architectures have been improved, this one is a must-read. Christian Szegedy, a noted computer scientist, and colleagues wrote "Going deeper with convolutions" in 2015 for the ImageNet 2014 competition, where their method won first place in both the classification (task 1) and detection (task 2) tracks. The paper focuses on efficient deep network architectures for computer vision: by improving the network's structure, its depth, and thus its accuracy, can be increased without raising the computational budget.

    [6] http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Szegedy_Going_Deeper_With_2015_CVPR_paper.pdf

    In the sixth annual ImageNet image-recognition challenge, Microsoft Research's computer vision system took first place in several categories, beating systems from Google, Intel, Qualcomm, Tencent, and various startups and academic labs. The winning system, "Deep Residual Learning for Image Recognition," was developed by Microsoft researchers Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. The paper recording this team's work is absolutely essential reading; five stars.

    [7] https://arxiv.org/pdf/1512.03385.pdf

    4. The Wonders of Speech Recognition

    "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups," co-authored by Geoffrey Hinton and other experts, was a huge breakthrough in speech recognition. It brings together case studies from four research groups using deep neural networks for acoustic modeling.

    [8] http://cs224d.stanford.edu/papers/maas_paper.pdf

    Beyond the papers above, "Speech recognition with deep recurrent neural networks," by Alex Graves, Abdel-rahman Mohamed, and Geoffrey Hinton, introduces the importance of deep recurrent neural networks (RNNs) in speech recognition.

    [9] https://arxiv.org/pdf/1303.5778.pdf

    Voice input is familiar to all of us, but how is it implemented? "Towards End-To-End Speech Recognition with Recurrent Neural Networks," co-authored by Alex Graves and the University of Toronto's Navdeep Jaitly, describes an audio-to-text recognition system that requires no intermediate phonetic representation.

    [10] http://www.jmlr.org/proceedings/papers/v32/graves14.pdf

    If you ask what underlies Google's speech recognition system, I would point you to "Fast and accurate recurrent neural network acoustic models for speech recognition," written by Haşim Sak and several other experts; it is one of the important theoretical foundations of Google's system.

    [11] https://arxiv.org/pdf/1507.06947.pdf

    Baidu recently announced a new result from its Silicon Valley AI Lab (SVAIL), dubbed Deep Speech 2. Deep Speech 2 achieves accurate recognition of both English and Mandarin with a single learning algorithm. The work is published in "Deep speech 2: End-to-end speech recognition in english and mandarin."

    [12] https://arxiv.org/pdf/1512.02595.pdf

    On the 18th of this month, researchers and engineers from Microsoft's AI and Research division published "Achieving Human Parity in Conversational Speech Recognition." The paper reports that Microsoft's conversational speech recognition reached a word error rate (WER) as low as 5.9% on the industry-standard Switchboard benchmark, matching professional human transcribers for the first time and outperforming most people, while also beating Microsoft's own 6.3% record set only a month earlier. Microsoft's chief speech scientist Xuedong Huang was one of the participants in this research.

    [13] https://arxiv.org/pdf/1610.05256v1.pdf

    Having read the recommendations above, you should now have a basic grasp of the history of deep learning, its core model architectures (CNN/RNN/LSTM), and how deep learning is applied to image and speech recognition. In the next part, a new batch of papers will give you a clear view of deep learning methods and their applications across different fields. Since the second part branches into more specialized topics, feel free to pick according to your own research direction.

     

    1. Deep Learning Models

    "Improving neural networks by preventing co-adaptation of feature detectors," co-authored by Geoffrey Hinton and other experts, is also instructive. The paper proposes that when training a neural network on relatively few samples, Dropout can serve as a trick to prevent the model from overfitting.

    [1] https://arxiv.org/pdf/1207.0580.pdf

    On Dropout, Nitish Srivastava and other experts also co-authored "Dropout: a simple way to prevent neural networks from overfitting." The paper argues that deep neural networks with large numbers of parameters are extremely powerful machine learning systems, but overfitting remains a hard problem in such systems, and Dropout is a simple technical shortcut for addressing it.

    [2] http://www.jmlr.org/papers/volume15/srivastava14a.old/source/srivastava14a.pdf
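    As a concrete illustration (not taken from the papers themselves), here is a minimal sketch of "inverted" dropout applied to one layer's activations; the drop rate p and the toy activation values are assumptions for illustration only.

```python
import random

def dropout(activations, p=0.5, train=True, seed=None):
    """Inverted dropout: during training, zero each unit with probability p
    and scale the survivors by 1/(1-p) so the expected activation is
    unchanged; at test time the layer is left untouched."""
    if not train or p == 0.0:
        return list(activations)
    rng = random.Random(seed)
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

acts = [0.8, -1.2, 0.5, 2.0]
dropped = dropout(acts, p=0.5, seed=0)   # some units zeroed, rest doubled
assert dropout(acts, train=False) == acts  # inference path is a no-op
```

The scaling at train time (rather than at test time, as in the original paper) is a common equivalent formulation that keeps the inference path untouched.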

    Training a deep neural network is a delicate business: changing the parameters of any one layer shifts the inputs of every layer after it, and this makes training inefficient. In "Batch normalization: Accelerating deep network training by reducing internal covariate shift," Sergey Ioffe and Christian Szegedy focus on the key to this problem: internal covariate shift.

    [3] https://arxiv.org/pdf/1502.03167.pdf
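    For intuition, a minimal sketch of the batch-norm transform for a single feature follows (my own illustration, not the paper's code); gamma, beta, and the sample batch are placeholder values, and the running statistics used at inference time are omitted.

```python
import math

def batch_norm(xs, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature over a mini-batch to zero mean and unit
    variance, then scale and shift with the learnable parameters
    gamma and beta."""
    m = sum(xs) / len(xs)
    var = sum((x - m) ** 2 for x in xs) / len(xs)
    return [gamma * (x - m) / math.sqrt(var + eps) + beta for x in xs]

ys = batch_norm([1.0, 2.0, 3.0, 4.0])
assert abs(sum(ys)) < 1e-6  # normalized batch has (near) zero mean
```

In a real network this is applied per channel, and running averages of the mean and variance are kept for use at inference time.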

    Training deep neural networks is computationally demanding. One way to shorten training time is to normalize the activities of the neurons, and the recently introduced batch normalization technique was a breakthrough in this direction. A related variant that normalizes across the units within a layer, rather than across the batch, is described in the multi-author paper "Layer normalization."

    [4] https://arxiv.org/pdf/1607.06450.pdf

    "Binarized Neural Networks: Training Neural Networks with Weights and Activations Constrained to +1 or −1" came out just this February. Its main idea is to binarize the weights and activations in order to speed up the network and reduce its memory footprint. Since a binary network only binarizes the parameters and activation values without changing the network structure, the questions to focus on are how to binarize and how to update the parameters after binarization.

    [5] https://pdfs.semanticscholar.org/f832/b16cb367802609d91d400085eb87d630212a.pdf
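    A minimal sketch of the deterministic sign binarization may help; this is my own illustration, and it omits the crucial part of the paper's training scheme in which full-precision weights are retained for the gradient updates (with a straight-through estimator), while the binarized copy is used in the forward pass.

```python
def binarize(weights):
    """Deterministic binarization: sign(w), mapping 0 to +1 by convention.
    The binarized copy is what a forward pass would use; the real-valued
    weights would still be kept for the parameter updates."""
    return [1.0 if w >= 0 else -1.0 for w in weights]

real_w = [0.37, -0.02, 0.0, -1.4]
assert binarize(real_w) == [1.0, -1.0, 1.0, -1.0]
```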

    "Decoupled neural interfaces using synthetic gradients" is a very interesting neural network paper from Google DeepMind; it uses synthetic gradients to decouple the dependencies in backprop. Five stars.

    [6] https://arxiv.org/pdf/1608.05343.pdf

     

    2. Deep Learning Optimization

    "On the importance of initialization and momentum in deep learning" examines the role of initialization and momentum techniques in deep learning, with an emphasis on experimental analysis.

    [7] http://www.jmlr.org/proceedings/papers/v28/sutskever13.pdf

    Adam is a gradient-based optimization method in the same family as SGD. For the details, see "Adam: A method for stochastic optimization."

    [8] https://arxiv.org/pdf/1412.6980.pdf
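    As a sketch of the update rule described in that paper (my own minimal scalar version, with illustrative hyperparameters), Adam keeps exponential moving averages of the gradient and squared gradient, corrects their initialization bias, and scales the step accordingly:

```python
import math

def adam_minimize(grad, x0, lr=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=500):
    """Minimal Adam loop for one scalar parameter: m and v are the moving
    averages of the gradient and squared gradient; m_hat and v_hat are
    their bias-corrected versions."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

# Minimize f(x) = (x - 3)^2, whose gradient is 2(x - 3).
x_min = adam_minimize(lambda x: 2 * (x - 3.0), x0=0.0)
assert abs(x_min - 3.0) < 0.2
```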

    "Learning to learn by gradient descent by gradient descent," written by Marcin Andrychowicz and other experts, uses an LSTM to learn an update strategy for neural networks: a gradient-descent-based optimizer is itself learned, and is then used to optimize the parameters of other networks. Highly instructive; five stars.

    [9] https://arxiv.org/pdf/1606.04474.pdf

    Song Han, Huizi Mao, and other researchers at Stanford have written a series of papers on network compression; "Deep compression: Compressing deep neural network with pruning, trained quantization and huffman coding" is one of them. The title already summarizes the paper's three main techniques very clearly. It also won the ICLR 2016 Best Paper award; five stars.

    [10] https://pdfs.semanticscholar.org/5b6c/9dda1d88095fa4aac1507348e498a1f2e863.pdf
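    The first of those three stages, magnitude pruning, can be sketched in a few lines (my own illustration; the threshold value is arbitrary, and the paper's subsequent retraining, quantization, and Huffman-coding stages are not shown):

```python
def prune(weights, threshold):
    """Magnitude pruning: zero out connections whose absolute weight falls
    below the threshold; in deep compression the surviving weights are
    then retrained to recover accuracy."""
    return [w if abs(w) >= threshold else 0.0 for w in weights]

w = [0.9, -0.03, 0.4, 0.01, -0.7]
assert prune(w, 0.1) == [0.9, 0.0, 0.4, 0.0, -0.7]
```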

    "SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size," by Forrest N. Iandola and other experts, opens by listing three advantages of smaller deep networks at equal accuracy. It then introduces its innovation, SqueezeNet, a network that approaches AlexNet's classification accuracy with a model up to 510x smaller, and distills design principles for shrinking model size.

    [11] https://arxiv.org/pdf/1602.07360.pdf

     

    3. Unsupervised Learning / Deep Generative Models

    "Building high-level features using large scale unsupervised learning" describes the principles of feature learning in Google Brain: detectors for human and cat faces are learned from unlabeled images. The authors built a 9-layer locally connected sparse autoencoder on big data and trained it for 3 days on 1,000 machines (16,000 cores) using model parallelism and asynchronous SGD. The experiments show that a face detector can be trained without ever labeling whether an image contains a face.

    [12] https://arxiv.org/pdf/1112.6209.pdf

    Diederik P. Kingma and Max Welling co-authored "Auto-encoding variational bayes," which combines variational Bayesian methods with neural networks; the method can be used to build autoencoders that act as generative models.

    [13] https://arxiv.org/pdf/1312.6114.pdf

    "Generative adversarial nets" is Ian Goodfellow's 2014 paper introducing adversarial networks, promoted in many tutorials as a flagship of unsupervised deep learning. It addresses a famous problem in unsupervised learning: given a batch of samples, train a system that can generate similar new samples. Five stars.

    [14] http://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf
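    The adversarial game described above can be summarized by the paper's minimax objective, in which the discriminator D tries to tell real samples from generated ones while the generator G tries to fool it:

```latex
\min_G \max_D \;
\mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big]
+ \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
```

At the game's equilibrium the generator's distribution matches the data distribution and D outputs 1/2 everywhere.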

    "Unsupervised representation learning with deep convolutional generative adversarial networks" takes the adversarial model prototype proposed in the GAN paper and gives a convolutional-network implementation. It also documents implementation details, such as parameter settings, and discusses measures for stabilizing GAN training, though the instability is not completely solved. The paper further mentions using adversarial networks for semi-supervised learning: after training, the discriminator can serve as a feature extractor, with the outputs of its convolutional layers used as features X and the labels of labeled training images as Y.

    [15] https://arxiv.org/pdf/1511.06434.pdf

    "DRAW: A recurrent neural network for image generation," from Google, describes how the Deep Recurrent Attentive Writer (DRAW) neural network framework generates images automatically. Five stars.

    [16] http://jmlr.org/proceedings/papers/v37/gregor15.pdf

    "Pixel recurrent neural networks" is a Google ICML award-winning paper; it explains how pixel recurrent neural networks model images down to the last pixel. The authors build a general model of natural images with deep recurrent networks and significantly improve its efficiency, and they propose two novel two-dimensional LSTM layers, Row LSTM and Diagonal BiLSTM, which extend readily to other data.

    [17] https://arxiv.org/pdf/1601.06759.pdf

    "Conditional Image Generation with PixelCNN Decoders" comes from Google's DeepMind team. They study a model based on the PixelCNN (pixel convolutional neural network) architecture that generates new images conditioned on a varying input. Fed ImageNet class labels, the model produces diverse, realistic photos of scenes such as animals and landscapes; fed an unseen face embedding produced by another convolutional network, it can generate photos of the same person with different expressions and poses.

    [18] https://arxiv.org/pdf/1606.05328.pdf

     

    4. Recurrent Neural Networks / Sequence-to-Sequence Models

    "Generating sequences with recurrent neural networks," by Alex Graves, explains how recurrent neural networks can generate handwriting.

    [19] https://arxiv.org/pdf/1308.0850.pdf

    "Learning phrase representations using RNN encoder-decoder for statistical machine translation" tackles English-to-French translation with an encoder-decoder model: the encoder RNN compresses the input sequence into a vector, and the decoder expands that vector into the output sequence. The encoder-decoder setup lets the model use the order of the words, and in the learned vector space semantically similar words visibly cluster together.

    [20] https://arxiv.org/pdf/1406.1078.pdf

    "Sequence to sequence learning with neural networks," by I. Sutskever and colleagues at Google, proposes a sequence-to-sequence learning method whose most direct application is machine translation.

    [21] http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf

    The attention mechanism was first proposed in computer vision. Bahdanau et al., in "Neural Machine Translation by Jointly Learning to Align and Translate," then used an attention-like mechanism to perform translation and alignment jointly in machine translation; they are generally credited as the first team to bring the attention mechanism into NLP.

    [22] https://arxiv.org/pdf/1409.0473v7.pdf
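    The core of the mechanism can be sketched in a few lines. Note this is my own illustration using the simpler dot-product scoring variant, not the additive scoring function of Bahdanau et al.; the query, keys, and values are toy two-dimensional vectors.

```python
import math

def attention(query, keys, values):
    """Dot-product attention: score each key against the query, softmax
    the scores into weights, and return the weighted sum of the values
    as the context vector."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    mx = max(scores)                      # subtract max for stability
    exps = [math.exp(s - mx) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    context = [sum(w * v[i] for w, v in zip(weights, values))
               for i in range(len(values[0]))]
    return context, weights

ctx, w = attention([1.0, 0.0],
                   keys=[[1.0, 0.0], [0.0, 1.0]],
                   values=[[10.0, 0.0], [0.0, 10.0]])
assert abs(sum(w) - 1.0) < 1e-9 and w[0] > w[1]  # favors the matching key
```

In the translation setting, the query is the decoder state, and the keys and values are the encoder states; the weights then act as a soft alignment between output and input words.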

    "A Neural Conversational Model" was the first paper to build a dialogue model within the sequence-to-sequence framework. Even though the model it uses is not complex and the network is not deep, the results are quite impressive.

    [23] https://arxiv.org/pdf/1506.05869.pdf

     

    5. Neural Turing Machines

    "Neural turing machines" introduces the Neural Turing Machine, a neural network architecture inspired both by biologically plausible memory and by digital computers. Like a conventional neural network, the architecture is differentiable end to end and can be trained by gradient descent. The authors' experiments show that it can learn simple algorithms from example data and generalize those algorithms well beyond the training samples themselves. An unqualified five-star recommendation.

    [24] https://arxiv.org/pdf/1410.5401.pdf

    The Neural Turing Machine is one of the important research directions in deep learning today. "Reinforcement learning neural Turing machines" trains the network with reinforcement learning algorithms, which makes the machine's interface far more expressive.

    [25] https://pdfs.semanticscholar.org/f10e/071292d593fef939e6ef4a59baf0bb3a6c2b.pdf

    "Memory networks" was written by four researchers. The Memory Network it describes is really a general framework in which the input, memory-update, output, and response mappings are all replaceable components.

    [26] https://arxiv.org/pdf/1410.3916.pdf

    "End-to-end memory networks" solves, at the algorithmic level, the problem of training memory networks end to end; on the application side it tackles question answering and language modeling.

    [27] http://papers.nips.cc/paper/5846-end-to-end-memory-networks.pdf

    "Pointer networks" proposes a new network architecture that learns to map an input sequence to an output sequence. Unlike previous work, both input and output lengths are variable, and the output length depends on the input.

    [28] http://papers.nips.cc/paper/5866-pointer-networks.pdf

    "Hybrid computing using a neural network with dynamic external memory" is Google DeepMind's paper first published in Nature. It introduces a memory-augmented neural network called the differentiable neural computer, and the research shows that it can learn to use its memory to answer questions about complex structured data, including synthetic stories, family trees, and even a map of the London Underground. It can also solve a puzzle task using reinforcement learning. Five stars.

    [29] https://www.dropbox.com/s/0a40xi702grx3dq/2016-graves.pdf

     

    6. Deep Reinforcement Learning

    At last, we arrive at deep reinforcement learning, and one can hardly mention the term without citing the first paper to use it. Mnih's "Playing atari with deep reinforcement learning" combines a convolutional neural network with Q-learning and uses a single network to play seven Atari 2600 games (Breakout among them) that require only short-term memory. The results show that the algorithm needs no hand-engineered features and can generate unlimited samples for training.

    [30] http://arxiv.org/pdf/1312.5602.pdf
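    For intuition about the Q-learning half of that combination, here is a minimal tabular sketch of one update toward the bootstrapped target (my own illustration with arbitrary toy values; in the paper the table is replaced by a convolutional network trained on the same target, with experience replay):

```python
def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """One Q-learning step: move Q(s, a) a fraction alpha toward the
    target r + gamma * max_a' Q(s', a')."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])

# Two states, two actions, all Q-values initialized to zero.
q = [[0.0, 0.0], [0.0, 0.0]]
q_update(q, state=0, action=1, reward=1.0, next_state=1)
assert abs(q[0][1] - 0.1) < 1e-9  # moved 10% of the way toward target 1.0
```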

    As for the milestone of deep reinforcement learning, that is the same author's "Human-level control through deep reinforcement learning." It introduces the DQN, the deep Q-network, which lets an artificial neural network work directly on raw sensory input and successfully realizes an end-to-end reinforcement learning algorithm that learns successful policies straight from high-dimensional sensory inputs.

    [31] http://www.davidqiu.com:8888/research/nature14236.pdf

    The next paper, "Dueling network architectures for deep reinforcement learning," proposes a new network, the dueling architecture, which separates a state-value function from a state-dependent action-advantage function. This architecture leads to better policy evaluation in the presence of many similar-valued actions. The paper won the ICML 2016 Best Paper award.

    [32] http://arxiv.org/pdf/1511.06581

    "Asynchronous methods for deep reinforcement learning," from DeepMind, mainly improves performance on Atari 2600 games and is regarded as the classic example of asynchronous updates from samples collected by multiple parallel workers.

    [33] http://arxiv.org/pdf/1602.01783

    Compared with traditional planning methods, the DQL approach in "Continuous control with deep reinforcement learning" extends to continuous action domains and robustly solves 20 simulated physics tasks. It uses an actor-critic algorithm, dubbed DDPG, based on the Deterministic Policy Gradient (DPG) of ICML 2014.

    [34] http://arxiv.org/pdf/1509.02971

    "Continuous Deep Q-Learning with Model-based Acceleration" uses an advantage function for reinforcement learning, focusing on continuous action spaces. As the title suggests, to speed up the machine's acquisition of experience the study also adds locally linear models fitted in a Kalman-filter-like fashion. The experiments show this method outperforming the DDPG of the previous paper.

    [35] http://arxiv.org/pdf/1603.00748

    Schulman's "Trust region policy optimization" is a breakthrough for computers playing games: the TRPO algorithm produces results that rival DeepMind's, demonstrating a generalized learning capability. Beyond teaching robots to walk, we can make them skilled game players.

    [36] http://www.jmlr.org/proceedings/papers/v37/schulman15.pdf

    Next comes the algorithm behind the famous AlphaGo. In "Mastering the game of Go with deep neural networks and tree search," Google uses a 13-layer policy network and Monte Carlo tree search to teach a computer to play Go. Five stars, naturally; argue with me if you dare.

    [37]  http://willamette.edu/~levenick/cs448/goNature.pdf

     

    7. Unsupervised Feature Learning

    "Deep Learning of Representations for Unsupervised and Transfer Learning" can fairly be called the foundational work of unsupervised feature learning.

    [38] http://www.jmlr.org/proceedings/papers/v27/bengio12a/bengio12a.pdf

    The next paper, "Lifelong Machine Learning Systems: Beyond Learning Algorithms," asks whether a machine learning system with lifelong-learning capability can use knowledge from previously solved problems to help solve the new problems it encounters; in other words, whether it can generalize from one case to another. The work was first presented at the 2013 AAAI Spring Symposium.

    [39] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.696.7800&rep=rep1&type=pdf

    The godfather of AI returns, this time in collaboration with Jeff Dean, with "Distilling the knowledge in a neural network," on compressing neural networks. The core innovation seems modest, so four stars.

    [40] http://arxiv.org/pdf/1503.02531

    "Policy distillation" was written by Google's Andrei Alexandru Rusu; a companion piece is Parisotto's "Actor-mimic: Deep multitask and transfer reinforcement learning." Both address problems in the RL domain.

    [41] http://arxiv.org/pdf/1511.06295

    [42] http://arxiv.org/pdf/1511.06342

    Here is another paper by Rusu, "Progressive neural networks." It proposes an algorithm of the same name: a machine is trained in a simulated environment, and the knowledge is then transferred to the real environment. This should substantially speed up robot learning.

    [43] https://arxiv.org/pdf/1606.04671

     

    8. One Step Away

    The following five papers are not aimed squarely at deep learning, but some of the basic ideas they contain are worth borrowing.

    "Human-level concept learning through probabilistic program induction," five stars, introduces the Bayesian Program Learning (BPL) framework for learning and processing new concepts from simple examples, with humans as the learning subjects.

    [44] http://clm.utexas.edu/compjclub/wp-content/uploads/2016/02/lake2015.pdf

    Reading Koch's "Siamese Neural Networks for One-shot Image Recognition" alongside "One-shot Learning with Memory-Augmented Neural Networks" is also well worthwhile.

    [45] http://www.cs.utoronto.ca/~gkoch/files/msc-thesis.pdf

    [46] http://arxiv.org/pdf/1605.06065

    "Low-shot visual object recognition," a step toward recognition at large-data scale, is a necessary stage on the road to practical image recognition.

    [47] http://arxiv.org/pdf/1606.02819


    Deep-Learning-Papers-Reading-Roadmap (a reading roadmap for deep learning papers)

    Deep Learning Fundamentals and History

    1.0 Books

    • The deep learning bible: Bengio, Yoshua, Ian J. Goodfellow, and Aaron Courville. “Deep learning.” An MIT Press book. (2015)

    1.1 Surveys

    • The three giants' survey: LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. “Deep learning.” Nature 521.7553 (2015)

    1.2 Deep Belief Networks (DBN)

    • A milestone on the eve of deep learning: Hinton, Geoffrey E., Simon Osindero, and Yee-Whye Teh. “A fast learning algorithm for deep belief nets.” Neural computation 18.7 (2006)
    • A milestone showing the promise of deep learning: Hinton, Geoffrey E., and Ruslan R. Salakhutdinov. “Reducing the dimensionality of data with neural networks.” Science 313.5786 (2006)

    1.3 The ImageNet Revolution (the deep learning big bang)

    • AlexNet's deep learning breakthrough: Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. “Imagenet classification with deep convolutional neural networks.” Advances in neural information processing systems. 2012.
    • VGGNet, the arrival of very deep networks: Simonyan, Karen, and Andrew Zisserman. “Very deep convolutional networks for large-scale image recognition.” arXiv preprint arXiv:1409.1556 (2014).
    • GoogLeNet: Szegedy, Christian, et al. “Going deeper with convolutions.” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.
    • ResNet, extremely deep networks, CVPR best paper: He, Kaiming, et al. “Deep residual learning for image recognition.” arXiv preprint arXiv:1512.03385 (2015).

    1.4 The Speech Recognition Revolution

    • The speech recognition breakthrough: Hinton, Geoffrey, et al. “Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups.” IEEE Signal Processing Magazine 29.6 (2012): 82-97.
    • RNN paper: Graves, Alex, Abdel-rahman Mohamed, and Geoffrey Hinton. “Speech recognition with deep recurrent neural networks.” 2013 IEEE international conference on acoustics, speech and signal processing. IEEE, 2013.
    • End-to-end RNN speech recognition: Graves, Alex, and Navdeep Jaitly. “Towards End-To-End Speech Recognition with Recurrent Neural Networks.” ICML. Vol. 14. 2014.
    • Google's speech recognition system: Sak, Haşim, et al. “Fast and accurate recurrent neural network acoustic models for speech recognition.” arXiv preprint arXiv:1507.06947 (2015).
    • Baidu's speech recognition system: Amodei, Dario, et al. “Deep speech 2: End-to-end speech recognition in english and mandarin.” arXiv preprint arXiv:1512.02595 (2015).
    • State-of-the-art speech recognition from Microsoft: W. Xiong, J. Droppo, X. Huang, F. Seide, M. Seltzer, A. Stolcke, D. Yu, G. Zweig. “Achieving Human Parity in Conversational Speech Recognition.” arXiv preprint arXiv:1610.05256 (2016).

    Deep Learning Methods

    2.1 Models

    • Dropout: Hinton, Geoffrey E., et al. “Improving neural networks by preventing co-adaptation of feature detectors.” arXiv preprint arXiv:1207.0580 (2012).
    • Overfitting: Srivastava, Nitish, et al. “Dropout: a simple way to prevent neural networks from overfitting.” Journal of Machine Learning Research 15.1 (2014): 1929-1958.
    • Batch normalization, an outstanding result of 2015: Ioffe, Sergey, and Christian Szegedy. “Batch normalization: Accelerating deep network training by reducing internal covariate shift.” arXiv preprint arXiv:1502.03167 (2015).
    • An upgrade of batch normalization: Ba, Jimmy Lei, Jamie Ryan Kiros, and Geoffrey E. Hinton. “Layer normalization.” arXiv preprint arXiv:1607.06450 (2016).
    • Fast training of new models: Courbariaux, Matthieu, et al. “Binarized Neural Networks: Training Neural Networks with Weights and Activations Constrained to +1 or −1.”
    • Innovation in training methods: Jaderberg, Max, et al. “Decoupled neural interfaces using synthetic gradients.” arXiv preprint arXiv:1608.05343 (2016).
    • Modifying pretrained networks to reduce training time: Chen, Tianqi, Ian Goodfellow, and Jonathon Shlens. “Net2net: Accelerating learning via knowledge transfer.” arXiv preprint arXiv:1511.05641 (2015).
    • Modifying pretrained networks to reduce training time: Wei, Tao, et al. “Network Morphism.” arXiv preprint arXiv:1603.01670 (2016).

    2.2 Optimization

    • Momentum optimizer: Sutskever, Ilya, et al. “On the importance of initialization and momentum in deep learning.” ICML (3) 28 (2013): 1139-1147.
    • Possibly the most widely used stochastic optimizer: Kingma, Diederik, and Jimmy Ba. “Adam: A method for stochastic optimization.” arXiv preprint arXiv:1412.6980 (2014).
    • Neural optimizer: Andrychowicz, Marcin, et al. “Learning to learn by gradient descent by gradient descent.” arXiv preprint arXiv:1606.04474 (2016).
    • ICLR best paper; a new direction for making networks run faster: Han, Song, Huizi Mao, and William J. Dally. “Deep compression: Compressing deep neural network with pruning, trained quantization and huffman coding.” CoRR, abs/1510.00149 2 (2015).
    • Another new direction for optimizing networks: Iandola, Forrest N., et al. “SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size.” arXiv preprint arXiv:1602.07360 (2016).

    2.3 Unsupervised Learning / Deep Generative Models

    • Google Brain's milestone cat-finding paper, with Andrew Ng: Le, Quoc V. “Building high-level features using large scale unsupervised learning.” 2013 IEEE international conference on acoustics, speech and signal processing. IEEE, 2013.
    • Variational autoencoder (VAE): Kingma, Diederik P., and Max Welling. “Auto-encoding variational bayes.” arXiv preprint arXiv:1312.6114 (2013).
    • Generative adversarial network (GAN): Goodfellow, Ian, et al. “Generative adversarial nets.” Advances in Neural Information Processing Systems. 2014.
    • Deep convolutional GAN (DCGAN): Radford, Alec, Luke Metz, and Soumith Chintala. “Unsupervised representation learning with deep convolutional generative adversarial networks.” arXiv preprint arXiv:1511.06434 (2015).
    • Variational autoencoder with an attention mechanism: Gregor, Karol, et al. “DRAW: A recurrent neural network for image generation.” arXiv preprint arXiv:1502.04623 (2015).
    • PixelRNN: Oord, Aaron van den, Nal Kalchbrenner, and Koray Kavukcuoglu. “Pixel recurrent neural networks.” arXiv preprint arXiv:1601.06759 (2016).
    • PixelCNN: Oord, Aaron van den, et al. “Conditional image generation with PixelCNN decoders.” arXiv preprint arXiv:1606.05328 (2016).

    2.4 RNN / Sequence-to-Sequence Models

    • Generative sequences with RNNs, LSTM: Graves, Alex. “Generating sequences with recurrent neural networks.” arXiv preprint arXiv:1308.0850 (2013).
    • The first sequence-to-sequence paper: Cho, Kyunghyun, et al. “Learning phrase representations using RNN encoder-decoder for statistical machine translation.” arXiv preprint arXiv:1406.1078 (2014).
    • Neural machine translation: Bahdanau, Dzmitry, KyungHyun Cho, and Yoshua Bengio. “Neural Machine Translation by Jointly Learning to Align and Translate.” arXiv preprint arXiv:1409.0473 (2014).
    • Sequence-to-sequence chatbot: Vinyals, Oriol, and Quoc Le. “A neural conversational model.” arXiv preprint arXiv:1506.05869 (2015).

    2.5 Neural Turing Machines

    • A basic prototype of future computers: Graves, Alex, Greg Wayne, and Ivo Danihelka. “Neural turing machines.” arXiv preprint arXiv:1410.5401 (2014).
    • Reinforcement-learning Neural Turing Machine: Zaremba, Wojciech, and Ilya Sutskever. “Reinforcement learning neural Turing machines.” arXiv preprint arXiv:1505.00521 362 (2015).
    • Memory networks: Weston, Jason, Sumit Chopra, and Antoine Bordes. “Memory networks.” arXiv preprint arXiv:1410.3916 (2014).
    • End-to-end memory networks: Sukhbaatar, Sainbayar, Jason Weston, and Rob Fergus. “End-to-end memory networks.” Advances in neural information processing systems. 2015.
    • Pointer networks: Vinyals, Oriol, Meire Fortunato, and Navdeep Jaitly. “Pointer networks.” Advances in Neural Information Processing Systems. 2015.

    2.6 Deep Reinforcement Learning

    • The first paper named deep reinforcement learning: Mnih, Volodymyr, et al. “Playing atari with deep reinforcement learning.” arXiv preprint arXiv:1312.5602 (2013).
    • The milestone, from DeepMind: Mnih, Volodymyr, et al. “Human-level control through deep reinforcement learning.” Nature 518.7540 (2015): 529-533.
    • ICML best paper: Wang, Ziyu, Nando de Freitas, and Marc Lanctot. “Dueling network architectures for deep reinforcement learning.” arXiv preprint arXiv:1511.06581 (2015).
    • State-of-the-art deep reinforcement learning method: Mnih, Volodymyr, et al. “Asynchronous methods for deep reinforcement learning.” arXiv preprint arXiv:1602.01783 (2016).
    • DDPG: Lillicrap, Timothy P., et al. “Continuous control with deep reinforcement learning.” arXiv preprint arXiv:1509.02971 (2015).
    • NAF: Gu, Shixiang, et al. “Continuous Deep Q-Learning with Model-based Acceleration.” arXiv preprint arXiv:1603.00748 (2016).
    • TRPO: Schulman, John, et al. “Trust region policy optimization.” CoRR, abs/1502.05477 (2015).
    • AlphaGo: Silver, David, et al. “Mastering the game of Go with deep neural networks and tree search.” Nature 529.7587 (2016): 484-489.

    2.7 Deep Transfer Learning / Lifelong Learning / Reinforcement Learning

    • Bengio's tutorial: Bengio, Yoshua. “Deep Learning of Representations for Unsupervised and Transfer Learning.” ICML Unsupervised and Transfer Learning 27 (2012): 17-36.
    • A brief discussion of lifelong learning: Silver, Daniel L., Qiang Yang, and Lianghao Li. “Lifelong Machine Learning Systems: Beyond Learning Algorithms.” AAAI Spring Symposium: Lifelong Machine Learning. 2013.
    • Work by Hinton and Jeff Dean: Hinton, Geoffrey, Oriol Vinyals, and Jeff Dean. “Distilling the knowledge in a neural network.” arXiv preprint arXiv:1503.02531 (2015).
    • Reinforcement learning policy distillation: Rusu, Andrei A., et al. “Policy distillation.” arXiv preprint arXiv:1511.06295 (2015).
    • Multitask deep transfer reinforcement learning: Parisotto, Emilio, Jimmy Lei Ba, and Ruslan Salakhutdinov. “Actor-mimic: Deep multitask and transfer reinforcement learning.” arXiv preprint arXiv:1511.06342 (2015).
    • Progressive neural networks: Rusu, Andrei A., et al. “Progressive neural networks.” arXiv preprint arXiv:1606.04671 (2016).

    2.8 One-Shot Deep Learning

    • Not deep learning, but worth reading: Lake, Brenden M., Ruslan Salakhutdinov, and Joshua B. Tenenbaum. “Human-level concept learning through probabilistic program induction.” Science 350.6266 (2015): 1332-1338.
    • One-shot image recognition (no link): Koch, Gregory, Richard Zemel, and Ruslan Salakhutdinov. “Siamese Neural Networks for One-shot Image Recognition.” (2015).
    • Foundations of one-shot learning (no link): Santoro, Adam, et al. “One-shot Learning with Memory-Augmented Neural Networks.” arXiv preprint arXiv:1605.06065 (2016).
    • One-shot learning networks: Vinyals, Oriol, et al. “Matching Networks for One Shot Learning.” arXiv preprint arXiv:1606.04080 (2016).
    • Toward large-scale data (no link): Hariharan, Bharath, and Ross Girshick. “Low-shot visual object recognition.” arXiv preprint arXiv:1606.02819 (2016).

    Applications

    3.1 Natural Language Processing (NLP)

    • Antoine Bordes, et al. “Joint Learning of Words and Meaning Representations for Open-Text Semantic Parsing.” AISTATS(2012)
    • word2vec Mikolov, et al. “Distributed representations of words and phrases and their compositionality.” ANIPS(2013): 3111-3119
    • Sutskever, et al. “Sequence to sequence learning with neural networks.” ANIPS(2014)
      http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf
    • Ankit Kumar, et al. “Ask Me Anything: Dynamic Memory Networks for Natural Language Processing.” arXiv preprint arXiv:1506.07285(2015)
    • Yoon Kim, et al. “Character-Aware Neural Language Models.” NIPS(2015) arXiv preprint arXiv:1508.06615(2015)
      https://arxiv.org/abs/1508.06615
    • The bAbI tasks: Jason Weston, et al. “Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks.” arXiv preprint arXiv:1502.05698 (2015)
    • CNN/DailyMail cloze-style comparison: Karl Moritz Hermann, et al. “Teaching Machines to Read and Comprehend.” arXiv preprint arXiv:1506.03340 (2015)
    • State-of-the-art text classification: Alexis Conneau, et al. “Very Deep Convolutional Networks for Natural Language Processing.” arXiv preprint arXiv:1606.01781 (2016)
    • Slightly below state of the art, but much faster: Armand Joulin, et al. “Bag of Tricks for Efficient Text Classification.” arXiv preprint arXiv:1607.01759 (2016)

    3.2 Object Detection

    • Szegedy, Christian, Alexander Toshev, and Dumitru Erhan. “Deep neural networks for object detection.” Advances in Neural Information Processing Systems. 2013.
    • RCNN:Girshick, Ross, et al. “Rich feature hierarchies for accurate object detection and semantic segmentation.” Proceedings of the IEEE conference on computer vision and pattern recognition. 2014.
    • SPPNet (no link): He, Kaiming, et al. “Spatial pyramid pooling in deep convolutional networks for visual recognition.” European Conference on Computer Vision. Springer International Publishing, 2014.
    • Girshick, Ross. “Fast r-cnn.” Proceedings of the IEEE International Conference on Computer Vision. 2015.
    • The highly practical YOLO project: Redmon, Joseph, et al. “You only look once: Unified, real-time object detection.” arXiv preprint arXiv:1506.02640 (2015).
    • (no link) Liu, Wei, et al. “SSD: Single Shot MultiBox Detector.” arXiv preprint arXiv:1512.02325 (2015).
    • (no link) Dai, Jifeng, et al. “R-FCN: Object Detection via Region-based Fully Convolutional Networks.” arXiv preprint arXiv:1605.06409 (2016).
    • (no link) He, Gkioxari, et al. “Mask R-CNN.” arXiv preprint arXiv:1703.06870 (2017).

    3.3 Visual Tracking

    • The first visual tracking paper to use deep learning, the DLT tracker: Wang, Naiyan, and Dit-Yan Yeung. “Learning a deep compact image representation for visual tracking.” Advances in neural information processing systems. 2013.
    • SO-DLT (no link): Wang, Naiyan, et al. “Transferring rich feature hierarchies for robust visual tracking.” arXiv preprint arXiv:1501.04587 (2015).
    • FCNT: Wang, Lijun, et al. “Visual tracking with fully convolutional networks.” Proceedings of the IEEE International Conference on Computer Vision. 2015.
    • GOTURN, a deep-learning tracker as fast as non-deep-learning methods (no link): Held, David, Sebastian Thrun, and Silvio Savarese. “Learning to Track at 100 FPS with Deep Regression Networks.” arXiv preprint arXiv:1604.01802 (2016).
    • SiameseFC, the new state of the art in real-time object tracking (no link): Bertinetto, Luca, et al. “Fully-Convolutional Siamese Networks for Object Tracking.” arXiv preprint arXiv:1606.09549 (2016).
    • C-COT: Martin Danelljan, Andreas Robinson, Fahad Khan, Michael Felsberg. “Beyond Correlation Filters: Learning Continuous Convolution Operators for Visual Tracking.” ECCV (2016)
    • TCNN, winner of the VOT2016 challenge (no link): Nam, Hyeonseob, Mooyeol Baek, and Bohyung Han. “Modeling and Propagating CNNs in a Tree Structure for Visual Tracking.” arXiv preprint arXiv:1608.07242 (2016).

    3.4 Image Captioning

    • Farhadi, Ali, et al. “Every picture tells a story: Generating sentences from images”. In Computer Vision–ECCV 2010. Springer Berlin Heidelberg: 15-29, 2010.
    • Kulkarni, Girish, et al. “Baby talk: Understanding and generating image descriptions”. In Proceedings of the 24th CVPR, 2011.
    • (no link) Vinyals, Oriol, et al. “Show and tell: A neural image caption generator”. In arXiv preprint arXiv:1411.4555, 2014.
    • RNN visual recognition and captioning (no link): Donahue, Jeff, et al. “Long-term recurrent convolutional networks for visual recognition and description”. In arXiv preprint arXiv:1411.4389, 2014.
    • Fei-Fei Li and her student Andrej Karpathy: Karpathy, Andrej, and Li Fei-Fei. “Deep visual-semantic alignments for generating image descriptions”. In arXiv preprint arXiv:1412.2306, 2014.
    • Fei-Fei Li and her student Andrej Karpathy (no link): Karpathy, Andrej, Armand Joulin, and Fei Fei F. Li. “Deep fragment embeddings for bidirectional image sentence mapping”. In Advances in neural information processing systems, 2014.
    • (no link) Fang, Hao, et al. “From captions to visual concepts and back”. In arXiv preprint arXiv:1411.4952, 2014.
    • (no link) Chen, Xinlei, and C. Lawrence Zitnick. “Learning a recurrent visual representation for image caption generation”. In arXiv preprint arXiv:1411.5654, 2014.
    • (no link) Mao, Junhua, et al. “Deep captioning with multimodal recurrent neural networks (m-rnn)”. In arXiv preprint arXiv:1412.6632, 2014.
    • (no link) Xu, Kelvin, et al. “Show, attend and tell: Neural image caption generation with visual attention”. In arXiv preprint arXiv:1502.03044, 2015.

    3.5 Machine Translation

    • Luong, Minh-Thang, et al. “Addressing the rare word problem in neural machine translation.” arXiv preprint arXiv:1410.8206 (2014).
    • Sennrich, et al. “Neural Machine Translation of Rare Words with Subword Units”. In arXiv preprint arXiv:1508.07909, 2015.
    • Luong, Minh-Thang, Hieu Pham, and Christopher D. Manning. “Effective approaches to attention-based neural machine translation.” arXiv preprint arXiv:1508.04025 (2015).
    • Chung, et al. “A Character-Level Decoder without Explicit Segmentation for Neural Machine Translation”. In arXiv preprint arXiv:1603.06147, 2016.
    • Lee, et al. “Fully Character-Level Neural Machine Translation without Explicit Segmentation”. In arXiv preprint arXiv:1610.03017, 2016.
    • Wu, Schuster, Chen, Le, et al. “Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation”. In arXiv preprint arXiv:1609.08144v2, 2016.

    3.6 Robotics

    • Koutník, Jan, et al. “Evolving large-scale neural networks for vision-based reinforcement learning.” Proceedings of the 15th annual conference on Genetic and evolutionary computation. ACM, 2013.
    • Levine, Sergey, et al. “End-to-end training of deep visuomotor policies.” Journal of Machine Learning Research 17.39 (2016): 1-40.
    • Pinto, Lerrel, and Abhinav Gupta. “Supersizing self-supervision: Learning to grasp from 50k tries and 700 robot hours.” arXiv preprint arXiv:1509.06825 (2015).
    • Levine, Sergey, et al. “Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection.” arXiv preprint arXiv:1603.02199 (2016).
    • Zhu, Yuke, et al. “Target-driven Visual Navigation in Indoor Scenes using Deep Reinforcement Learning.” arXiv preprint arXiv:1609.05143 (2016).
    • Yahya, Ali, et al. “Collective Robot Reinforcement Learning with Distributed Asynchronous Guided Policy Search.” arXiv preprint arXiv:1610.00673 (2016).
    • Gu, Shixiang, et al. “Deep Reinforcement Learning for Robotic Manipulation.” arXiv preprint arXiv:1610.00633 (2016).
    • A Rusu, M Vecerik, Thomas Rothörl, N Heess, R Pascanu, R Hadsell.”Sim-to-Real Robot Learning from Pixels with Progressive Nets.” arXiv preprint arXiv:1610.04286 (2016).
    • Mirowski, Piotr, et al. “Learning to navigate in complex environments.” arXiv preprint arXiv:1611.03673 (2016).

    3.7 Art

    • Mordvintsev, Alexander; Olah, Christopher; Tyka, Mike (2015). “Inceptionism: Going Deeper into Neural Networks”. Google Research.
    • The most successful artistic style transfer approach to date, the technique behind Prisma: Gatys, Leon A., Alexander S. Ecker, and Matthias Bethge. “A neural algorithm of artistic style.” arXiv preprint arXiv:1508.06576 (2015).
    • iGAN:Zhu, Jun-Yan, et al. “Generative Visual Manipulation on the Natural Image Manifold.” European Conference on Computer Vision. Springer International Publishing, 2016.
    • Neural Doodle:Champandard, Alex J. “Semantic Style Transfer and Turning Two-Bit Doodles into Fine Artworks.” arXiv preprint arXiv:1603.01768 (2016).
    • Zhang, Richard, Phillip Isola, and Alexei A. Efros. “Colorful Image Colorization.” arXiv preprint arXiv:1603.08511 (2016).
    • Super-resolution, from Fei-Fei Li’s group: Johnson, Justin, Alexandre Alahi, and Li Fei-Fei. “Perceptual losses for real-time style transfer and super-resolution.” arXiv preprint arXiv:1603.08155 (2016).
    • Vincent Dumoulin, Jonathon Shlens and Manjunath Kudlur. “A learned representation for artistic style.” arXiv preprint arXiv:1610.07629 (2016).
    • Style transfer with control over spatial location, color, and scale: Gatys, Leon and Ecker, et al. “Controlling Perceptual Factors in Neural Style Transfer.” arXiv preprint arXiv:1611.07865 (2016).
    • Texture synthesis and style transfer: Ulyanov, Dmitry and Lebedev, Vadim, et al. “Texture Networks: Feed-forward Synthesis of Textures and Stylized Images.” arXiv preprint arXiv:1603.03417 (2016).

    3.8 Object Segmentation

    • J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation.” in CVPR, 2015.
    • L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille. “Semantic image segmentation with deep convolutional nets and fully connected crfs.” In ICLR, 2015.
    • Pinheiro, P.O., Collobert, R., Dollar, P. “Learning to segment object candidates.” In: NIPS. 2015.
    • Dai, J., He, K., Sun, J. “Instance-aware semantic segmentation via multi-task network cascades.” in CVPR. 2016
    • Dai, J., He, K., Sun, J. “Instance-sensitive Fully Convolutional Networks.” arXiv preprint arXiv:1603.08678 (2016).

    Miscellaneous

    4.0 Supplementary Materials

    • Big Data Mining: Deep Learning with TensorFlow (Google TensorFlow deep learning)
    • Introduction to TensorFlow, Alejandro Solano - EuroPython 2017
    • Learning with TensorFlow, A Mathematical Approach to Advanced Artificial Intelligence in Python
    • Deep Learning with Python
    • Deep Learning with TensorFlow
    Deep Learning Paper Roundup


      “Read ten thousand books, travel ten thousand miles.” New ideas and inspirations appear in deep learning every single day. To become an expert, both theoretical grounding and coding skill need steady practice, and beyond going deep in one direction we should build a broad view of computer vision as a whole. Reading papers is therefore indispensable: from the ImageNet competition to object detection and image segmentation, there is no shortage of excellent work. This post collects some outstanding deep learning papers and also serves as a record of my own learning; only by keeping up with the newest ideas in state-of-the-art papers can we keep pace with the field.

    The Deep Learning Explosion: the ImageNet Challenge

      The ImageNet challenge is deep learning’s most fundamental task: classification. From the earliest LeNet, through GoogLeNet, to today’s ShuffleNet, a long line of excellent convolutional network architectures has emerged. These architectures are also widely used as backbones for feature extraction in more complex tasks such as object detection, so the state-of-the-art CNN architectures are the first thing to study.

    • (LeNet) Y. Lecun, L. Bottou, Y. Bengio and P. Haffner, “Gradient-based learning applied to document recognition,” in Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, Nov. 1998.[PDF] The seminal CNN paper, and a classic on handwritten digit recognition
    • (AlexNet) Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. Imagenet classification with deep convolutional neural networks. Advances in neural information processing systems. 2012 [PDF] Winner of ILSVRC-2012 and a turning point for CNNs: the first time deep learning beat SVMs and other traditional machine learning methods at image recognition
    • (VGG) Simonyan, Karen, and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014).[PDF] Stacked many repeated convolutional layers, strongly influencing later networks
    • (GoogLeNet) Szegedy, Christian, et al. Going deeper with convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015. [PDF] Introduced the Inception module, the first parallel structure in a CNN; later networks such as ResNet borrowed the idea, and CNNs stopped being single sequential paths
    • (InceptionV2, InceptionV3) Szegedy C, Vanhoucke V, Ioffe S, et al. Rethinking the Inception Architecture for Computer Vision[J]. Computer Science, 2015:2818-2826.[PDF] Refined the original GoogLeNet Inception module after the arrival of BN (Batch Normalization) and related techniques
    • (ResNet) He, Kaiming, et al. Deep residual learning for image recognition. arXiv preprint arXiv:1512.03385 (2015).[PDF] Introduced residual connections, easing vanishing gradients in very deep networks; ResNet reached 101 layers at the time.
    • (Xception) Chollet F. Xception: Deep Learning with Depthwise Separable Convolutions[J]. arXiv preprint arXiv:1610.02357, 2016.[PDF]
    • (DenseNet) Huang G, Liu Z, Weinberger K Q, et al. Densely Connected Convolutional Networks[J]. 2016. [PDF] Takes the shortcut idea to its extreme
    • (SENet) Squeeze-and-Excitation Networks. [PDF] Focuses on fusing channel-wise information while adding only marginal computation
    • (MobileNet v1) Howard A G, Zhu M, Chen B, et al. Mobilenets: Efficient convolutional neural networks for mobile vision applications[J]. arXiv preprint arXiv:1704.04861, 2017. [PDF]
    • (ShuffleNet) Zhang X, Zhou X, Lin M, et al. Shufflenet: An extremely efficient convolutional neural network for mobile devices[J]. [PDF] Replaces 1x1 convolutions with a channel shuffle to mix channel information, greatly cutting the parameter count; aimed at mobile devices with limited compute.
    • (capsules) Sabour S, Frosst N, Hinton G E. Dynamic routing between capsules[C][PDF]
    • (Partial Labels) Durand, Thibaut, Nazanin Mehrasa and Greg Mori. “Learning a Deep ConvNet for Multi-label Classification with Partial Labels.” CoRR abs/1902.09720 (2019): n. pag.[PDF] Multi-label classification
    • (Res2Net) Gao, Shang-Hua, Ming-Ming Cheng, Kai Zhao, Xin-yu Zhang, Ming-Hsuan Yang and Philip H. S. Torr. “Res2Net: A New Multi-scale Backbone Architecture.” (2019). [PDF]
    • (Residual Attention Network) Wang, F., Jiang, M., Qian, C., Yang, S., Li, C.C., Zhang, H., Wang, X., & Tang, X. (2017). Residual Attention Network for Image Classification. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 6450-6458.[PDF]
    • (Deformable CNN) Dai, Jifeng et al. “Deformable Convolutional Networks.” 2017 IEEE International Conference on Computer Vision (ICCV) (2017): 764-773.[PDF]
    • (GCNet) Cao, Yue et al. “GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond.” CoRR abs/1904.11492 (2019): n. pag.[PDF]
    • (NASNet) Zoph, B., Vasudevan, V., Shlens, J., & Le, Q.V. (2018). Learning Transferable Architectures for Scalable Image Recognition. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 8697-8710.[PDF]

    Object Detection

      Another central deep learning task is object detection. Before 1990, typical detectors were built on geometric representations; the field then moved toward statistical classification (neural networks, SVMs, AdaBoost, and so on).
      When deep convolutional neural networks (DCNNs) achieved their breakthrough on image classification in 2012, that success was quickly carried over to detection. Girshick proposed the milestone Region-based CNN (R-CNN), after which the field advanced rapidly, producing many deep-learning-based detectors such as YOLO and SSD…

    • (R-CNN) Girshick, Ross, et al. Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the IEEE conference on computer vision and pattern recognition. 2014.[PDF] The milestone detection framework and progenitor of the R-CNN family; later deep-learning detectors all borrow from it, a must-read paper
    • (SPPNet) He K, Zhang X, Ren S, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[C]//European Conference on Computer Vision. Springer International Publishing, 2014: 346-361.[PDF] Mainly fixes R-CNN’s slow, redundant feature extraction
    • (Fast R-CNN) Girshick R. Fast r-cnn[C]//Proceedings of the IEEE International Conference on Computer Vision. 2015: 1440-1448.[PDF] The second installment of the R-CNN series; introduces RoI Pooling and improves on both R-CNN and SPPNet, raising speed and accuracy at once
    • (Faster R-CNN) Ren S, He K, Girshick R, et al. Faster r-cnn: Towards real-time object detection with region proposal networks[C]//Advances in neural information processing systems. 2015: 91-99.[PDF] The peak of the R-CNN series; introduces anchors and the RPN, both widely adopted by later networks.
    • (YOLO) Redmon J, Divvala S, Girshick R, et al. You only look once: Unified, real-time object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 779-788.[PDF] A flagship one-stage detector: very fast, though less accurate than the R-CNN family
    • (SSD) Liu W, Anguelov D, Erhan D, et al. SSD: Single shot multibox detector[C]//European Conference on Computer Vision. Springer International Publishing, 2016: 21-37.[PDF] Another flagship one-stage detector: faster without giving up accuracy
    • (R-FCN) Dai J, Li Y, He K, et al. R-fcn: Object detection via region-based fully convolutional networks[C]//Advances in Neural Information Processing Systems. 2016: 379-387.[PDF]
    • (DSSD) Fu, C., Liu, W., Ranga, A., Tyagi, A., & Berg, A.C. (2017). DSSD : Deconvolutional Single Shot Detector. CoRR, abs/1701.06659.[PDF] Similar in spirit to FPN: uses deconvolution for feature fusion, improving SSD’s accuracy on small and overlapping objects
    • (FPN) T. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan and S. Belongie, “Feature Pyramid Networks for Object Detection,” 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, 2017 [PDF] Introduces the feature pyramid: semantics extracted in deep layers are fused into the feature maps at every level (so even high-resolution shallow maps carry high-level semantics), improving multi-scale representation and accuracy on small objects; applicable to both segmentation and detection.
    • (RetinaNet) Lin, T., Goyal, P., Girshick, R.B., He, K., & Dollár, P. (2017). Focal Loss for Dense Object Detection. 2017 IEEE International Conference on Computer Vision (ICCV), 2999-3007.[PDF] Proposes the focal loss to handle the overwhelming number of negatives, i.e. class imbalance, in detection
    • (TDM) Shrivastava, Abhinav, Rahul Sukthankar, Jitendra Malik and Abhinav Gupta. “Beyond Skip Connections: Top-Down Modulation for Object Detection.” CoRR abs/1612.06851 (2016): n. pag.[PDF] Similar in spirit to FPN, but adds top-down modules layer by layer
    • (YOLO-v2) Redmon, J., & Farhadi, A. (2017). YOLO9000: Better, Faster, Stronger. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 6517-6525.[PDF] The improved version of YOLO
    • (SIN) Liu, Y., Wang, R., Shan, S., & Chen, X. (2018). Structure Inference Net: Object Detection Using Scene-Level Context and Instance-Level Relationships. CVPR.[PDF] Brings RNNs into vision; a novel network structure
    • (STDN) Scale-Transferrable Object Detection Peng Zhou, Bingbing Ni, Cong Geng, Jianguo Hu, Yi Xu; The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 528-537 [PDF] Targets the extra parameters and slowdown that feature fusion brings to DSSD, FPN, etc.; uses DenseNet as the backbone so fusion happens during the forward pass, and adds a parameter-free scale-transfer module, keeping accuracy while improving speed
    • (RefineDet) Zhang, Shifeng, et al. “Single-Shot Refinement Neural Network for Object Detection.” CVPR (2018).[PDF] Combines the strengths of one-stage and two-stage detectors
    • (MegDet) Peng, Chao, Tete Xiao, Zeming Li, Yuning Jiang, Xiangyu Zhang, Kai Jia, Gang Yu and Jian Sun. “MegDet: A Large Mini-Batch Object Detector.” CVPR (2018).[PDF] Targets the small batch sizes used when training detectors; proposes Cross-GPU Batch Normalization, reaching batch size 256 and cutting COCO 2017 training time to four hours.
    • (DA Faster R-CNN) Yuhua Chen, Wen Li, Christos Sakaridis, Dengxin Dai, Luc Van Gool; The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 3339-3348[PDF] Tackles domain shift in detection; a domain adaptation component yields a robust, domain-invariant network that keeps high accuracy even in hard scenes such as heavy rain and fog
    • (ExtremeNet) Bottom-up Object Detection by Grouping Extreme and Center Points, Xingyi Zhou, Jiacheng Zhuo, Philipp Krähenbühl, arXiv technical report (1901.08043) Applies keypoint-detection methods to object detection
    • (RelationNet) Han Hu, Jiayuan Gu, Zheng Zhang, Jifeng Dai, Yichen Wei:Relation Networks for Object Detection. CVPR 2018: 3588-3597[PDF]
    • (YOLOv3) YOLOv3: An Incremental Improvement | Joseph Redmon, Ali Farhadi arXiv’18 [PDF]
    • (Cascade R-CNN) Zhaowei Cai, Nuno Vasconcelos:Cascade R-CNN: Delving Into High Quality Object Detection. CVPR 2018: 6154-6162[PDF]
    • (RFBNet) Liu, Songtao et al. “Receptive Field Block Net for Accurate and Fast Object Detection.” ECCV (2018).[PDF]
    • Zhong, Y., Wang, J., Peng, J., & Zhang, L. (2018). Anchor Box Optimization for Object Detection. CoRR, abs/1812.00469.[PDF]
    • (CornerNet) CornerNet: Detecting Objects as Paired Keypoints Hei Law, Jia Deng European Conference on Computer Vision (ECCV), 2018
    • (Grid R-CNN) Lu, X., Li, B., Yue, Y., Li, Q., & Yan, J. (2018). Grid R-CNN. CoRR, abs/1811.12030.[PDF]
    • (SNIPER) Singh, B., Najibi, M., & Davis, L.S. (2018). SNIPER: Efficient Multi-Scale Training. NeurIPS.[PDF]
    • (TridentNet) Li, Yanghao et al. “Scale-Aware Trident Networks for Object Detection.” CoRR abs/1901.01892 (2019): n. pag.[PDF]
    • (GIoU) Rezatofighi, Seyed Hamid et al. “Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression.” CoRR abs/1902.09630 (2019): n. pag.[PDF] A new way to compute IoU, and a loss built on it
    • (MetaAnchor) Yang, Tong et al. “MetaAnchor: Learning to Detect Objects with Customized Anchors.” NeurIPS (2018). [PDF]
    • (M2Det) Zhao, Q., Sheng, T., Wang, Y., Tang, Z., Chen, Y., Cai, L., & Ling, H. (2019). M2Det: A Single-Shot Object Detector based on Multi-Level Feature Pyramid Network. CoRR, abs/1811.04533.
    • Kong, T., Sun, F., Liu, H., Jiang, Y., & Shi, J. (2019). Consistent Optimization for Single-Shot Object Detection. CoRR, abs/1901.06563. [PDF] Assorted improvements for one-stage object detection
    • (RepMet) Schwartz, Eli et al. “RepMet: Representative-based metric learning for classification and one-shot object detection.” CoRR abs/1806.04728 (2018): n. pag.[PDF]
    • (Guided Anchor) Wang, Jiaqi et al. “Region Proposal by Guided Anchoring.” CoRR abs/1901.03278 (2019): n. pag. [PDF] Proposes a novel, learned anchor-generation scheme
    • (FCOS) Tian, Z., Shen, C., Chen, H., & He, T. (2019). FCOS: Fully Convolutional One-Stage Object Detection. [PDF]
    • (KL-Loss) He, Y., Zhu, C., Wang, J., Savvides, M., & Zhang, X. (2018). Bounding Box Regression with Uncertainty for Accurate Object Detection. [PDF]
    • (ScratchDet) Zhu, Rui et al. “ScratchDet: Exploring to Train Single-Shot Object Detectors from Scratch.” CoRR abs/1810.08425 (2018): n. pag.[PDF]
    • Zhu, Chenchen et al. “Feature Selective Anchor-Free Module for Single-Shot Object Detection.” CoRR abs/1903.00621 (2019): n. pag.[PDF]
    • (FoveaBox) Kong, Tao & Sun, Fuchun & Liu, Huaping & Jiang, Yuning & Shi, Jianbo. (2019). FoveaBox: Beyond Anchor-based Object Detector. [PDF]
    • (Libra R-CNN) Pang, Jiangmiao, Kai Chen, Jianping Shi, Huajun Feng, Wanli Ouyang and Dahua Lin. “Libra R-CNN: Towards Balanced Learning for Object Detection.” (2019). [PDF]
    • (R-DAD) Bae, Seung-Hwan. “Object Detection based on Region Decomposition and Assembly.” CoRR abs/1901.08225 (2019): n. pag.[PDF]
    • Saito, Kuniaki et al. “Strong-Weak Distribution Alignment for Adaptive Object Detection.” CoRR abs/1812.04798 (2018): n. pag. [PDF]
    • (AP-Loss) Chen, K., Li, J., Lin, W., See, J., Wang, J., & Duan, L., et al. (2019). Towards accurate one-stage object detection with ap-loss. [PDF]
    • (RepPoints) Yang, Ze et al. “RepPoints: Point Set Representation for Object Detection.” CoRR abs/1904.11490 (2019): n. pag. [PDF]
    • (YOLOv3+) Derakhshani, M.M., Masoudnia, S., Shaker, A.H., Mersa, O., Sadeghi, M.A., Rastegari, M., & Araabi, B.N. (2019). Assisted Excitation of Activations: A Learning Technique to Improve Object Detectors.[PDF] Feeds auxiliary signals such as segmentation into a branch that guides detection, improving accuracy
    • (PASSD) Jang, H., Woo, S., Benz, P., Park, J., & Kweon, I.S. (2019). Propose-and-Attend Single Shot Detector.[PDF]
    • (DR-Loss) Qian, Q., Lei, C., Li, H., & Jin, R. (2019). DR Loss: Improving Object Detection by Distributional Ranking. ArXiv, abs/1907.10156.[PDF]
    • Chen, Joya et al. “Are Sampling Heuristics Necessary in Object Detectors?” (2019).[PDF]
    • (CBNet) Liu, Y., Wang, Y., Wang, S., Liang, T., Zhao, Q., Tang, Z., & Ling, H. (2019). CBNet: A Novel Composite Backbone Network Architecture for Object Detection.[PDF]
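
    Several entries above, GIoU in particular, revolve around measuring box overlap. As a rough illustration of the GIoU idea, here is a minimal sketch; `giou` is an illustrative name, boxes are assumed to be (x1, y1, x2, y2) corner tuples, and this is not the paper's reference implementation:

    ```python
    def giou(box_a, box_b):
        # GIoU = IoU - |C \ (A ∪ B)| / |C|, where C is the smallest
        # axis-aligned box enclosing both A and B.
        ax1, ay1, ax2, ay2 = box_a
        bx1, by1, bx2, by2 = box_b
        inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
        inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
        inter = inter_w * inter_h
        area_a = (ax2 - ax1) * (ay2 - ay1)
        area_b = (bx2 - bx1) * (by2 - by1)
        union = area_a + area_b - inter
        iou = inter / union
        # Enclosing box C
        cx1, cy1 = min(ax1, bx1), min(ay1, by1)
        cx2, cy2 = max(ax2, bx2), max(ay2, by2)
        c_area = (cx2 - cx1) * (cy2 - cy1)
        return iou - (c_area - union) / c_area
    ```

    Unlike plain IoU, GIoU stays informative (negative, not zero) when boxes do not overlap, which is what makes it usable as a regression loss.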

    Deep Learning Tricks and CNN Architecture Improvements

    • (BatchNorm) Ioffe, Sergey and Christian Szegedy. “Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift.” ICML (2015).[PDF] A workhorse of training: speeds up convergence and makes training numerically more stable
    • Tong He, Zhi Zhang, Hang Zhang, Zhongyue Zhang, Junyuan Xie, Mu Li: Bag of Tricks for Image Classification with Convolutional Neural Networks. CoRR abs/1812.01187 (2018) [PDF] A systematic tour of many tricks for training CNNs
    • Zhang, Z., He, T., Zhang, H., Zhang, Z., Xie, J., & Li, M. (2019). Bag of Freebies for Training Object Detection Neural Networks. CoRR, abs/1902.04103.[PDF] A systematic tour of tricks for training object detectors
    • (RePr) Prakash, A., Storer, J.A., Florêncio, D.A., & Zhang, C. (2018). RePr: Improved Training of Convolutional Filters. CoRR, abs/1811.07275.[PDF]
    • (WS) Weight Standardization Siyuan Qiao, Huiyu Wang, Chenxi Liu, Wei Shen, Alan Yuille [PDF]
    • Santurkar, S., Tsipras, D., Ilyas, A., & Madry, A. (2018). How Does Batch Normalization Help Optimization? NeurIPS.[PDF]
    • (DML) Zhang, Ying et al. “Deep Mutual Learning.” 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (2018): 4320-4328. [PDF]
    • (Softer-NMS) He, Yihui et al. “Bounding Box Regression with Uncertainty for Accurate Object Detection.” (2019). [PDF]
    • (AutoAugment) Cubuk, E.D., Zoph, B., Mané, D., Vasudevan, V., & Le, Q.V. (2018). AutoAugment: Learning Augmentation Policies from Data. CoRR, abs/1805.09501.[PDF]
    • Zoph, Barret et al. “Learning Data Augmentation Strategies for Object Detection.” ArXiv abs/1906.11172 (2019): n. pag. [PDF]
    • Zhang, Haichao and Jianyu Wang. “Towards Adversarially Robust Object Detection.” (2019). [PDF]
    • (AlignDet) Chen, Yuntao et al. “Revisiting Feature Alignment for One-stage Object Detection.” (2019).[PDF]
    • (Attention Normalization) Li, Xilai, Wei Sun and Tianfu Wu. “Attentive Normalization.” (2019).[PDF]
    • (IoU-Balanced Loss) IoU-balanced Loss Functions for Single-stage Object Detection
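
    The first entry above (Batch Normalization) is simple enough to sketch. This is an illustrative NumPy version of the training-time forward pass only, with no running statistics and no backward pass:

    ```python
    import numpy as np

    def batch_norm_forward(x, gamma, beta, eps=1e-5):
        # Normalize each feature over the batch dimension, then apply
        # the learned scale (gamma) and shift (beta).
        mu = x.mean(axis=0)
        var = x.var(axis=0)
        x_hat = (x - mu) / np.sqrt(var + eps)
        return gamma * x_hat + beta
    ```

    At inference time the real layer swaps in running averages of the batch mean and variance collected during training.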

    Face Detection

    • (SSH) Najibi, Mahyar et al. “SSH: Single Stage Headless Face Detector.” 2017 IEEE International Conference on Computer Vision (ICCV) (2017): 4885-4894. [PDF]
    • (S^3FD) Zhang, S., Zhu, X., Lei, Z., Shi, H., Wang, X., & Li, S.Z. (2017). S^3FD: Single Shot Scale-Invariant Face Detector. 2017 IEEE International Conference on Computer Vision (ICCV), 192-201. [PDF] Uses an anchor matching strategy to raise recall on small faces, and max-out background labels to suppress false positives
    • (DSFD) Wang, Y., Wang, C., Tai, Y., Qian, J., Yang, J., Wang, C., Li, J., & Huang, F. (2018). DSFD: Dual Shot Face Detector. CoRR, abs/1810.10220. [PDF]
    • (PyramidBox++) Li, Zhihang et al. “PyramidBox++: High Performance Detector for Finding Tiny Face.” (2019). [PDF]
    • Zhang, F., Fan, X., Ai, G.P., Song, J., Qin, Y., & Wu, J. (2019). Accurate Face Detection for High Performance.[PDF]
    • (SRN) Chi, C., Zhang, S., Xing, J., Lei, Z., Li, S.Z., & Zou, X. (2019). Selective Refinement Network for High Performance Face Detection. CoRR, abs/1809.02693.[PDF]

    Pose Estimation

    • (DeepPose) Toshev, A., & Szegedy, C. (2014). DeepPose: Human Pose Estimation via Deep Neural Networks. 2014 IEEE Conference on Computer Vision and Pattern Recognition, 1653-1660.[PDF]

    Image Segmentation

    GANs (Generative Adversarial Networks)

    • (GAN) Goodfellow, I.J. (2016). NIPS 2016 Tutorial: Generative Adversarial Networks. CoRR, abs/1701.00160. [PDF] A classic entry point that covers the mathematics of GANs in detail
    • (ConditionalGAN) Mirza, M., & Osindero, S. (2014). Conditional Generative Adversarial Nets. CoRR, abs/1411.1784.[PDF]
    • (DCGAN) Radford, A., Metz, L., & Chintala, S. (2016). Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. CoRR, abs/1511.06434. [PDF]
    • (WGAN) Arjovsky, Martín, Soumith Chintala and Léon Bottou. “Wasserstein GAN.” CoRR abs/1701.07875 (2017): n. pag.[PDF]
    • (WGAN-GP) Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., & Courville, A.C. (2017). Improved Training of Wasserstein GANs. NIPS. [PDF]
    • (BiGAN) Donahue, J., Krähenbühl, P., & Darrell, T. (2017). Adversarial Feature Learning. CoRR, abs/1605.09782.[PDF]
    • (CycleGAN) Zhu, Jun-Yan et al. “Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks.” 2017 IEEE International Conference on Computer Vision (ICCV) (2017): 2242-2251.[PDF]
    • Liu, Ming-Yu et al. “Unsupervised Image-to-Image Translation Networks.” NIPS (2017).[PDF]
    • (PG-GAN) Karras, T., Aila, T., Laine, S., & Lehtinen, J. (2018). Progressive Growing of GANs for Improved Quality, Stability, and Variation. CoRR, abs/1710.10196.[PDF]
    • (Sim-GAN) Shrivastava, Ashish et al. “Learning from Simulated and Unsupervised Images through Adversarial Training.” 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017): 2242-2251. [PDF]
    • (Pixel-DA) Unsupervised Pixel–Level Domain Adaptation with Generative Adversarial Networks [PDF]
    • (Co-GAN) Liu, Ming-Yu and Oncel Tuzel. “Coupled Generative Adversarial Networks.” NIPS (2016).[PDF]
    • (DTN) Taigman, Yaniv et al. “Unsupervised Cross-Domain Image Generation.” CoRR abs/1611.02200 (2017): n. pag.[PDF]
    • (SN-GAN) Miyato, T., Kataoka, T., Koyama, M., & Yoshida, Y. (2018). Spectral Normalization for Generative Adversarial Networks. CoRR, abs/1802.05957.[PDF]
    • (Pizza-GAN) Papadopoulos, Dim P. et al. “How to make a pizza: Learning a compositional layer-based GAN model.” ArXiv abs/1906.02839 (2019): n. pag.[PDF]
    • (EnlightenGAN) Jiang, Yifan, Xinyu Gong, Ding Liu, Yu Cheng, Chen Fang, Xiaohui Shen, Jianchao Yang, Pan Zhou and Zhangyang Wang. “EnlightenGAN: Deep Light Enhancement without Paired Supervision.” (2019).[PDF]
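
    For orientation on the entries above, a minimal sketch of the original GAN objective in its non-saturating form; `gan_losses` is an illustrative name, and the discriminator outputs are assumed to already be probabilities:

    ```python
    import numpy as np

    def gan_losses(d_real, d_fake):
        # Discriminator maximizes log D(x) + log(1 - D(G(z))); the
        # non-saturating generator maximizes log D(G(z)). Both are
        # written here as losses to minimize.
        eps = 1e-12
        d_loss = -np.mean(np.log(d_real + eps) + np.log(1.0 - d_fake + eps))
        g_loss = -np.mean(np.log(d_fake + eps))
        return d_loss, g_loss
    ```

    When the discriminator confidently rejects fake samples (low `d_fake`), the generator loss is large, which is exactly the training signal the generator needs early on.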

    Knowledge Distillation

    • Hinton, Geoffrey E., Oriol Vinyals and Jeffrey Dean. “Distilling the Knowledge in a Neural Network.” CoRR abs/1503.02531 (2015): n. pag.[PDF]
    • (CCKD) Peng, Baoyun et al. “Correlation Congruence for Knowledge Distillation.” ArXiv abs/1904.01802 (2019): n. pag.[PDF]
    • Yuan, L., Tay, F.E., Li, G., Wang, T., & Feng, J. (2019). Revisit Knowledge Distillation: a Teacher-free Framework. ArXiv, abs/1909.11723.[PDF]
    • Tian, Y., Krishnan, D., & Isola, P. (2019). Contrastive Representation Distillation. ArXiv, abs/1910.10699.[PDF]
    • Liu, X., He, P., Chen, W., & Gao, J. (2019). Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding. ArXiv, abs/1904.09482.[PDF]
    • (BAM) Clark, K., Luong, M., Khandelwal, U., Manning, C.D., & Le, Q.V. (2019). BAM! Born-Again Multi-Task Networks for Natural Language Understanding. ACL.[PDF]
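
    The first entry above defines the soft-target distillation loss. A minimal NumPy sketch under the usual formulation (cross-entropy between temperature-softened teacher and student distributions, scaled by T²); the function names are illustrative:

    ```python
    import numpy as np

    def softmax(z, T=1.0):
        # Temperature-scaled softmax; higher T gives softer distributions.
        z = np.asarray(z, dtype=float) / T
        e = np.exp(z - z.max())
        return e / e.sum()

    def distillation_loss(student_logits, teacher_logits, T=4.0):
        # Cross-entropy from the softened teacher to the softened student,
        # scaled by T^2 so gradients stay comparable across temperatures.
        p_t = softmax(teacher_logits, T)
        p_s = softmax(student_logits, T)
        return float(-np.sum(p_t * np.log(p_s + 1e-12)) * T * T)
    ```

    In practice this soft-target term is combined with the ordinary cross-entropy against the hard labels.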

    NLP

    • (ELMo) Peters, M.E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., & Zettlemoyer, L. (2018). Deep Contextualized Word Representations. ArXiv, abs/1802.05365.[PDF]
    • (GPT) Radford, Alec. “Improving Language Understanding by Generative Pre-Training.” (2018).[PDF]
    • (Transformer) Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., & Polosukhin, I. (2017). Attention Is All You Need. NIPS. [PDF]
    • (BERT) Devlin, J., Chang, M., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. ArXiv, abs/1810.04805.[PDF]
    • (Transformer-XL) Dai, Zihang et al. “Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context.” ArXiv abs/1901.02860 (2019): n. pag.[PDF]
    • (XLNet) Yang, Zhilin, Zihang Dai, Yiming Yang, Jaime G. Carbonell, Ruslan Salakhutdinov and Quoc V. Le. “XLNet: Generalized Autoregressive Pretraining for Language Understanding.” ArXiv abs/1906.08237 (2019): n. pag. [PDF]
    • (Region Embedding) Johnson, Rie and Tong Zhang. “Semi-supervised Convolutional Neural Networks for Text Categorization via Region Embedding.” Advances in neural information processing systems 28 (2015): 919-927 . [PDF]
    • (DPCNN) Johnson, R., & Zhang, T. (2017). Deep Pyramid Convolutional Neural Networks for Text Categorization. ACL.[PDF]
    • (BERT Augmentation) Wu, Xing et al. “Conditional BERT Contextual Augmentation.” ArXiv abs/1812.06705 (2018): n. pag.[PDF]
    • (Few-shot Learning Induction Net) Geng, R., Li, B., Li, Y., Ye, Y., Jian, P., & Sun, J. (2019). Few-Shot Text Classification with Induction Network. ArXiv, abs/1902.10482.[PDF]
    • Zhang, Ye and Byron C. Wallace. “A Sensitivity Analysis of (and Practitioners’ Guide to) Convolutional Neural Networks for Sentence Classification.” IJCNLP (2015).[PDF]
    • (Attention Survey) Chaudhari, S., Polatkan, G., Ramanath, R., & Mithal, V. (2019). An Attentive Survey of Attention Models. ArXiv, abs/1904.02874.[PDF]
    • (TB-CNN) Short Text Classification Improved by Feature Space Extension [PDF]
    • (QA-Net) Yu, A.W., Dohan, D., Luong, M., Zhao, R., Chen, K., Norouzi, M., & Le, Q.V. (2018). QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension. ArXiv, abs/1804.09541.[PDF]
    • (BIDAF) Seo, M.J., Kembhavi, A., Farhadi, A., & Hajishirzi, H. (2016). Bidirectional Attention Flow for Machine Comprehension. ArXiv, abs/1611.01603.[PDF]
    • (CNN-Seq2Seq) Gehring, Jonas et al. “Convolutional Sequence to Sequence Learning.” ICML (2017).[PDF]
    • (Pointer Net) Vinyals, Oriol, Meire Fortunato, and Navdeep Jaitly. “Pointer networks.” Advances in Neural Information Processing Systems. 2015.[PDF] Handles the OOV (out-of-vocabulary) problem well
    • Liu, Shanshan et al. “Neural Machine Reading Comprehension: Methods and Trends.” ArXiv abs/1907.01118 (2019): n. pag.[PDF]
    • Hu, M., Wei, F., Peng, Y., Huang, Z., Lau, V.K., & Li, D. (2018). Read + Verify: Machine Reading Comprehension with Unanswerable Questions. ArXiv, abs/1808.05759.[PDF]
    • (U-Net) Sun, F., Li, L., Qiu, X., & Liu, Y.P. (2018). U-Net: Machine Reading Comprehension with Unanswerable Questions. ArXiv, abs/1810.06638.[PDF]
    • (MIX) Chen, H., Han, F.X., Niu, D., Liu, D., Lai, K., Wu, C., & Xu, Y. (2018). MIX: Multi-Channel Information Crossing for Text Matching. KDD.[PDF]
    • Rei, Marek and Anders Søgaard. “Jointly Learning to Label Sentences and Tokens.” AAAI (2018).[PDF]
    • Zhang, Xuchao, Fanglan Chen, Chang-Tien Lu and Naren Ramakrishnan. “Mitigating Uncertainty in Document Classification.” NAACL-HLT (2019).[PDF]
    • (SpanBERT) Joshi, M.S., Chen, D., Liu, Y., Weld, D.S., Zettlemoyer, L.S., & Levy, O. (2019). SpanBERT: Improving Pre-training by Representing and Predicting Spans. ArXiv, abs/1907.10529.[PDF]
    • (RoBERTa) Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M.S., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L.S., & Stoyanov, V. (2019). RoBERTa: A Robustly Optimized BERT Pretraining Approach. ArXiv, abs/1907.11692.[PDF]
    • (MT-DNN) Liu, X., He, P., Chen, W., & Gao, J. (2019). Multi-Task Deep Neural Networks for Natural Language Understanding. ACL.[PDF]
    • (ERNIE) Zhang, Zhengyan et al. “ERNIE: Enhanced Language Representation with Informative Entities.” ACL (2019).[PDF]
    • (ERNIE2.0) Sun, Yu, Shuohuan Wang, Yukun Li, Shikun Feng, Hao Tian, Hua Wu and Haifeng Wang. “ERNIE 2.0: A Continual Pre-training Framework for Language Understanding.” ArXiv abs/1907.12412 (2019): n. pag.[PDF]
    • (RE2) Yang, R., Zhang, J., Gao, X., Ji, F., & Chen, H. (2019). Simple and Effective Text Matching with Richer Alignment Features. ACL.[PDF]
    • (TinyBERT) Jiao, Xiaoqi et al. “TinyBERT: Distilling BERT for Natural Language Understanding.” ArXiv abs/1909.10351 (2019): n. pag.[PDF]
    • (CRNN) Choi, Keunwoo et al. “Convolutional recurrent neural networks for music classification.” 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2016): 2392-2396.[PDF]
    • (GEAR) Zhou, J., Han, X., Yang, C., Liu, Z., Wang, L., Li, C., & Sun, M. (2019). GEAR: Graph-based Evidence Aggregating and Reasoning for Fact Verification. ACL.[PDF]
    • (CogNet) Ding, M., Zhou, C., Chen, Q., Yang, H., & Tang, J. (2019). Cognitive Graph for Multi-Hop Reading Comprehension at Scale. ACL.[PDF]
    • Tan, C., Wei, F., Wang, W., Lv, W., & Zhou, M. (2018). Multiway Attention Networks for Modeling Sentence Pairs. IJCAI.[PDF]
    • Deshmukh, Neil et al. “Semi-Supervised Natural Language Approach for Fine-Grained Classification of Medical Reports.” (2019).[PDF]
    • (ELECTRA) ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators [PDF]
    • (BART) Lewis, Mike, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Ves Stoyanov and Luke Zettlemoyer. “BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension.” (2019).[PDF]
    • Nguyen, T.H., & Grishman, R. (2015). Relation Extraction: Perspective from Convolutional Neural Networks. VS@HLT-NAACL.[PDF]
    • Santos, Cícero Nogueira dos et al. “Classifying Relations by Ranking with Convolutional Neural Networks.” ACL (2015).[PDF]
    • Wang, Linlin et al. “Relation Classification via Multi-Level Attention CNNs.” ACL (2016).[PDF]
    • Soares, Livio Baldini et al. “Matching the Blanks: Distributional Similarity for Relation Learning.” ACL (2019).[PDF]
    • Wu, Shanchan and Yifan He. “Enriching Pre-trained Language Model with Entity Information for Relation Classification.” CIKM (2019). [PDF]
    • Eberts, Markus and Adrian Ulges. “Span-based Joint Entity and Relation Extraction with Transformer Pre-training.” ArXiv abs/1909.07755 (2019): n. pag.[PDF]
    • Alt, Christoph et al. “Improving Relation Extraction by Pre-trained Language Representations.” ArXiv abs/1906.03088 (2019): n. pag.[PDF]
    • Xue, K., Zhou, Y., Ma, Z., Ruan, T., Zhang, H., & He, P. (2019). Fine-tuning BERT for Joint Entity and Relation Extraction in Chinese Medical Text. ArXiv, abs/1908.07721.[PDF]
    • (ColBERT) Khattab, O., & Zaharia, M. (2020). ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT. ArXiv, abs/2004.12832.[PDF]
    • (Poly-encoders) Humeau, S., Shuster, K., Lachaux, M., & Weston, J. (2020). Poly-encoders: Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring. ICLR.[PDF]
    • Chang, W., Yu, F.X., Chang, Y., Yang, Y., & Kumar, S. (2020). Pre-training Tasks for Embedding-based Large-scale Retrieval. ArXiv, abs/2002.03932.[PDF]
    • Qiao, Y., Xiong, C., Liu, Z., & Liu, Z. (2019). Understanding the Behaviors of BERT in Ranking. ArXiv, abs/1904.07531.[PDF]
    • Wada, S., Takeda, T., Manabe, S., Konishi, S., Kamohara, J., & Matsumura, Y. (2020). A pre-training technique to localize medical BERT and enhance BioBERT.[PDF]
    • Guu, K., Lee, K., Tung, Z., Pasupat, P., & Chang, M. (2020). REALM: Retrieval-Augmented Language Model Pre-Training. ArXiv, abs/2002.08909.[PDF]
    • (PoWER-BERT) Goyal, S., Choudhury, A.R., Chakaravarthy, V.T., ManishRaje, S., Sabharwal, Y., & Verma, A. (2020). PoWER-BERT: Accelerating BERT inference for Classification Tasks. ArXiv, abs/2001.08950.[PDF]
    • Shen, S., Yao, Z., Gholami, A., Mahoney, M.W., & Keutzer, K. (2020). Rethinking Batch Normalization in Transformers. ArXiv, abs/2003.07845.[PDF]
    • (CheckList) Ribeiro, M.T., Wu, T., Guestrin, C., & Singh, S. (2020). Beyond Accuracy: Behavioral Testing of NLP models with CheckList. ACL.[PDF]
    • Diversifying Search Results using Self-Attention Network [PDF]
    • (FairCo) Morik, M., Singh, A., Hong, J., & Joachims, T. (2020). Controlling Fairness and Bias in Dynamic Learning-to-Rank. Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval.[PDF]
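
    Many of the entries above (Transformer, BERT, and their descendants) rest on scaled dot-product attention. A minimal single-head NumPy sketch, without masking or batching:

    ```python
    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        # softmax(Q K^T / sqrt(d_k)) V  (Vaswani et al., 2017).
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)
        scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
        weights = np.exp(scores)
        weights = weights / weights.sum(axis=-1, keepdims=True)
        return weights @ V
    ```

    Each query row produces a convex combination of the value rows, so the output has one row per query and the value dimensionality.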

    (LSTM / RNN) training notes:

    1. Initialize the begin state
    2. Clip gradients
    3. Truncate backpropagation through time by detaching the hidden state (state.detach())
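
    Item 2 (gradient clipping) can be sketched without any framework; `clip_grad_global_norm` is an illustrative name, and in PyTorch one would call `torch.nn.utils.clip_grad_norm_` instead:

    ```python
    import numpy as np

    def clip_grad_global_norm(grads, max_norm):
        # Rescale all gradients together so their combined L2 norm
        # is at most max_norm; gradients below the threshold pass through.
        total_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
        if total_norm > max_norm:
            scale = max_norm / (total_norm + 1e-6)
            grads = [g * scale for g in grads]
        return grads
    ```

    Clipping by the global norm (rather than per tensor) preserves the direction of the overall update, which is why it is the standard remedy for exploding gradients in RNNs.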

    Ranking

    • (LambdaRank) Learning to Rank with Nonsmooth Cost Functions
    • (LTR) From RankNet to LambdaRank to LambdaMART: An Overview [PDF]
    • (LambdaLoss) The LambdaLoss Framework for Ranking Metric Optimization
    • Understanding the Behaviors of BERT in Ranking
    • Passage Re-ranking with BERT
    • (DeepRank) Pang, Liang et al. “DeepRank: A New Deep Architecture for Relevance Ranking in Information Retrieval.” Proceedings of the 2017 ACM on Conference on Information and Knowledge Management (2017): n. pag.[PDF]
    • Nguyễn, T.V., Rao, N., & Subbian, K. (2020). Learning Robust Models for e-Commerce Product Search. ArXiv, abs/2005.03624.[PDF]
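Several entries above build on RankNet's pairwise probabilistic cost (covered in the RankNet-to-LambdaMART overview). A minimal Python sketch of that cost, with illustrative names and scores:

```python
import math

def ranknet_pair_loss(s_i, s_j, sigma=1.0):
    """RankNet pairwise cost for a pair where item i should rank above
    item j: -log(sigmoid(sigma * (s_i - s_j))),
    rewritten as log(1 + exp(-sigma * (s_i - s_j))) for stability."""
    return math.log1p(math.exp(-sigma * (s_i - s_j)))

# A correctly ordered pair (s_i > s_j) incurs a smaller cost than a swapped one.
good = ranknet_pair_loss(2.0, 0.0)
bad = ranknet_pair_loss(0.0, 2.0)
```

LambdaRank then skips this explicit loss and instead scales the pairwise gradients by the change in a ranking metric such as NDCG, which is the idea the LambdaLoss framework generalizes.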

    Knowledge Graphs

    • Wang, Quan et al. “Knowledge Graph Embedding: A Survey of Approaches and Applications.” IEEE Transactions on Knowledge and Data Engineering 29 (2017): 2724-2743.[PDF]
    • (FastText) Joulin, A., Grave, E., Bojanowski, P., Nickel, M., & Mikolov, T. (2017). Fast Linear Model for Knowledge Graph Embeddings. ArXiv, abs/1710.10881.[PDF]
    • Moussallem, Diego et al. “Augmenting Neural Machine Translation with Knowledge Graphs.” ArXiv abs/1902.08816 (2019): n. pag.[PDF]
    • (ConvKB) Nguyen, Dai Quoc et al. “A Novel Embedding Model for Knowledge Base Completion Based on Convolutional Neural Network.” NAACL-HLT (2017).[PDF]
    • Cheng, H., Koc, L., Harmsen, J., Shaked, T., Chandra, T., Aradhye, H., Anderson, G., Corrado, G.S., Chai, W., Ispir, M., Anil, R., Haque, Z., Hong, L., Jain, V., Liu, X., & Shah, H. (2016). Wide & Deep Learning for Recommender Systems. DLRS@RecSys.[PDF]
    • Yang, B., Yih, W., He, X., Gao, J., & Deng, L. (2015). Embedding Entities and Relations for Learning and Inference in Knowledge Bases. CoRR, abs/1412.6575.[PDF]
    • (Survey) Ji, S., Pan, S., Cambria, E., Marttinen, P., & Yu, P.S. (2020). A Survey on Knowledge Graphs: Representation, Acquisition and Applications. ArXiv, abs/2002.00388.[PDF]
    • Nguyen, D.Q. (2017). An overview of embedding models of entities and relationships for knowledge base completion. ArXiv, abs/1703.08098.[PDF]
    • (RippleNet) Wang, H., Zhang, F., Wang, J., Zhao, M., Li, W., Xie, X., & Guo, M. (2018). RippleNet: Propagating User Preferences on the Knowledge Graph for Recommender Systems. Proceedings of the 27th ACM International Conference on Information and Knowledge Management.[PDF]
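As a concrete instance of the embedding models listed above, the DistMult scoring function from Yang et al. (2015) is simple enough to sketch in a few lines of Python (the embedding values below are toy numbers, not a trained model):

```python
def distmult_score(head, rel, tail):
    """DistMult (Yang et al., 2015) scores a triple (head, relation, tail)
    with a bilinear form whose relation matrix is diagonal:
    score = sum_i h_i * r_i * t_i."""
    return sum(h * r * t for h, r, t in zip(head, rel, tail))

# Toy 2-dimensional embeddings (illustrative values only).
h, r, t = [1.0, 2.0], [1.0, 1.0], [3.0, 4.0]
score = distmult_score(h, r, t)  # 1*1*3 + 2*1*4 = 11.0
# Note: the score is symmetric in head and tail, a known DistMult
# limitation that later models (e.g. ConvKB above) try to address.
```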

    Graph Convolutional Networks

    • (DeepGCNs) Li, G., Muller, M., Thabet, A., & Ghanem, B. (2019). DeepGCNs: Can GCNs Go as Deep as CNNs?[PDF]
    • (SGC) Wu, Felix, Tianyi Zhang, Amauri H. de Souza, Christopher Fifty, Tao Yu and Kilian Q. Weinberger. “Simplifying Graph Convolutional Networks.” ICML (2019). [PDF]
    • (PNA) Corso, G., Cavalleri, L., Beaini, D., Lió, P., & Velickovic, P. (2020). Principal Neighbourhood Aggregation for Graph Nets. ArXiv, abs/2004.05718.[PDF]

    Reference blog: https://blog.csdn.net/qq_21190081/article/details/69564634
