NIPS2018论文及代码集锦(12)(亮点: CNN正交正则;循环控制循环网络)

2019年7月6日 756点热度 1人点赞 0条评论

[1] The streaming rollout of deep networks - towards fully model-parallel execution

Volker Fischer, Jan Köhler, Thomas Pfeil

Bosch Center for Artificial Intelligence

深层神经网络，尤其是循环网络，在控制与外界实时交互的自治体方面很有前景。然而，这需要将时序特征无缝地集成到网络结构中。针对循环神经网络，训练和推理时通常在时间上进行展开，但是展开的方法有多种。

一般在推理过程中，网络的每一层是以序列的形式进行计算的，这就会造成信息集成在时序上是稀疏的，而且响应时间较长。

这篇文章针对如何展开，提出了一种理论框架，不同展开方式对应的模型并行度会有所不同。有些特定形式的展开，可以得到更早的并且更频繁的响应，而且早些的响应通常会取得更好的效果。流式展开性能最好，这种展开能够使得网络完全并行化执行，进而利用大量并行设备缩短运行时间。

不同的展开方式对比如下

这篇文章的主要贡献可以归纳为

不同的展开方式及对应的更新和推理对比如下

代码地址

https://github.com/boschresearch/statestream

[2] Can We Gain More from Orthogonality Regularizations in Training Deep CNNs?

Nitin Bansal, Xiaohan Chen, Zhangyang Wang

Texas A&M University

立即下载

这篇文章为了更好的训练深层卷积神经网络，提出了新颖的正交正则项，该正则项利用了多种高级分析工具，比如相干性和有限等距特性。这些插拔式正则项可以很方便地用于训练任何卷积神经网络。

作者们将这些方法用于一些效果较好的经典模型，比如ResNet, WideResNet, ResNeXt等，并且在流行的计算机视觉数据集上进行了实验，比如CIFAR-10, CIFAR-100, SVHN以及ImageNet等。

这篇文章提出的正则化方法不仅能够使得模型训练速度更快，而且收敛地更稳定，还可以达到相当的准确率。

几种模型在两个数据集上利用不同正则方法的效果对比如下

其中ResNet-110对应的论文为

Deep residual learning for image recognition. CVPR 2016

代码地址

https://github.com/KaimingHe/deep-residual-networks

Wide ResNet 28-10 对应的论文为

Wide residual networks, BMVC 2016

代码地址

https://github.com/szagoruyko/wide-residual-networks

https://github.com/xternalz/WideResNet-pytorch

ResNext 29-8-64对应的论文为

Aggregated residual transformations for deep neural networks, CVPR 2017

代码地址

https://github.com/facebookresearch/ResNeXt

几种正则方法分别对应

SO: Soft Orthogonality

DSO: Double Soft Orthogonality

MC: Mutual Coherence

SRIP: Spectral Restricted Isometry Property

ImageNet数据集上几种方法的效果对比如下

其中ResNet 34和ResNet 50对应的论文为

Deep residual learning for image recognition. CVPR 2016

代码地址

https://github.com/KaimingHe/deep-residual-networks

Pre-Resnet 34对应的论文为

Identity mappings in deep residual networks, ECCV 2016

代码地址

https://github.com/FlorianMuellerklein/Identity-Mapping-ResNet-Lasagne

OMDSM为Optimization over Multiple Dependent Stiefel Manifolds

对应的论文为

orthogonal weight normalization method for solving orthogonality constraints over Steifel manifold in deep neural networks, AAAI 2018

代码地址

https://github.com/huangleiBuaa/OthogonalWN

代码地址

https://github.com/nbansal90/Can-we-Gain-More-from-Orthogonality

[3] Recurrently Controlled Recurrent Networks

Yi Tay, Luu Anh Tuan, and Siu Cheung Hui

Nanyang Technological University, Institute for Infocomm Research

立即下载

在循环神经网络中，长短时记忆网络以及门限循环单元在很多时序建模中比较常用，是比较经典的构建单元。

这篇文章提出一种循环控制循环网络，该网络能够得到表达力更强的序列编码。这种网络的核心思想是利用循环网络来学习循环门限函数。整体结构可以分成两个部分，一个控制单元和一个接收单元，其中循环控制单元主动影响接收单元的组合。

该方法可以用于情感分析，比如SST， IMDb，Amzon reviews这些数据集，问题分类，比如TREC数据集，蕴含分类，比如SNLI，SciTail数据集，答案选取问题，比如WikiQA， TrecQA数据集，以及阅读理解问题，比如NarrativeQA数据集。

基于26个数据集的实验表明，本文所提出的RCRN不仅优于BiLSTMs，而且优于堆叠BiLSTMs。

本文所提出的结构概览如下

在Amazon Review数据集上几种方法的效果对比如下

在情感分析数据集上集中方法的效果对比如下

其中MVN对应的论文为

End-to-end multi-view networks for text classification, 2017

DiSAN对应的论文为

Disan: Directional self-attention network for rnn/cnn-free language understanding, AAAI 2018

代码地址

https://github.com/taoshen58/DiSAN

DMN对应的论文为

Ask me anything: Dynamic memory networks for natural language processing,ICML 2016

代码地址

https://github.com/YerevaNN/Dynamic-memory-networks-in-Theano

https://github.com/DongjunLee/dmn-tensorflow

LSTM-CNN对应的论文为

Text classification improved by integrating bidirectional lstm with two-dimensional max pooling

NTI 对应的论文为

Neural tree indexers for text understanding

BCN对应的论文为

Learned in translation: Contextualized word vectors, NIPS 2017

BCN+ELMo对应的论文为

Deep contextualized word representations

P-LSTM 对应的论文为

Towards universal paraphrastic sentence embeddings， ICLR 2016

代码地址

https://github.com/jwieting/iclr2016

CT-LSTM对应的论文为

Deep learning with dynamic computation graphs

代码地址

https://github.com/tensorflow/fold

TE-LSTM对应的论文为

Encoding syntactic knowledge in neural networks for sentiment classification， TOIS 2017

NSE对应的论文为

Neural semantic encoders

代码地址

https://github.com/pdasigi/neural-semantic-encoders

BMLSTM对应的论文为

Learning to generate reviews and discovering sentiment, 2017

代码地址

https://github.com/openai/generating-reviews-discovering-sentiment

在数据集IMDb和TREC上的效果对比如下

其中Res. BiLSTM对应的论文为

A way out of the odyssey: Analyzing and combining recent insights for lstms

4L-QRNN对应的论文为

Quasi-recurrent neural networks

代码地址

https://github.com/salesforce/pytorch-qrnn

oh-LSTM对应的论文为

Supervised and semi-supervised text categorization using lstm for region embeddings

TRNN对应的论文为

Topicrnn: A recurrent neural network with long-range semantic dependency

代码地址

https://github.com/dangitstam/topic-rnn

virtual对应的论文为

Adversarial training methods for semi-supervised text classification

代码地址

https://github.com/aonotas/adversarial_text

CNN-MC对应的论文为

Convolutional neural networks for sentence classification，emnlp 2014

代码地址

https://github.com/yoonkim/CNN_sentence

SRU对应的论文为

Training rnns as fast as cnns

代码地址

https://github.com/taolei87/sru

DSCNN对应的论文为

Dependency sensitive convolutional neural networks for modeling sentences and documents

DC-BiLSTM对应的论文为

Densely connected bidirectional lstm with applications to sentence classification

代码地址

https://github.com/IsaacChanghau/Dense_BiLSTM

在数据集SNLI和SciTail上几种方法的效果对比如下

其中Multi-head对应的论文为

Attention is all you need，NIPS 2017

代码地址

https://github.com/jadore801120/attention-is-all-you-need-pytorch

https://github.com/Kyubyong/transformer

https://github.com/Lsdefine/attention-is-all-you-need-keras

shortcut对应的论文为

Shortcut-stacked sentence encoders for multi-domain inference

Gumbel LSTM对应的论文为

Unsupervised learning of task-specific tree structures with tree-lstms

Dynamic Meta Emb对应的论文为

Context-attentive embeddings for improved sentence representations

ESIM对应的论文为

Enhanced LSTM for natural language inference，ACL 2017

代码地址

https://github.com/lukecq1231/nli

DecompAtt对应的论文为

A decomposable attention model for natural language inference, EMNLP2016

代码地址

https://github.com/erickrf/multiffn-nli

DGEM对应的论文为

Scitail: A textual entailment dataset from science question answering，aaai2018

代码地址

https://github.com/allenai/scitail

CAFE对应的论文为

A compare-propagate architecture with alignment factorization for natural language inference

CSRAN对应的论文为

Co-stack residual affinity networks with multi-level attention refinement for matching text sequences，EMNLP 2018

代码地址

https://github.com/vanzytay/EMNLP2018_NLI

OpenAI GPT对应的论文为

Improving language understanding by generative pre-training

代码地址

https://github.com/openai/finetune-transformer-lm

几种方法在答案检索和阅读理解中的效果对比如下

几种方法耗时对比如下

代码地址

https://github.com/vanzytay/NIPS2018_RCRN

本作品采用知识共享署名 4.0 国际许可协议进行许可

NIPS2018论文及代码集锦(12)(亮点: CNN正交正则;循环控制循环网络)

文章评论