Best of arXiv.org for AI, Machine Learning, and Deep Learning – May 2018

Print Friendly, PDF & Email

In this recurring monthly feature, we filter recent research papers appearing on the arXiv.org preprint server for compelling subjects relating to AI, machine learning and deep learning – from disciplines including statistics, mathematics and computer science – and provide you with a useful “best of” list for the past month. Researchers from all over the world contribute to this repository as a prelude to the peer review process for publication in traditional journals. arXiv contains a veritable treasure trove of learning methods you may use one day in the solution of data science problems. We hope to save you some time by picking out articles that represent the most promise for the typical data scientist. The articles listed below represent a fraction of all articles appearing on the preprint server. They are listed in no particular order with a link to each paper along with a brief overview. Especially relevant articles are marked with a “thumbs up” icon. Consider that these are academic research papers, typically geared toward graduate students, post docs, and seasoned professionals. They generally contain a high degree of mathematics so be prepared. Enjoy!

geomstats: a Python Package for Riemannian Geometry in Machine Learning

There is a growing interest in using Riemannian geometry in machine learning. This paper introduces geomstats, a python package that performs computations on manifolds such as hyperspheres, hyperbolic spaces, spaces of symmetric positive definite matrices and Lie groups of transformations. Also provided is efficient and extensively unit-tested implementations of these manifolds, together with useful Riemannian metrics and associated Exponential and Logarithm maps. The corresponding geodesic distances provide a range of intuitive choices of Machine Learning loss functions. The authors also give the corresponding Riemannian gradients. The operations implemented in geomstats are available with different computing backends such as numpy, tensorflow and keras. The authors have enabled GPU implementation and integrated geomstats manifold computations into keras deep learning framework. This paper also presents a review of manifolds in machine learning and an overview of the geomstats package with examples demonstrating its use for efficient and user-friendly Riemannian geometry.

iVQA: Inverse Visual Question Answering

This paper proposes the inverse problem of Visual question answering (iVQA), and explore its suitability as a benchmark for visuo-linguistic understanding. The iVQA task is to generate a question that corresponds to a given image and answer pair. Since the answers are less informative than the questions, and the questions have less learnable bias, an iVQA model needs to better understand the image to be successful than a VQA model. The authors pose question generation as a multi-modal dynamic inference process and propose an iVQA model that can gradually adjust its focus of attention guided by both a partially generated question and the answer. For evaluation, apart from existing linguistic metrics, we propose a new ranking metric. This metric compares the ground truth question’s rank among a list of distractors, which allows the drawbacks of different algorithms and sources of error to be studied. Experimental results show that this model can generate diverse, grammatically correct and content correlated questions that match the given answer.

A Spline Theory of Deep Networks

This paper builds a rigorous bridge between deep networks (DNs) and approximation theory via spline functions and operators. The key result is that a large class of DNs can be written as a composition of max-affine spline operators (MASOs), which provide a powerful portal through which to view and analyze their inner workings. For instance, conditioned on the input signal, the output of a MASO DN can be written as a simple affine transformation of the input. This implies that a DN constructs a set of signal-dependent, class-specific templates against which the signal is compared via a simple inner product; we explore the links to the classical theory of optimal classification via matched filters and the effects of data memorization.

The unreasonable effectiveness of the forget gate

Given the success of the gated recurrent unit, a natural question is whether all the gates of the long short-term memory (LSTM) network are necessary. Previous research has shown that the forget gate is one of the most important gates in the LSTM. This paper shows that a forget-gate-only version of the LSTM with chrono-initialized biases, not only provides computational savings but outperforms the standard LSTM on multiple benchmark datasets and competes with some of the best contemporary models. The proposed network, the JANET, achieves accuracies of 99% and 92.5% on the MNIST and pMNIST data sets, outperforming the standard LSTM which yields accuracies of 98.5% and 91%.

Spectral Normalization for Generative Adversarial Networks

One of the challenges in the study of generative adversarial networks is the instability of its training. This paper proposes a novel weight normalization technique called spectral normalization to stabilize the training of the discriminator. A new normalization technique is computationally light and easy to incorporate into existing implementations. The authors tested the efficacy of spectral normalization on CIFAR10, STL-10, and ILSVRC2012 data set, and experimentally confirmed that spectrally normalized GANs (SN-GANs) is capable of generating images of better or equal quality relative to the previous training stabilization techniques.

 

Sign up for the free insideBIGDATA newsletter.

Speak Your Mind

*