Deep Learning Made Easier by Linear Transformations in Perceptrons

Lecturer: Tapani Raiko
Event type: HIIT seminar
Event time: 2012-06-18 13:15 to 14:00

Abstract
We transform the outputs of each hidden neuron in a multi-layer perceptron network to have zero output and zero slope on average, and use separate shortcut connections to model the linear dependencies instead. This transformation aims at separating the problems of learning the linear and nonlinear parts of the whole input-output mapping, which has many benefits. We study the theoretical properties of the transformations by noting that they make the Fisher information matrix closer to a diagonal matrix, and thus the standard gradient closer to the natural gradient. We experimentally confirm the usefulness of the transformations by noting that they make basic stochastic gradient learning competitive with state-of-the-art learning algorithms in speed, and that they also seem to help find solutions that generalize better. The experiments include both classification of small images and learning a low-dimensional representation for images by using a deep unsupervised auto-encoder network. The transformations were beneficial in all cases, with and without regularization and with networks from two to five hidden layers.
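To make the transformation concrete, below is a minimal NumPy sketch of the idea, not the authors' implementation: each hidden unit computes tanh(z) + alpha*z + beta, with alpha and beta chosen so that the unit's output and slope are zero on average over the data, while a separate shortcut weight matrix (here called C) carries the linear part of the mapping directly from input to output. The variable names, shapes, and the one-shot computation of alpha and beta (instead of adapting them during learning) are assumptions made for illustration only.

import numpy as np

rng = np.random.default_rng(0)

# Toy data: N inputs of dimension d_in, targets of dimension d_out (hypothetical shapes).
N, d_in, d_hid, d_out = 256, 20, 30, 5
X = rng.standard_normal((N, d_in))

# Weights: A maps input -> hidden, B maps hidden -> output,
# C is the separate shortcut (linear) connection from input to output.
A = 0.1 * rng.standard_normal((d_in, d_hid))
B = 0.1 * rng.standard_normal((d_hid, d_out))
C = 0.1 * rng.standard_normal((d_in, d_out))

def transformed_hidden(Z, alpha, beta):
    # Each hidden unit computes tanh(z) + alpha*z + beta; alpha and beta
    # are chosen so that the unit's output and slope are zero on average.
    return np.tanh(Z) + alpha * Z + beta

# Pre-activations and the per-unit transformation coefficients.
Z = X @ A
alpha = -(1.0 - np.tanh(Z) ** 2).mean(axis=0)      # cancels the average slope of tanh
beta = -(np.tanh(Z) + alpha * Z).mean(axis=0)      # cancels the average output

H = transformed_hidden(Z, alpha, beta)
Y = H @ B + X @ C   # nonlinear part plus the shortcut modelling the linear dependencies

print("max |mean hidden output|:", np.abs(H.mean(axis=0)).max())
print("max |mean hidden slope| :",
      np.abs(((1.0 - np.tanh(Z) ** 2) + alpha).mean(axis=0)).max())

In the sketch the shortcut term X @ C absorbs the linear input-output dependencies, so the weights A and B only need to model the nonlinear residual, which is the separation of learning problems referred to in the abstract.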

Joint work with Harri Valpola and Yann LeCun.

Bio: Tapani Raiko is currently a postdoctoral fellow at the Department of Information and Computer Science at Aalto University.

