Hands-On Generative Adversarial Networks with Keras
上QQ阅读APP看书,第一时间看更新

Backpropagation

The backpropagation algorithm is a special case of reverse-mode automatic differentiation. In its basic modern version, the backpropagation algorithm has become the standard for training neural networks, possibly due to its underlying simplicity and relative power.

Inspired by the work of Donal Hebb and the so-called Hebb rule, Rosenblatt developed the idea of a perceptron that was based on the formation and changes of synapses between neurons, where the output of a neuron will be modeled as a weighted sum based on its incoming signals. This weighted sum is similar to what we described in the equation(3) in this chapter.

The basic idea of backpropagation is as follows: to define an error function and reiteratively compute the gradients of the loss with respect to the model weights and perform gradient descent and weight updates that are optimal for minimizing the error function given the current data and model weights. From a perspective of calculus, backpropagation is an algorithm that efficiently computes the chain rule with a specific order of operations.

The data and the error function are extremely important factors of the learning procedure. For example, consider a classifier that is trained to identify single handwritten digits. For training the model, suppose we will use the MNIST dataset, (http://yann.lecun.com/exdb/mnist/) which is comprises 28 by 28 monochromatic images of single handwritten white digits on a black canvas, such as the ones in following figure. Ideally, we want the model to predict 1 whenever the image looks such as a 1, 2 whenever it looks such as a 2, and so on:

Note that backpropagation does not define what aspects of the images should be considered, nor does it provide guarantees that the classifier will not memorize the data or that it will get generalize to unseen examples. A model trained with such data will fail if the colors are inverted, for instance, black digits on a white canvas. Naturally, one way to fix this issue will be to augment the data to include images with any foreground and background color combination. This example illustrates the importance of the training data and the objective function being used.