# How to implement a neural network Intermezzo 2

This page is part of a 5 (+2) part tutorial on how to implement a simple neural network model.

## Softmax classification function

The previous intermezzo described how to classify between 2 classes with the help of the logistic function. For multiclass classification there is an extension of the logistic function, called the softmax function, which is used in multinomial logistic regression. The following section explains the softmax function and how to derive it.

## Softmax function

The logistic output function described in the previous intermezzo can only be used for classification between two target classes, $t=1$ and $t=0$. The logistic function can be generalized to output a multiclass categorical probability distribution by the softmax function. The softmax function $\varsigma$ takes as input a $C$-dimensional vector $\mathbf{z}$ and outputs a $C$-dimensional vector $\mathbf{y}$ of real values between $0$ and $1$. This function is a normalized exponential and is defined as:

$$
y_c = \varsigma(\mathbf{z})_c = \frac{e^{z_c}}{\sum_{d=1}^{C} e^{z_d}} \quad \text{for} \quad c = 1 \cdots C
$$

The denominator $\sum_{d=1}^{C} e^{z_d}$ acts as a normalizer that makes sure that $\sum_{c=1}^{C} y_c = 1$. As the output layer of a neural network, the softmax function can be represented graphically as a layer with $C$ neurons.
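As a rough illustration of the definition above, here is a minimal pure-Python sketch of the softmax function. The name `softmax` and the max-subtraction step are choices made for this sketch and are not taken from the text. Subtracting $\max(\mathbf{z})$ from every input leaves the output unchanged, because the common factor cancels between numerator and denominator, and it keeps the exponentials from overflowing.

```python
import math


def softmax(z):
    """Return the softmax of a sequence of real numbers as a list."""
    # Shift by the maximum for numerical stability (an addition of this
    # sketch); the common factor exp(-z_max) cancels in the ratio.
    z_max = max(z)
    exps = [math.exp(z_c - z_max) for z_c in z]
    total = sum(exps)  # the normalizing denominator
    return [e / total for e in exps]


y = softmax([1.0, 2.0, 3.0])
print(y)       # three values between 0 and 1, increasing with the input
print(sum(y))  # approximately 1
```

Each output lies between 0 and 1 and the outputs sum to 1, so the result can be read as a categorical probability distribution over the $C$ classes.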

We can write the probabilities that the class is $t=c$ for $c = 1 \ldots C$ given input $\mathbf{z}$ as:

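$$
\begin{bmatrix}
P(t=1 | \mathbf{z}) \\
\vdots \\
P(t=C | \mathbf{z})
\end{bmatrix}
=
\begin{bmatrix}
\varsigma(\mathbf{z})_1 \\
\vdots \\
\varsigma(\mathbf{z})_C
\end{bmatrix}
=
\frac{1}{\sum_{d=1}^{C} e^{z_d}}
\begin{bmatrix}
e^{z_1} \\
\vdots \\
e^{z_C}
\end{bmatrix}
$$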

Here $P(t=c | \mathbf{z})$ is the probability that the class is $c$ given the input $\mathbf{z}$.
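To connect this with the logistic function of the previous intermezzo, it may help to work out the two-class case $C=2$ explicitly. The notation $\sigma$ for the logistic function is an assumption of this sketch. Dividing numerator and denominator by $e^{z_1}$ gives:

$$
P(t=1 | \mathbf{z}) = \frac{e^{z_1}}{e^{z_1} + e^{z_2}} = \frac{1}{1 + e^{-(z_1 - z_2)}} = \sigma(z_1 - z_2),
\qquad
P(t=2 | \mathbf{z}) = 1 - P(t=1 | \mathbf{z}) = \sigma(z_2 - z_1)
$$

With two classes the softmax output therefore depends only on the difference $z_1 - z_2$, and it equals the logistic function evaluated at that difference.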

The output probability $P(t=1 | \mathbf{z})$ for an example system with 2 classes ($t=1$, $t=2$) and input $\mathbf{z} = [z_1, z_2]$ is shown in the figure below. The other probability, $P(t=2 | \mathbf{z})$, is complementary: $P(t=2 | \mathbf{z}) = 1 - P(t=1 | \mathbf{z})$.
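The two-class example can also be checked numerically. The sketch below is not the code behind the figure. It assumes a small hand-picked grid of input values and a hypothetical helper `softmax`, and it tabulates $P(t=1 | \mathbf{z})$ for each pair $(z_1, z_2)$.

```python
import math


def softmax(z):
    """Softmax of a sequence of real numbers (helper assumed for this sketch)."""
    z_max = max(z)  # shift for numerical stability; does not change the result
    exps = [math.exp(z_c - z_max) for z_c in z]
    total = sum(exps)
    return [e / total for e in exps]


# Hand-picked grid of input values (an assumption of this example).
grid = [-2.0, 0.0, 2.0]

# p_t1[(z1, z2)] holds P(t=1|z) for the input z = [z1, z2].
p_t1 = {}
for z1 in grid:
    for z2 in grid:
        p1, p2 = softmax([z1, z2])
        p_t1[(z1, z2)] = p1

# Print one row per z1 value, one column per z2 value.
for z1 in grid:
    row = "  ".join(f"{p_t1[(z1, z2)]:.3f}" for z2 in grid)
    print(f"z1={z1:+.1f}:  {row}")
```

Along the diagonal $z_1 = z_2$ the probability is exactly 0.5. It rises towards 1 as $z_1$ grows relative to $z_2$ and falls towards 0 in the opposite direction.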
