Argmax and SoftMax

1 min read · Jul 27, 2021

This article explains how the Argmax and SoftMax functions work.

The raw values produced by a neural network's output nodes are not restricted to the range 0 to 1; they can be greater than 1 or less than 0. Unbounded values like these are hard to interpret and compare directly, which can make the model harder to train and evaluate. The Argmax and SoftMax functions both map the raw outputs to values between 0 and 1.

The Argmax function sets the largest output value to 1 and all other values to 0, whether or not the largest value is positive. The result is a hard, all-or-nothing decision that throws away how close the other outputs were. This makes it useful for testing, where we only need to check the final prediction and not the relationship between the inputs and the different outputs/labels.
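As a rough illustration of the behaviour described above, here is a minimal Python sketch. The function name `argmax_one_hot` and the sample values are made up for this example and do not come from any particular library.

```python
def argmax_one_hot(values):
    """Return 1 at the position of the largest value and 0 everywhere else."""
    # Index of the largest value (the first one wins if there is a tie).
    best = max(range(len(values)), key=lambda i: values[i])
    return [1 if i == best else 0 for i in range(len(values))]


# Hypothetical raw outputs from three output nodes.
raw_outputs = [1.43, -0.40, 0.23]
one_hot = argmax_one_hot(raw_outputs)
print(one_hot)  # [1, 0, 0]
```

Note that the result would be the same for `[100.0, -0.40, 0.23]`: Argmax keeps only the position of the winner, not how far ahead it was.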

The SoftMax function uses the formula

$$\text{SoftMax}(z_i) = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}$$

where $z_i$ is the raw output of node $i$ and $K$ is the number of output nodes. With this formula, a predicted probability is calculated for each output value. Every probability lies between 0 and 1, and together they add up to 1. The output with the highest probability is taken as the final prediction, but the smaller probabilities are kept as well, which preserves the relationship between the inputs and outputs. As a result, this function is useful for training.
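To make this concrete, here is a minimal Python sketch. It assumes the standard SoftMax definition: each raw output is exponentiated and then divided by the sum of all the exponentials. The sample values are made up, and subtracting the maximum before exponentiating is a common numerical-stability trick that does not change the result.

```python
import math


def softmax(values):
    """Turn raw output values into probabilities that add up to 1."""
    # Subtracting the largest value avoids overflow in exp() and
    # leaves the final probabilities unchanged.
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw outputs from three output nodes.
raw_outputs = [1.43, -0.40, 0.23]
probs = softmax(raw_outputs)
print([round(p, 2) for p in probs])  # roughly [0.68, 0.11, 0.21]
```

The first output still wins, but unlike Argmax the other two keep non-zero probabilities that reflect how they compare with each other.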

I hope I was able to explain the Argmax and SoftMax functions in the simplest terms possible. Thank you very much!


Written by Shashwat Agarwal
