Cost Function

We need a cost function that compares the softmax output probabilities with the one-hot encoded target vector to measure their similarity. For this purpose, we employ cross-entropy. Cross-entropy is a distance measure computed between the probabilities estimated by the softmax function and the one-hot encoded target matrix: it is low when the predicted probability of the correct target class is high, and it grows as probability mass is assigned to the incorrect classes. A neural network is trained by passing an input through the model and comparing the predictions to the ground-truth labels; a loss function performs this comparison. Categorical cross-entropy loss is the loss function of choice for multiclass classification problems, but it requires the labels to be one-hot encoded. Sparse categorical cross-entropy loss is a useful alternative in this case: it computes the same loss as categorical cross-entropy but on integer targets rather than one-hot encoded targets, which eliminates the one-hot encoding step that is so common in TensorFlow/Keras models.
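
To make the distinction concrete, the following minimal sketch compares the two loss variants in TensorFlow/Keras (which the text already names); the probability and target values are illustrative assumptions, not values from the model described here:

```python
import tensorflow as tf

# Illustrative softmax outputs for a batch of 2 samples over 3 classes.
probs = tf.constant([[0.7, 0.2, 0.1],
                     [0.1, 0.8, 0.1]])

# Categorical cross-entropy expects one-hot encoded targets.
onehot_targets = tf.constant([[1.0, 0.0, 0.0],
                              [0.0, 1.0, 0.0]])
cce = tf.keras.losses.CategoricalCrossentropy()
print(cce(onehot_targets, probs).numpy())   # ~0.29

# Sparse categorical cross-entropy takes integer class indices directly,
# removing the one-hot encoding step.
int_targets = tf.constant([0, 1])
scce = tf.keras.losses.SparseCategoricalCrossentropy()
print(scce(int_targets, probs).numpy())     # same value as above
```

Both calls return the same mean loss; only the label format differs.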

In artificial neural networks, the softmax function is employed in a variety of multiclass classification methods. In multinomial logistic regression and linear discriminant analysis, the outputs of K distinct linear functions are taken as its input, and the predicted probability of the j-th class is calculated by Equation (9):

$$P(y = j \mid \mathbf{x}) = \frac{e^{\mathbf{x}^{T}\mathbf{w}_j}}{\sum_{k=1}^{K} e^{\mathbf{x}^{T}\mathbf{w}_k}} \tag{9}$$

where $\mathbf{x}$ is the sample vector, $\mathbf{w}_j$ is the weight vector of the $j$-th class, and $K$ is the number of classes.
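
As a sanity check on Equation (9), the short sketch below evaluates the softmax probabilities for a single sample; the dimensions, variable names, and random values are assumptions made purely for illustration:

```python
import numpy as np

def softmax_probability(x, W):
    """Equation (9): P(y = j | x) for every class j.

    x: sample vector of shape (d,)
    W: weight matrix of shape (d, K), column j holding w_j
    """
    logits = x @ W                  # x^T w_j for each class j
    logits = logits - logits.max()  # shift for numerical stability (result unchanged)
    exp_logits = np.exp(logits)
    return exp_logits / exp_logits.sum()

# Illustrative setup: 4 features, K = 3 classes.
rng = np.random.default_rng(0)
x = rng.normal(size=4)
W = rng.normal(size=(4, 3))
p = softmax_probability(x, W)
print(p, p.sum())  # K probabilities that sum to 1
```

Subtracting the maximum logit before exponentiating does not change the result but avoids overflow for large values of $\mathbf{x}^{T}\mathbf{w}_j$.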
