This is a PyTorch implementation of "Training deep neural networks via direct loss minimization", published at ICML 2016. The implementation targets the 0-1 loss.
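As a brief, unofficial recap of the method (my summary, not part of the repository): the paper estimates the gradient of the expected task loss with a finite difference between two inference problems,

$$\nabla_w \mathbb{E}\left[L(y, y_w)\right] \approx \pm\frac{1}{\epsilon}\left(\nabla_w F(x, y_{\mathrm{direct}}, w) - \nabla_w F(x, y_w, w)\right),$$

where $F(x, y, w)$ is the network's score, $y_w = \arg\max_y F(x, y, w)$ is the standard prediction, and $y_{\mathrm{direct}} = \arg\max_y \left[F(x, y, w) \pm \epsilon\, L(y, y_{\mathrm{gt}})\right]$ is loss-augmented inference. For the 0-1 loss, $L(y, y_{\mathrm{gt}}) = \mathbf{1}[y \neq y_{\mathrm{gt}}]$.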
The repository consists of three script files:

- `main.py`: a demonstration that trains on MNIST with the 0-1 loss
- `ConvNet.py`: a class defining the architecture of the model used
- `utils.py`: contains the function used to estimate the gradient
One can run the demonstration in `main.py` by copying the command at the top of the script and modifying it as needed (e.g. the location to save checkpoints). Here are the results I got when training on MNIST for 100 epochs.
Figure 2. Testing results evaluated at each epoch: (top) cross-entropy loss, and (bottom) prediction accuracy.
If you want to estimate the gradient of the 0-1 loss and integrate it into your own code, import the `grad_estimation` function from `utils.py`.
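For orientation, here is a minimal PyTorch sketch of what such an estimator can look like for the 0-1 loss (positive-update variant). The function name, signature, and epsilon handling below are illustrative assumptions and may not match the actual `grad_estimation` in `utils.py`:

```python
import torch
import torch.nn.functional as F

def grad_estimation_sketch(logits, targets, epsilon=1.0):
    """Illustrative direct-loss gradient estimator for the 0-1 loss.

    NOTE: a hypothetical sketch; the repository's `grad_estimation`
    may use a different signature and update variant.
    logits:  (batch, num_classes) raw scores from the network.
    targets: (batch,) ground-truth class indices.
    """
    with torch.no_grad():
        # 0-1 loss of predicting class c: 1 if c != target, else 0.
        zero_one = 1.0 - F.one_hot(targets, num_classes=logits.size(1)).float()
        # Loss-augmented inference (positive update): argmax of score + eps * loss.
        y_direct = (logits + epsilon * zero_one).argmax(dim=1)
        # Standard inference: the model's current prediction.
        y_pred = logits.argmax(dim=1)
    # Surrogate objective whose gradient w.r.t. the logits (and hence the
    # weights, via backprop) is (1/eps) * [d score(x, y_direct) - d score(x, y_pred)].
    score_direct = logits.gather(1, y_direct.unsqueeze(1)).squeeze(1)
    score_pred = logits.gather(1, y_pred.unsqueeze(1)).squeeze(1)
    return (score_direct - score_pred).mean() / epsilon
```

The returned scalar is used like an ordinary loss: call `.backward()` on it and step the optimizer. When `y_direct == y_pred` the surrogate (and its gradient) vanishes, matching the intuition that no update is needed where loss-augmented inference agrees with the model's prediction.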