Stanford DAWNBench - CIFAR10
Goal: Achieve 94% validation accuracy on the CIFAR10 dataset in under 100 secs on a V100 GPU.
Challenges:
- Due to lack of infrastructure, Google Colab was the only option available. To approximate the same challenge on Colab's K80 GPU, the target is 94% validation accuracy in 600 secs.
Experiments:
Experiment 1: Trained a custom-made ResNet9 model using tensorflow.keras.
Results: Validation Accuracy: 92.2% Time: 1493 secs
Experiment 2: Added a slanted one-cycle learning rate schedule with a gradual drop towards the end.
Results: Validation Accuracy: 92.6% Time: 1502 secs
Experiment 3: Added image augmentation: FlipLR, RandomPadCrop (padding of 4) and Cutout (16x16).
Results: Validation Accuracy: 93.8% Time: 1627 secs
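The three augmentations from Experiment 3 could be sketched in NumPy roughly as below. This is an illustration, not the code used in the experiments: the function names, the reflect padding mode, the zero fill for Cutout, and keeping the Cutout square fully inside the image are all assumptions.

```python
import numpy as np

def flip_lr(img, rng):
    """Flip an HxWxC image left-right with probability 0.5."""
    return img[:, ::-1, :] if rng.random() < 0.5 else img

def random_pad_crop(img, rng, pad=4):
    """Pad each side by `pad` pixels, then crop back to the original size at a random offset."""
    h, w, _ = img.shape
    # Reflect padding is an assumption; zero padding is another common choice.
    padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")
    top = rng.integers(0, 2 * pad + 1)
    left = rng.integers(0, 2 * pad + 1)
    return padded[top:top + h, left:left + w, :]

def cutout(img, rng, size=16):
    """Zero out a size x size square placed at random, fully inside the image (assumption)."""
    h, w, _ = img.shape
    top = rng.integers(0, h - size + 1)
    left = rng.integers(0, w - size + 1)
    out = img.copy()  # leave the input image untouched
    out[top:top + size, left:left + size, :] = 0
    return out

def augment(img, rng):
    """Apply FlipLR, RandomPadCrop and Cutout in sequence."""
    return cutout(random_pad_crop(flip_lr(img, rng), rng), rng)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((32, 32, 3)).astype(np.float32)  # stand-in for one CIFAR10 image
    print(augment(image, rng).shape)
```

Running these per image on the CPU at training time adds input-pipeline work on top of each training step.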
Experiment 4: Built an input pipeline using TFRecords and enabled prefetching so that the CPU and GPU work in parallel.
Results: Validation Accuracy: 93.8% Time: 741 secs
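A minimal sketch of a TFRecord pipeline with prefetching, assuming TensorFlow 2.4 or later (for tf.data.AUTOTUNE). The feature names "image" and "label", the helper names, and the shuffle buffer size are hypothetical and not taken from the experiments.

```python
import numpy as np
import tensorflow as tf

def write_tfrecord(path, images, labels):
    """Serialize uint8 images of shape (N, 32, 32, 3) and integer labels to one TFRecord file."""
    with tf.io.TFRecordWriter(path) as writer:
        for img, label in zip(images, labels):
            feature = {
                "image": tf.train.Feature(bytes_list=tf.train.BytesList(value=[img.tobytes()])),
                "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[int(label)])),
            }
            example = tf.train.Example(features=tf.train.Features(feature=feature))
            writer.write(example.SerializeToString())

def parse_example(serialized):
    """Decode one serialized example back into a float image in [0, 1] and its label."""
    spec = {
        "image": tf.io.FixedLenFeature([], tf.string),
        "label": tf.io.FixedLenFeature([], tf.int64),
    }
    parsed = tf.io.parse_single_example(serialized, spec)
    image = tf.reshape(tf.io.decode_raw(parsed["image"], tf.uint8), (32, 32, 3))
    image = tf.cast(image, tf.float32) / 255.0
    return image, parsed["label"]

def make_dataset(path, batch_size=512, shuffle_buffer=10000):
    """Read, parse, shuffle and batch; prefetch lets the CPU prepare the next batch during training."""
    ds = tf.data.TFRecordDataset(path)
    ds = ds.map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
    ds = ds.shuffle(shuffle_buffer).batch(batch_size)
    return ds.prefetch(tf.data.AUTOTUNE)
```

A dataset built this way can be passed directly to model.fit. The final prefetch call is what overlaps data preparation on the CPU with training on the GPU.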
Experiment 5: Augmented the data ahead of training and stored the augmented images in TFRecords.
Results: Validation Accuracy: 93.8% Time: 602 secs
Details:
- Batch size: 512
- Total Parameters: 8.9M (check params)
- Learning rate: MaxLR = 0.4 at epoch 5, MinLR = 0.001 at epoch 20, then a gradual drop to 0.0001 at epoch 24
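
The learning rate breakpoints listed above can be turned into a schedule with a short function. This is a sketch that assumes linear interpolation between the breakpoints. The starting value start_lr is hypothetical, since the learning rate at epoch 0 is not stated here.

```python
def one_cycle_lr(epoch, *, start_lr=0.04, max_lr=0.4, min_lr=0.001, final_lr=0.0001,
                 peak_epoch=5, min_epoch=20, total_epochs=24):
    """Piecewise-linear learning rate for a (possibly fractional) epoch."""
    # (epoch, lr) breakpoints; start_lr is an assumption, the rest are the values listed above.
    points = [(0, start_lr), (peak_epoch, max_lr), (min_epoch, min_lr), (total_epochs, final_lr)]
    epoch = min(max(epoch, 0), total_epochs)  # clamp to the training range
    for (e0, lr0), (e1, lr1) in zip(points, points[1:]):
        if e0 <= epoch <= e1:
            frac = (epoch - e0) / (e1 - e0)
            return lr0 + frac * (lr1 - lr0)
    return final_lr

if __name__ == "__main__":
    for e in (0, 5, 12.5, 20, 24):
        print(e, round(one_cycle_lr(e), 6))
```

Passing a fractional epoch (for example, epoch plus the fraction of batches completed) gives a per-batch update instead of a per-epoch step.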
Results:
- Validation Accuracy: 93.80%
- Time: 602 secs
- Epochs: 24