An implementation and analysis of projected gradient descent (PGD) adversarial attacks on image classifiers. The attack iteratively perturbs an input to increase the classifier's loss while keeping the perturbation inside a bounded ε-ball around the original image, and it reliably fools undefended models. Training on these attack-generated examples (adversarial training) yields models that are robust to PGD with little loss of clean accuracy.
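The attack and the defense described above can be sketched in a few lines of PyTorch. This is a minimal illustration, not the project's actual code. It assumes an L∞ ε-ball, pixel values in [0, 1], a random start, and cross-entropy loss. The function names `pgd_attack` and `adversarial_training_step` and the default values of `eps`, `alpha`, and `steps` are hypothetical choices made for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Sketch of L-infinity PGD (assumed norm): ascend the loss, then project."""
    x = x.detach()
    # Random start inside the eps-ball (a common choice, assumed here).
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0.0, 1.0)

    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        # Gradient with respect to the input only; model parameters are untouched.
        (grad,) = torch.autograd.grad(loss, x_adv)

        # Signed-gradient ascent step on the loss.
        x_adv = x_adv.detach() + alpha * grad.sign()
        # Project back into the eps-ball around x, then into the valid pixel range.
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)

    return x_adv.detach()


def adversarial_training_step(model, optimizer, x, y, **pgd_kwargs):
    """One training step on PGD examples instead of clean inputs (hypothetical helper)."""
    model.eval()  # craft the attack with fixed batch-norm/dropout behaviour (a design choice)
    x_adv = pgd_attack(model, x, y, **pgd_kwargs)

    model.train()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    torch.manual_seed(0)
    # Toy stand-in for an image classifier: 3x8x8 inputs, 10 classes.
    model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 10))
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    x = torch.rand(4, 3, 8, 8)
    y = torch.randint(0, 10, (4,))

    x_adv = pgd_attack(model, x, y)
    print("max perturbation:", (x_adv - x).abs().max().item())
    print("adversarial loss:", adversarial_training_step(model, optimizer, x, y))
```

The projection step is what separates PGD from plain gradient ascent: after every update the perturbed image is clipped so that it never leaves the ε-ball or the valid pixel range. Switching the model to evaluation mode while crafting the attack is one common convention; some setups keep it in training mode instead.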
Technologies
- PyTorch
- High Performance Computing
- SLURM