F3arwin

$$\theta_t+1 = \theta_t - \eta \nabla_\theta \frac1\mathcalP \textadv \sum \delta \in \mathcalP \textadv L(f \theta(x+\delta), y)$$

f3arwin significantly outperforms prior genetic attacks due to adaptive mutation and SBX crossover, which preserves high-fitness perturbation structures. Compared to Square Attack, f3arwin requires 11% fewer queries for a similar ASR. On VGG-16 (unseen during attack generation), f3arwin perturbations crafted on ResNet-50 achieved 68.3% ASR, vs. 51.2% for Square Attack and 59.7% for standard genetic attack. This suggests that evolutionary perturbations capture more model-agnostic features. 5.3 Defensive Robustness | Defense Method | Clean Acc. | Robust Acc. (PGD) | Robust Acc. (f3arwin attack) | |----------------|------------|------------------|-------------------------------| | Standard | 92.1% | 0.3% | 0.1% | | PGD-AT | 88.4% | 51.2% | 43.5% | | TRADES | 87.9% | 53.1% | 46.2% | | f3arwin defense | 89.2% | 54.8% | 58.9% |

Author: (Generated for academic demonstration) Affiliation: AI Robustness Lab Date: April 17, 2026 Abstract The vulnerability of deep neural networks (DNNs) to adversarial examples—inputs perturbed imperceptibly to induce misclassification—remains a critical challenge for deploying AI in security-sensitive domains. Existing defense mechanisms, such as adversarial training, often rely on static threat models or gradient-based attacks, which can be circumvented by black-box or evolutionary search methods. This paper introduces f3arwin (Fast Flexible Evolutionary Framework for Adversarial Robustness Without Input Normalization), a novel framework that leverages genetic algorithms (GAs) to generate diverse, transferable adversarial perturbations and simultaneously harden DNNs against them. Unlike gradient-based approaches, f3arwin operates in a black-box setting, requires no differentiability of the target model, and adapts its mutation and crossover operators dynamically. We evaluate f3arwin on CIFAR-10 and ImageNet subsets, achieving a success rate of 94.2% against undefended ResNet-50 models and improving adversarial robustness by 37% after evolutionary defensive distillation. The results demonstrate that evolutionary robustness strategies offer a complementary, query-efficient alternative to gradient-based defenses. 1. Introduction Adversarial examples exploit the linearity and non-robust features of DNNs (Goodfellow et al., 2015; Ilyas et al., 2019). While gradient-based attacks (e.g., FGSM, PGD) are common, they assume white-box access and differentiable loss surfaces. Real-world systems often obscure gradients, and defenses like gradient masking can thwart these attacks. Evolutionary algorithms (EAs) require only final model outputs (scores or labels), making them ideal for black-box adversarial generation. f3arwin

[5] Su, J., Vargas, D. V., & Sakurai, K. (2018). One pixel attack for fooling deep neural networks. IEEE Transactions on Evolutionary Computation .

[3] Ilyas, A., Engstrom, L., Athalye, A., & Lin, J. (2019). Black-box adversarial attacks with limited queries and information. ICML . | Robust Acc

[6] Zhang, H., Yu, Y., Jiao, J., Xing, E. P., Ghaoui, L. E., & Jordan, M. I. (2019). Theoretically principled trade-off between robustness and accuracy. ICML .

[4] Madry, A., Makelov, A., Schmidt, L., Tsipras, D., & Vladu, A. (2018). Towards deep learning models resistant to adversarial attacks. ICLR . M. I. (2019).

$$F(\delta) = \underbrace\mathbbI[f_\theta(x+\delta) \neq y] \cdot (1 - \textsoftmax(f_\theta(x+\delta)) y) \textMisclassification confidence - \lambda \cdot \frac\epsilon \sqrtd$$

f3arwin defense yields against its own evolutionary attack compared to PGD-AT, and also generalizes better to PGD (54.8% vs 51.2%). This demonstrates that co-evolving attacks and defenses leads to a more balanced robustness. 5.4 Query Efficiency over Generations f3arwin converges to successful adversarial examples in a median of 38 generations (≈ 2280 queries) compared to 68 generations for standard genetic attack. The adaptive mutation rate prevents premature convergence and reduces wasted queries on low-fitness regions. 6. Discussion Why does evolution help robustness? Standard adversarial training uses a fixed attack method, creating a "gradient-aligned" robust region. Evolutionary attacks explore non-gradient directions, revealing vulnerabilities that gradient-based methods miss. f3arwin defense then closes these gaps, producing a model robust to a wider class of perturbations.