Learning to Evade Static PE Machine Learning Malware Models via Reinforcement Learning
We frame evasion of static PE malware classifiers as a reinforcement learning problem: agents learn sequences of functionality-preserving modifications (section padding, import injection, header tweaks) that flip a classifier’s decision.
arXiv:1801.08917 · Code: MalwareRL
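
To make the reinforcement learning framing concrete, here is a minimal toy sketch of the episode loop. It is not the paper's environment or the MalwareRL API. It never touches a real binary: the state is a made-up three-number feature vector, the classifier is a hand-written scoring function, and the action names, their effects, the threshold, and the turn limit are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical action names, loosely mirroring the modification families above.
ACTIONS = ("pad_section", "add_import", "tweak_header")

# Assumed effect of each action on a toy feature vector
# (padding amount, import count, header flag). Purely illustrative.
EFFECTS = {
    "pad_section": (1.0, 0.0, 0.0),
    "add_import": (0.0, 1.0, 0.0),
    "tweak_header": (0.0, 0.0, 1.0),
}


def toy_score(features):
    """Stand-in for a static classifier: a 'malicious' score clipped to [0, 1]."""
    pad, imports, header = features
    raw = 0.9 - 0.1 * pad - 0.15 * imports - 0.05 * header
    return max(0.0, min(1.0, raw))


@dataclass
class ToyEvasionEnv:
    threshold: float = 0.5   # score below this counts as "benign" (assumed)
    max_turns: int = 10      # episode budget (assumed)
    features: list = field(default_factory=lambda: [0.0, 0.0, 0.0])
    turns: int = 0

    def reset(self):
        """Start a new episode from the unmodified toy sample."""
        self.features = [0.0, 0.0, 0.0]
        self.turns = 0
        return tuple(self.features)

    def step(self, action):
        """Apply one modification; reward 1.0 only when the decision flips."""
        delta = EFFECTS[action]
        self.features = [f + d for f, d in zip(self.features, delta)]
        self.turns += 1
        evaded = toy_score(self.features) < self.threshold
        done = evaded or self.turns >= self.max_turns
        return tuple(self.features), (1.0 if evaded else 0.0), done


def run_episode(env, policy):
    """Roll out one episode; return total reward and the actions taken."""
    state = env.reset()
    total, done, trace = 0.0, False, []
    while not done:
        action = policy(state)
        state, reward, done = env.step(action)
        trace.append(action)
        total += reward
    return total, trace
```

A learning agent would replace the fixed `policy` callable with one updated from the observed rewards. The sparse reward (paid only when the score crosses the threshold) and the turn budget are simplifications for this sketch.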