Repeated Stackelberg security games: Learning with incomplete state information

Anáhuac author(s)
Guillermo Alcántara-Jiménez
Year of publication
2020
Journal or Publisher
Reliability Engineering and System Safety

Abstract
Existing applications of Stackelberg Security Games (SSGs) have made use of Reinforcement Learning (RL) approaches to learn and adapt defender and attacker behavior. The learning process is represented by randomized strategies for the defenders played against adversarial strategies of the attackers, both of whom acquire feedback on their strategies by observing which target was defended or attacked. However, most existing RL models for SSGs rest on strong assumptions, including that the defenders and attackers have perfect information about the behavioral model, which produces inconsistencies.
We address these problems by proposing a practical framework for representing real-world security problems, empowering SSGs with an RL approach that accounts for incomplete state information. The players' behavior and rationality are restricted to a class of partially observed Markov games (POMGs). We develop an algorithm that considers randomized strategies for both defenders and attackers, who obtain feedback on their partially observed states. We propose adaptive rules for computing the estimated transition matrices and utilities that take into account the number of unobserved experiences in the game. Furthermore, we study the convergence of the estimated transition matrices and utilities in SSGs. For the realization of the SSG, we propose a new partially observed random walk technique for randomizing the scheduling of patrol planning. The results are applied to security games between defenders and attackers, where the noncooperative behaviors are well characterized by the features of the learning process in Stackelberg games.
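To give a concrete flavor of estimating transition matrices from partially observed play, the sketch below builds a count-based estimate in which unobserved steps are skipped and a uniform Dirichlet prior keeps rows well-defined for states with few observed experiences. This is a minimal illustration under assumed names and data shapes, not the paper's actual adaptive rule.

```python
import numpy as np

def estimate_transitions(episodes, n_states, prior=1.0):
    """Count-based transition-matrix estimate under partial observation.

    `episodes` is a list of state sequences in which unobserved steps are
    marked as None. Transitions touching an unobserved state are skipped,
    and `prior` pseudo-counts (a uniform Dirichlet prior) keep every row a
    valid probability distribution even with few observed experiences.
    (Illustrative sketch only; the function name and API are assumptions.)
    """
    counts = np.full((n_states, n_states), prior)
    for seq in episodes:
        for s, s_next in zip(seq, seq[1:]):
            if s is not None and s_next is not None:  # fully observed step
                counts[s, s_next] += 1
    # Normalize each row to obtain the estimated transition probabilities
    return counts / counts.sum(axis=1, keepdims=True)

# Example: two patrol targets, with some steps unobserved (None)
episodes = [[0, 1, None, 1, 0], [1, 1, 0, None, 0]]
P = estimate_transitions(episodes, n_states=2)
```

As more steps of a state are observed, the prior's influence on that row fades, which is one simple way the estimate can adapt to the number of unobserved experiences.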