Abstract
This paper considers an important class of Stackelberg security problems in which defenders and attackers have incomplete information at each stage about the value of the current state. The inability to observe the exact state is motivated by the fact that the state variables of the defenders and attackers cannot be measured exactly. Most existing approaches for solving Stackelberg security games provide no guarantees when the estimated model is inaccurate.
To address this drawback, this paper makes several contributions. First, it provides a novel solution for computing Stackelberg security games for multiple players, considering finite resource allocation in domains with incomplete information. The new model is restricted to a partially observable Markov model. Second, we propose a two-step iterative proximal/gradient method for computing the Stackelberg equilibrium of the security game: in each step, the algorithm solves an independent convex nonlinear programming problem implementing a regularized penalty approach. Regularization ensures convergence to a single (unique) equilibrium among the possible equilibria of the Stackelberg game. To make the problem computationally tractable, we define the -variable method for partially observable Markov games. Third, we show by simulation that the proposed model overcomes the disadvantages of previous Stackelberg security game solvers. Finally, we present a new random walk method based on partial information. A numerical example on protecting airport terminals illustrates the effectiveness of the proposed method, presenting the resulting patrolling strategies and two different realizations of the Stackelberg security game obtained with the partially observable random walk algorithm.