Abstract
We study the problem of identifying the policy space available to an agent in a learning process, having access to a set of demonstrations generated by the agent playing the optimal policy in the considered policy space. We introduce an approach based on frequentist statistical testing to identify the parameters that the agent can control, within a larger parametric policy space. After presenting two identification rules (combinatorial and simplified), applicable under ...