Policy Iteration for Learning an Exercise Policy for American Options
Authors
Abstract
Options are important financial instruments whose prices are usually determined by computational methods. Computational finance is a compelling application area for reinforcement learning research, where hard sequential decision-making problems abound and have great practical significance. In this paper, we investigate reinforcement learning methods, in particular least squares policy iteration (LSPI), for the problem of learning an exercise policy for American options. We also investigate TVR, another policy iteration method. We compare LSPI and TVR with LSM, the standard least squares Monte Carlo method from the finance community, and evaluate their performance on both real and synthetic data. The results show that the exercise policies discovered by LSPI and TVR gain larger payoffs than those discovered by LSM, on both real and synthetic data. Furthermore, for LSPI, TVR, and LSM, policies learned from real data generally gain larger payoffs than policies learned from simulated samples. Our work shows that solution methods developed in reinforcement learning can advance the state of the art in an important and challenging application area, and demonstrates furthermore that computational finance remains an under-explored area for the deployment of reinforcement learning methods.
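To make the LSM baseline concrete, the following is a minimal sketch of the Longstaff–Schwartz least squares Monte Carlo method for pricing an American put. All parameters (spot, strike, rate, volatility, the quadratic polynomial basis) are illustrative assumptions, not values taken from the paper; the paper's contribution is to replace this backward-induction regression with policy iteration methods such as LSPI.

```python
import numpy as np

def lsm_american_put(s0=36.0, strike=40.0, r=0.06, sigma=0.2,
                     T=1.0, n_steps=50, n_paths=20000, seed=0):
    """Sketch of Longstaff-Schwartz LSM; parameters are illustrative."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    # Simulate geometric Brownian motion price paths under the risk-neutral measure.
    z = rng.standard_normal((n_paths, n_steps))
    log_paths = np.cumsum((r - 0.5 * sigma**2) * dt
                          + sigma * np.sqrt(dt) * z, axis=1)
    s = s0 * np.exp(log_paths)

    # Initialize with the payoff from exercising only at maturity.
    cash = np.maximum(strike - s[:, -1], 0.0)
    tau = np.full(n_paths, n_steps - 1)  # exercise step chosen on each path

    # Backward induction: regress discounted future cash flows on basis
    # functions of the current price, using in-the-money paths only.
    for t in range(n_steps - 2, -1, -1):
        itm = strike - s[:, t] > 0
        if not itm.any():
            continue
        x = s[itm, t]
        y = cash[itm] * np.exp(-r * dt * (tau[itm] - t))
        # Quadratic polynomial basis (1, x, x^2), a common simple choice.
        coef = np.polyfit(x, y, 2)
        continuation = np.polyval(coef, x)
        exercise = strike - x
        early = exercise > continuation  # exercise where immediate payoff wins
        idx = np.where(itm)[0][early]
        cash[idx] = exercise[early]
        tau[idx] = t

    # Discount each path's cash flow back to time zero and average.
    return float(np.mean(cash * np.exp(-r * dt * (tau + 1))))

price = lsm_american_put()
print(f"Estimated American put price: {price:.2f}")
```

The exercise policy implicit in this procedure is "exercise whenever the immediate payoff exceeds the regression-estimated continuation value"; LSPI and TVR instead learn such a policy by iterating over value-function approximations.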
Related Work
Learning an Exercise Policy for American Options from Real Data
We study approaches to learning an exercise policy for American options directly from real data. We investigate an approximate policy iteration method, namely, least squares policy iteration (LSPI), for the problem of pricing American options. We also extend the standard least squares Monte Carlo (LSM) method of Longstaff and Schwartz, by composing sample paths from real data. We test the perfo...
Learning Exercise Policies for American Options
Options are important instruments in modern finance. In this paper, we investigate reinforcement learning (RL) methods, in particular least-squares policy iteration (LSPI), for the problem of learning exercise policies for American options. We develop finite-time bounds on the performance of the policy obtained with LSPI and compare LSPI and the fitted Q-iteration algorithm (FQI) with the Longs...
Policy iteration for American options: overview
This paper is an overview of recent results by Kolodko and Schoenmakers (2006) and Bender and Schoenmakers (2006) on the evaluation of options with early exercise opportunities via policy improvement. Stability is discussed, and simulation results based on plain Monte Carlo estimators for conditional expectations are presented.
Learning Robust Options
Robust reinforcement learning aims to produce policies with strong guarantees even in the face of environments or transition models whose parameters are highly uncertain. Existing work uses value-based methods and the usual primitive action setting. In this paper, we propose robust methods for learning temporally abstract actions, in the framework of options. We present a Robust Options Po...
Enhanced policy iteration for American options via scenario selection
In Kolodko & Schoenmakers [9] and Bender & Schoenmakers [3], a policy iteration was introduced that makes it possible to achieve tight lower approximations of the price of early exercise options via a nested Monte Carlo simulation in a Markovian setting. In this paper we enhance the algorithm with a scenario selection method. It is demonstrated by numerical examples that the scenario selection can signifi...
Journal:
Volume, Issue:
Pages: -
Publication date: 2008