Search results for: online decision problem
Number of results: 1,388,513
We consider a Markov decision process (MDP) setting in which the reward function is allowed to change after each time step (possibly in an adversarial manner), yet the dynamics remain fixed. Similar to the experts setting, we address the question of how well an agent can do when compared to the reward achieved under the best stationary policy over time. We provide efficient algorithms, which ha...
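For context, the performance criterion implicit in this abstract (reward relative to the best stationary policy in hindsight) is usually written as below; this is the generic online-MDP regret formulation, not necessarily the paper's exact definition.

```latex
% Regret against the best stationary policy in an online MDP.
% r_t is the reward function at step t (it may change adversarially, while
% the dynamics stay fixed); s_t^\pi is the state reached by always following
% the stationary policy \pi, and (s_t, a_t) is the learner's trajectory.
\[
  \mathrm{Regret}_T \;=\;
    \max_{\pi \in \Pi_{\mathrm{stat}}}
      \mathbb{E}\Bigl[\sum_{t=1}^{T} r_t\bigl(s_t^{\pi}, \pi(s_t^{\pi})\bigr)\Bigr]
    \;-\;
    \mathbb{E}\Bigl[\sum_{t=1}^{T} r_t(s_t, a_t)\Bigr].
\]
```

An algorithm is Hannan-consistent (a term used in the submodular abstract further down) when this quantity grows sublinearly in T.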
Optimal adaptive leader-follower consensus of linear multi-agent systems: known and unknown dynamics
In this paper, the optimal adaptive leader-follower consensus of linear continuous-time multi-agent systems is considered. The error dynamics of each player depend on its neighbors' information. A detailed analysis of online optimal leader-follower consensus under known and unknown dynamics is presented. The introduced reinforcement learning-based algorithms learn online the approximate solution...
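For readers unfamiliar with this setup, the neighbor-dependent error mentioned above is commonly built from a local neighborhood tracking error of the following form; this is a standard formulation in the cooperative-control literature, shown only for orientation, and the paper's exact definition may differ.

```latex
% Local neighborhood tracking error of agent i in leader-follower consensus.
% a_{ij} are adjacency weights, N_i is the neighbor set of agent i,
% x_0 is the leader state, and g_i > 0 only for agents pinned to the leader.
\[
  e_i \;=\; \sum_{j \in \mathcal{N}_i} a_{ij}\,\bigl(x_i - x_j\bigr)
          \;+\; g_i\,\bigl(x_i - x_0\bigr).
\]
```

Leader-follower consensus is achieved when every e_i is driven to zero.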
This study develops a Web-based collaborative system framework based on a multiple perspective approach. This approach is a recent decision support system (DSS) paradigm proposed by Courtney [Decis. Support Syst. 31 (2001) 17] for knowledge management of and decision making about a special organizational problem. The framework consists of four main components. The first component is a group decision-maki...
Users of E-Sales platforms typically face the problem of choosing the most suitable product or service from large and potentially complex assortments. Whereas the problem of finding and presenting suitable items fulfilling the user's requirements can be tackled by providing additional support in the form of recommender and configuration systems, the control of psychological side effects resulting...
Online mechanism design (MD) considers the problem of providing incentives to implement desired system-wide outcomes in systems with self-interested agents that arrive and depart dynamically. Agents can choose to misrepresent their arrival and departure times, in addition to information about their value for different outcomes. We consider the problem of maximizing the total long-term value of t...
In this paper, we propose a Markov Decision Process model for an empty repositioning problem in a two-port system. We consider two cases. The first case is the offline case, where demand information is assumed to be a random variable with a known distribution. The second case is the online case, where demand information is only partially known. In both cases, we derive the optimal control policies. T...
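To make the offline case concrete, here is a minimal value-iteration sketch for a toy two-port repositioning model. The state space, demand distribution, and cost structure are illustrative assumptions, not the model analyzed in the paper.

```python
# Hypothetical toy version of the offline case: value iteration for a
# two-port empty-container repositioning MDP with a known demand distribution.
# All names and parameters (capacity, costs, demand law, one-sided demand)
# are illustrative assumptions.

CAPACITY = 5                        # total empty containers in the system
STATES = range(CAPACITY + 1)        # state = number of empties at port A
DEMAND = {0: 0.3, 1: 0.5, 2: 0.2}   # known demand distribution at port A
MOVE_COST = 1.0                     # cost per container repositioned
SHORTAGE_COST = 4.0                 # cost per unit of unmet demand at port A
GAMMA = 0.9                         # discount factor

def feasible_actions(s):
    """Reposition a containers from B to A (a < 0 repositions A to B)."""
    return range(-s, CAPACITY - s + 1)

def expected_cost_to_go(s, a, V):
    """One-step expected cost plus discounted value; served boxes move to port B."""
    q = 0.0
    for d, p in DEMAND.items():
        s_mid = s + a                       # empties at A after repositioning
        served = min(s_mid, d)
        step_cost = MOVE_COST * abs(a) + SHORTAGE_COST * (d - served)
        q += p * (step_cost + GAMMA * V[s_mid - served])
    return q

# Value iteration (offline case: the demand distribution is known).
V = {s: 0.0 for s in STATES}
for _ in range(200):
    V = {s: min(expected_cost_to_go(s, a, V) for a in feasible_actions(s))
         for s in STATES}

# Greedy repositioning policy with respect to the converged value function.
policy = {s: min(feasible_actions(s), key=lambda a: expected_cost_to_go(s, a, V))
          for s in STATES}
print(policy)
```

The online case would replace the known DEMAND distribution with estimates updated from the partially observed demand.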
This study provides an exploratory model to understand the factors that influence consumers to adopt the internet instead of traditional channels for information search and product purchase. The authors reviewed previously established theories on consumer decision making in offline environments and research findings regarding consumer behaviour in an online environment. The authors embraced the c...
We consider an online decision problem over a discrete space in which the loss function is submodular. We give algorithms which are computationally efficient and are Hannan-consistent in both the full information and bandit settings.
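Since this abstract names the general framework directly, a minimal illustration may help: the following Follow-the-Perturbed-Leader sketch (a standard Hannan-consistent strategy) runs over a tiny, explicitly enumerated decision set with full-information feedback. It is a generic toy, not the paper's algorithm, which achieves efficiency by exploiting submodularity rather than enumerating decisions.

```python
# Minimal Follow-the-Perturbed-Leader (FPL) sketch for an online decision
# problem over a small discrete decision set, full-information feedback.
# The decision set, loss sequence, and perturbation scale are illustrative.
import random

DECISIONS = [frozenset(s) for s in ([], [0], [1], [0, 1])]  # tiny subset lattice
ETA = 20.0     # perturbation scale; theory tunes this with the horizon T
T = 1000

def loss_at_round(t):
    """Hypothetical loss function, revealed after the learner commits."""
    return {x: 0.5 * len(x) + (0.0 if (t % 3 == 0) == (0 in x) else 1.0)
            for x in DECISIONS}

cum_loss = {x: 0.0 for x in DECISIONS}   # cumulative loss of every decision
learner_loss = 0.0

for t in range(1, T + 1):
    # Play the decision minimizing cumulative loss plus a random perturbation.
    perturbed = {x: cum_loss[x] + random.uniform(0.0, ETA) for x in DECISIONS}
    play = min(DECISIONS, key=perturbed.get)
    losses = loss_at_round(t)            # full information: all losses observed
    learner_loss += losses[play]
    for x in DECISIONS:
        cum_loss[x] += losses[x]

best_fixed = min(cum_loss.values())
print(f"learner loss {learner_loss:.1f} vs best fixed decision {best_fixed:.1f}")
```

In the bandit setting only the loss of the played decision would be observed, which is where the paper's second set of algorithms comes in.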
Bandit algorithms are widely used in sequential decision problems to maximize the cumulative reward. One potential application is mobile health, where the goal is to promote the user's health through personalized interventions based on user-specific information acquired from wearable devices. Important considerations include the type of, and frequency with which, data are collected (e.g. GPS, or continuous monitoring), as...
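As a concrete example of the cumulative-reward objective mentioned here, below is a minimal UCB1 sketch over a few hypothetical interventions with Bernoulli rewards. The mobile-health considerations in the abstract (which sensor data to collect and how often) would enter through the reward model and context, which this toy omits.

```python
# Minimal UCB1 sketch: sequential choice among a few hypothetical interventions
# to maximize cumulative reward. Reward probabilities are made up.
import math
import random

ARM_PROBS = [0.2, 0.5, 0.35]      # hypothetical Bernoulli reward probabilities
T = 5000

counts = [0] * len(ARM_PROBS)     # number of times each arm was played
sums = [0.0] * len(ARM_PROBS)     # total reward collected by each arm
total_reward = 0.0

for t in range(1, T + 1):
    if t <= len(ARM_PROBS):
        arm = t - 1                            # play each arm once to initialize
    else:
        # UCB1 index: empirical mean plus a confidence bonus.
        arm = max(range(len(ARM_PROBS)),
                  key=lambda a: sums[a] / counts[a]
                                + math.sqrt(2.0 * math.log(t) / counts[a]))
    reward = 1.0 if random.random() < ARM_PROBS[arm] else 0.0
    counts[arm] += 1
    sums[arm] += reward
    total_reward += reward

print(f"cumulative reward: {total_reward:.0f} over {T} rounds")
```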
With advances in technology, online shopping has experienced phenomenal growth. In line with this phenomenon and its relevance, a considerable number of studies have shown interest in this area. Although recent research has particularly addressed consumer behavior, findings were inconsistent; thereby, further research has been called for. The present study aims to investigate the effect of a few variables de...
[Chart: number of search results per year]