Search results for: regret minimization
Number of results: 37822
In standard online learning, the goal of the learner is to maintain an average loss close to the loss of the best-performing function in a fixed class. Classic results show that simple algorithms can achieve an average loss arbitrarily close to that of the best function in retrospect, even when input and output pairs are chosen by an adversary. However, in many real-world applications such as s...
In this thesis, we investigate the problem of decision-making in large two-player zero-sum games using Monte Carlo sampling and regret minimization methods. We demonstrate four major contributions. The first is Monte Carlo Counterfactual Regret Minimization (MC-CFR): a generic family of sample-based algorithms that compute near-optimal equilibrium strategies. Secondly, we develop a...
The menu-dependent nature of regret-minimization creates subtleties in applying regret-minimization to dynamic decision problems. Firstly, it is not clear whether forgone opportunities should be included in the menu. We explain commonly observed behavioral patterns as minimizing regret when forgone opportunities are present, and also show how the treatment of forgone opportunities affects behav...
This paper provides an empirical comparison between utility-maximization and regret-minimization perspectives of spatial-choice behaviour. The key difference between these two perspectives is that the regret-minimization perspective implies that the anticipated satisfaction associated with a chosen spatial alternative depends on the anticipated performance of non-chosen alternatives. In order to ...
We study regret minimization bounds in which the dependence on the number of experts is replaced by measures of the realized complexity of the expert class. The measures we consider are defined in retrospect given the realized losses. We concentrate on two interesting cases. In the first, our measure of complexity is the number of different “leading experts”, namely, experts that were best at s...
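For context on the experts setting this entry refers to, here is a minimal sketch of the standard Hedge (exponential weights) algorithm, whose classical regret bound depends on the number of experts N through a ln N term. This is the textbook baseline, not the method of the paper listed above; the function name `hedge`, the random loss matrix, and the choice of learning rate are illustrative assumptions.

```python
import math
import random


def hedge(losses, eta):
    """Run Hedge on a T x N loss matrix with entries in [0, 1].

    Returns (learner's expected cumulative loss,
             cumulative loss of the best single expert in hindsight).
    """
    n = len(losses[0])
    weights = [1.0] * n          # one weight per expert, all equal at the start
    cumulative = [0.0] * n       # running loss of each expert
    learner_loss = 0.0
    for round_losses in losses:
        total = sum(weights)
        probs = [w / total for w in weights]
        # Expected loss of a learner that samples an expert from `probs`.
        learner_loss += sum(p * l for p, l in zip(probs, round_losses))
        for i, l in enumerate(round_losses):
            cumulative[i] += l
            weights[i] *= math.exp(-eta * l)   # penalize lossy experts
    return learner_loss, min(cumulative)


# Hypothetical data: T rounds, N experts, i.i.d. uniform losses.
random.seed(0)
T, N = 2000, 10
losses = [[random.random() for _ in range(N)] for _ in range(T)]

eta = math.sqrt(8 * math.log(N) / T)      # standard tuning for known T
learner, best = hedge(losses, eta)
regret = learner - best
bound = math.sqrt(T * math.log(N) / 2)    # classical worst-case guarantee
print(f"regret = {regret:.2f}, bound = {bound:.2f}")
```

With this tuning the regret is guaranteed to stay below sqrt(T ln N / 2) for any loss sequence in [0, 1], so the average regret per round vanishes as T grows.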
We consider auctions in which the players have very limited knowledge about their own valuations. Specifically, the only information that a Knightian player i has about the profile of true valuations, θ∗, consists of a set of distributions, from one of which θ∗_i has been drawn. We analyze the social-welfare performance of the VCG mechanism, for unrestricted combinatorial auctions, when Knighti...
We consider regret minimization in adversarial deterministic Markov Decision Processes (ADMDPs) with bandit feedback. We devise a new algorithm that pushes the state-of-the-art forward in two ways: First, it attains a regret of O(T^{2/3}) with respect to the best fixed policy in hindsight, whereas the previous best regret bound was O(T^{3/4}). Second, the algorithm and its analysis are compatible with any...