Mathematical Learning Models — Theory and Algorithms [1983]

The Minimax Risk for the Two-Armed Bandit Problem
Bandit Problems with Random Discounting
Stochastic Approximation on a Bounded Convex Set
Learning Automaton for Finite Semi-Markov Decision Processes
A Local Asymptotic Minimax Optimality of an Adaptive Robbins Monro Stochastic Approximation Procedure
Dynamic Allocation Indices for Bayesian Bandits
The Role of Dynamic Allocation Indices in the Evaluation of Suboptimal Strategies for Families of Bandit Processes
On the Discretization Technique for Optimal Discounted Control of the Wiener Process
Asymptotic Properties of Learning Models
On the infinitesimal characterization of monotone stopping problems in continuous time
Numerical Investigation of the Two Armed Bandit
Uniform Bounds for a Dynamic Programming Model under Adaptive Control Using Exponentially Bounded Error Probabilities
Stochastic Regression Models and Consistency of the Least Squares Identification Scheme
Recursive Identification Techniques
An Optimization Problem for Matrices with Application to Decision Models
On a Class of Learning Algorithms with Symmetric Behavior under Success and Failure
Convergence of a General Stochastic Approximation Process under Convex Constraints and some Applications
On Kersting’s Theorem on Weak Convergence of Recursions
On Continuous Time Learning Models
Convergence of Stochastic Approximation Algorithms with Non-Additive Dependent Disturbances and Applications
Sequential probability ratio tests for homogeneous Markov chains
Allocation Rules for Sequential Clinical Trials
Non-Deterministic Modelling and its Application in Adaptive Optimal Control