21-09-2013, 04:33 PM
Reinforcement Learning
Reinforcement Learning.ppt (Size: 436 KB / Downloads: 14)
What is learning?
Learning takes place as a result of interaction between an agent and the world, the idea behind learning is that
Percepts received by an agent should be used not only for acting, but also for improving the agent’s ability to behave optimally in the future to achieve the goal.
Learning types
Learning types
Supervised learning:
a situation in which sample (input, output) pairs of the function to be learned can be perceived or are given
You can think it as if there is a kind teacher
Reinforcement learning:
in the case of the agent acts on its environment, it receives some evaluation of its action (reinforcement), but is not told of which action is the correct one to achieve its goal
RL model
Each percept(e) is enough to determine the State(the state is accessible)
The agent can decompose the Reward component from a percept.
The agent task: to find a optimal policy, mapping states to actions, that maximize long-run measure of the reinforcement
Think of reinforcement as reward
Can be modeled as MDP model!
LMS updating
Reward to go of a state
the sum of the rewards from that state until a terminal state is reached
Key: use observed reward to go of the state as the direct evidence of the actual expected utility of that state
Learning utility function directly from sequence example
Exploration problem in Active learning
A trade off when choosing action between
its immediately good(reflected in its current utility estimates using the what we have learned)
its long term good(exploring more about the environment help it to behave optimally in the long run)
Two extreme approaches
“wacky”approach: acts randomly, in the hope that it will eventually explore the entire environment.
“greedy”approach: acts to maximize its utility using current model estimate
See Figure 20.10
Just like human in the real world! People need to decide between
Continuing in a comfortable existence
Or striking out into the unknown in the hopes of discovering a new and better life