18-04-2012, 02:15 PM
Intelligent Traffic Light Control
Introduction
Transportation research aims to optimize the flow of people and goods.
As the number of road users constantly increases, and resources provided by current infrastructures
are limited, intelligent control of traffic will become a very important issue in the
future. However, some limitations to the use of intelligent traffic control exist. Avoiding
traffic jams, for example, is thought to benefit both the environment and the economy, but
improved traffic flow may also lead to an increase in demand [Levinson, 2003].
There are several models for traffic simulation. In our research we focus on microscopic
models, which model the behavior of individual vehicles and can thereby simulate the dynamics
of groups of vehicles. Research has shown that such models yield realistic behavior
[Nagel and Schreckenberg, 1992, Wahle and Schreckenberg, 2001].
Cars in urban traffic can experience long travel times due to inefficient traffic light control.
Optimal control of traffic lights using sophisticated sensors and intelligent optimization
algorithms might therefore be very beneficial. Optimization of traffic light switching increases
road capacity and traffic flow, and can prevent traffic congestion. Traffic light control is a
complex optimization problem and several intelligent algorithms, such as fuzzy logic, evolutionary
algorithms, and reinforcement learning (RL) have already been used in attempts
to solve it. In this paper we describe a model-based, multi-agent reinforcement learning
algorithm for controlling traffic lights.
In our approach, reinforcement learning [Sutton and Barto, 1998, Kaelbling et al., 1996]
with road-user-based value functions [Wiering, 2000] is used to determine optimal decisions
for each traffic light. The decision is based on a cumulative vote of all road users waiting
at a traffic junction, where each car votes using its estimated advantage (or gain) of setting
its light to green. The gain value is the difference between the total time the car expects to wait
during the rest of its trip if the light at which it is currently waiting is red, and if it is green.
The waiting time until cars arrive at their destination is estimated by monitoring cars flowing
through the infrastructure and using reinforcement learning (RL) algorithms.
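The voting scheme above can be sketched in a few lines of Python. The per-car waiting-time estimates W_red and W_green are given here as plain numbers, whereas in the paper they are learned with RL; all names are illustrative:

```python
# Hedged sketch of the cumulative-vote decision: each waiting car votes
# with its gain (expected waiting time if its light stays red minus the
# expected waiting time if it turns green), and the junction picks the
# light configuration with the highest summed gain.

def choose_green_lights(cars, configurations):
    """Pick the light configuration with the highest total gain.

    cars: list of (light_id, W_red, W_green) tuples, one per waiting car.
    configurations: list of sets of light_ids that may be green together.
    """
    def total_gain(config):
        return sum(w_red - w_green
                   for light, w_red, w_green in cars
                   if light in config)
    return max(configurations, key=total_gain)

# Two cars wait at the northern light, one at the eastern light.
cars = [("N", 12.0, 3.0), ("N", 9.0, 2.0), ("E", 5.0, 1.0)]
configs = [{"N"}, {"E"}]
print(choose_green_lights(cars, configs))  # the northern light wins the vote
```

The summed gain for the northern light is (12-3)+(9-2)=16 versus 5-1=4 for the eastern one, so the vote favors the direction where waiting cars have the most to lose.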
We compare the performance of our model-based RL method to that of other controllers
using the Green Light District simulator (GLD). GLD is a traffic simulator that allows us
to design arbitrary infrastructures and traffic patterns, monitor traffic flow statistics such as
average waiting times, and test different traffic light controllers. The experimental results
show that in crowded traffic, the RL controllers outperform all other tested non-adaptive
controllers. We also test the use of the learned average waiting times for choosing routes of
cars through the city (co-learning), and show that by using co-learning road users can avoid
bottlenecks.
This paper is organized as follows. Section 2 describes how traffic can be modelled,
predicted, and controlled. In section 3 reinforcement learning is explained and some of its
applications are shown. Section 4 surveys several previous approaches to traffic light control,
and introduces our new algorithm. Section 5 describes the simulator we used for our experiments,
and in section 6 our experiments and their results are given. We conclude in section 7.
Modelling and Controlling Traffic
In this section, we focus on the use of information technology in transportation. A lot of
ground can be gained in this area, and Intelligent Transportation Systems (ITS) have gained the
interest of several governments and commercial companies [Ten-T expert group on ITS, 2002,
White Paper, 2001, EPA98, 1998].
ITS research includes in-car safety systems, simulating effects of infrastructural changes,
route planning, optimization of transport, and smart infrastructures. Its main goals are:
improving safety, minimizing travel time, and increasing the capacity of infrastructures. Such
improvements benefit health, the economy, and the environment, and this is reflected in the
budgets allocated to ITS.
In this paper we are mainly interested in the optimization of traffic flow, thus effectively
minimizing average traveling (or waiting) times for cars. A common tool for analyzing traffic
is the traffic simulator. In this section we will first describe two techniques commonly used
to model traffic. We will then describe how models can be used to obtain real-time traffic
information or predict traffic conditions. Afterwards we describe how information can be
communicated as a means of controlling traffic, and what the effect of this communication on
traffic conditions will be. Finally, we describe research in which all cars are controlled using
computers.
Modelling Traffic
Traffic dynamics bear a resemblance to, for example, the dynamics of fluids or of sand
in a pipe. Different approaches to modelling traffic flow can be used to explain phenomena
specific to traffic, such as the spontaneous formation of traffic jams. There are two common
approaches to modelling traffic: macroscopic and microscopic models.
Macroscopic models.
Macroscopic traffic models are based on gas-kinetic models and use equations relating traffic
density to velocity [Lighthill and Whitham, 1955, Helbing et al., 2002]. These equations can
be extended with terms for build-up and relaxation of pressure to account for phenomena like
stop-and-go traffic and spontaneous congestions [Helbing et al., 2002, Jin and Zhang, 2003,
Broucke and Varaiya, 1996]. Although macroscopic models can be tuned to simulate certain
driver behaviors, they do not offer a direct, flexible way of modelling and optimizing those
behaviors, making them less suited for our research.
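As a rough illustration of the macroscopic approach (a generic sketch, not the specific models cited above), the following Python fragment integrates a simple conservation-law model of density, with an assumed Greenshields speed-density relation as the closure:

```python
import numpy as np

# Macroscopic sketch: density rho evolves by conservation of cars,
#   d(rho)/dt + d(q)/dx = 0,  q = rho * v,
# with the assumed Greenshields relation v = v_max * (1 - rho/rho_max).
V_MAX, RHO_MAX = 1.0, 1.0

def flow(rho):
    return rho * V_MAX * (1.0 - rho / RHO_MAX)

def step(rho, dt=0.4, dx=1.0):
    # Lax-Friedrichs update of the conservation law on a ring road.
    q = flow(rho)
    return (0.5 * (np.roll(rho, 1) + np.roll(rho, -1))
            - dt / (2 * dx) * (np.roll(q, -1) - np.roll(q, 1)))

rho = np.full(100, 0.3)
rho[40:60] = 0.8            # a dense platoon of cars
for _ in range(50):
    rho = step(rho)
print(round(float(rho.sum()), 6))  # total number of cars is conserved
```

Because the scheme only redistributes flux between neighboring cells, the total density (the number of cars on the ring) stays constant while the dense platoon spreads out, mimicking how a congestion wave relaxes.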
Microscopic models.
In contrast to macroscopic models, microscopic traffic models offer a way of simulating various
driver behaviors. A microscopic model consists of an infrastructure that is occupied by a set
of vehicles. Each vehicle interacts with its environment according to its own rules. Depending
on these rules, different kinds of behavior emerge when groups of vehicles interact.
Cellular Automata. One specific way of designing and simulating (simple) driving rules
of cars on an infrastructure, is by using cellular automata (CA). CA use discrete, partially
connected cells that can each be in a specific state. For example, a road-cell can contain a car
or be empty. Local transition rules determine the dynamics of the system, and even simple
rules can lead to chaotic dynamics. Nagel and Schreckenberg (1992) describe a CA model
for traffic simulation. At each discrete time-step, vehicles increase their speed by a certain
amount until they reach their maximum velocity. In case of a slower moving vehicle ahead,
the speed will be decreased to avoid a collision. Some randomness is introduced by giving
each vehicle a small chance of slowing down. Experiments showed realistic behavior of this
CA model on a single road with emerging behaviors like the formation of start-stop waves
when traffic density increases.
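The driving rules described above can be sketched as follows; the parameter values (maximum speed 5, slowdown probability 0.3) are illustrative choices, not those of the original study:

```python
import random

# Sketch of a Nagel-Schreckenberg-style cellular automaton: a ring road
# of cells, each empty (None) or holding a car's current speed.
V_MAX, P_SLOW, LENGTH = 5, 0.3, 100

def nasch_step(road, rng):
    new = [None] * LENGTH
    for i, v in enumerate(road):
        if v is None:
            continue
        v = min(v + 1, V_MAX)                  # 1. accelerate
        gap = next(d for d in range(1, LENGTH)
                   if road[(i + d) % LENGTH] is not None) - 1
        v = min(v, gap)                        # 2. brake to avoid collision
        if v > 0 and rng.random() < P_SLOW:    # 3. random slowdown
            v -= 1
        new[(i + v) % LENGTH] = v              # 4. move
    return new

rng = random.Random(42)
road = [0 if i % 5 == 0 else None for i in range(LENGTH)]  # 20 cars
for _ in range(100):
    road = nasch_step(road, rng)
print(sum(v is not None for v in road))  # car count is conserved: 20
```

Even these four rules are enough to produce the start-stop waves mentioned above once the density is raised: the random slowdowns of one car force the cars behind it to brake, and the disturbance propagates backwards.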
Cognitive Multi-Agent Systems.
A more advanced approach to traffic simulation and optimization is the Cognitive Multi-Agent
System (CMAS) approach, in which agents interact
and communicate with each other and the infrastructure. A cognitive agent is an entity that
autonomously tries to reach some goal state using minimal effort. It receives information
from the environment using its sensors, believes certain things about its environment, and
uses these beliefs and inputs to select an action. Because each agent is a single entity, it
can optimize (e.g., by using learning capabilities) its way of selecting actions. Furthermore,
using heterogeneous multi-agent systems, different agents can have different sensors, goals,
behaviors, and learning capabilities, thus allowing us to experiment with a very wide range
of (microscopic) traffic models.
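A minimal, purely illustrative sketch of such a cognitive agent's sense-believe-act cycle (the class name and decision rule are hypothetical, not from the paper):

```python
# Illustrative cognitive agent: it senses its environment, stores the
# result as beliefs, and selects an action toward its goal state.

class DriverAgent:
    def __init__(self, goal):
        self.goal = goal       # desired destination (the goal state)
        self.beliefs = {}      # the agent's picture of its environment

    def sense(self, observation):
        # Update beliefs from (possibly partial) sensor input.
        self.beliefs.update(observation)

    def act(self):
        # Select the action the agent believes brings it closer to its
        # goal; here a trivial rule, but a learning agent could adapt it.
        if self.beliefs.get("light") == "red":
            return "wait"
        return "drive_towards_" + self.goal

agent = DriverAgent(goal="A")
agent.sense({"light": "red"})
print(agent.act())   # wait
agent.sense({"light": "green"})
print(agent.act())   # drive_towards_A
```

In a heterogeneous multi-agent system, different agent classes would override sense() and act() with different sensors, goals, and learning rules, which is exactly the flexibility the paragraph above attributes to the CMAS approach.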