01-12-2012, 04:13 PM
COMP 529 - Software Architecture Project Report: BusTracker
An overview of BusTracker
The application is a web-based tool that shows, on a map, a real-time simulation of estimated bus
movement, derived from the bus schedules published on the website of the Montreal public transport
society (stm.info). The motivation behind the project was to provide a more user-friendly way of
checking bus schedules. The STM website offers a poor interface for this: the user has to navigate
through 3-4 pages to reach the desired route stop, and then read the schedule and compare it against
the current time to work out where the bus currently is, or when it will next reach a specific stop.
Overall, given a bus number and a direction, the system looks up the bus schedule posted on the STM
website (stm.info), examines all the stops on that route and, after processing this information,
displays the locations of the buses currently running on that route.
HTML Parser
• Description: "HTML Parser is a Java library used to parse HTML in either a linear or nested
fashion. Primarily used for transformation or extraction, it features filters, visitors, custom tags
and easy to use Java Beans. It is a fast, robust and well tested package." The parser used is an
open source project. It parses an HTML page into a tree of typed nodes with text at its leaves.
Extracting data after parsing meant selecting certain types of nodes from the node list and
reading the text embedded at their leaf nodes.
• Features used:
  • Extraction of data from the website, using:
    • node lists of the different tag types extractable from the parsed tree
    • text extraction from HTML files
    • link extraction from HTML files
Strategy used to ensure the COTS functionality
Google Maps:
• To establish a reasonable level of confidence, the project started with a proof of concept for
the most essential feature of the system: displaying a bus icon at specific locations on a map
and updating its location over time. Hard-coded values were used to simulate movement between
several positions on a Google map.
• The extensive documentation on the Google Maps API website allowed for better validation and
testing.
• The component exposed its API and was used as a grey box.
• The simplicity and isolation of the Google Maps API made its functionality easier to validate,
especially as its behaviour is visually verifiable in most cases, e.g.
  • Markers placed at the returned addresses could be verified directly on the map.
  • Simulated movement of markers could be verified by checking the time a marker took to travel
  from one intersection to the next on the map and comparing it with the time given on the STM
  schedule.
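The simulated movement between two stops can be sketched as a time-based interpolation. This is a
minimal sketch, not the project's code: it assumes the marker moves in a straight line at constant
speed between consecutive stops, and the class name, method name and parameters are hypothetical.

```java
// Hypothetical helper; names are illustrative and do not come from the BusTracker source.
final class MarkerInterpolator {

    /**
     * Returns {lat, lng} of the marker at nowSec, assuming straight-line,
     * constant-speed movement from one stop to the next (an assumption).
     */
    static double[] positionAt(double[] from, double[] to,
                               long departSec, long arriveSec, long nowSec) {
        long span = arriveSec - departSec;
        // Fraction of the segment already travelled, clamped to [0, 1].
        double f = span <= 0
                ? 1.0
                : Math.max(0.0, Math.min(1.0, (double) (nowSec - departSec) / span));
        return new double[] {
            from[0] + f * (to[0] - from[0]),
            from[1] + f * (to[1] - from[1])
        };
    }
}
```

Calling `positionAt` once per refresh with the current time gives the coordinates at which the
marker would be redrawn; before departure it stays at the first stop and after arrival at the second.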
HTML Parser:
• The parser gave easy access to the text contained in a web page. In our case, however, the
required text was spread across the page in the rows and columns of multiple tables, with only
one or two words in each place, separated from each other by multiple tags. Exact extraction of
the required data therefore came down to trial and error: the children extracted from the node
list of a particular kind of tag (the row and column tags, tr and td) had to be examined
individually until the right ones were found. This was done by matching the text extracted from
a given node against the expected text read from the web page and storing the matching index
values.
• This kind of parser would have worked best for modifying the HTML tags of a web page or
extracting large pieces of text from one; for this project, significant work was required to
extract the right nodes. The node-list form in which the parser stores the parsed data did make
the search for the right text much easier.
• To ensure that this would work with the other components, we tested a sample schedule by
printing the data extracted from the website and retrieving individual intersections by
searching the node lists.
• The requirement on this COTS component was that it allow us to:
  • re-create an address after extracting the individual street names from a bus stop's web page
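The index matching and address re-creation described above can be illustrated as follows. This is
a minimal sketch under stated assumptions: plain strings stand in for the text of the td nodes (it
does not use the HTML Parser API), the "A & B, city" address format is an assumption, and all names
are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Plain strings stand in for the text of td nodes; this is not the HTML Parser API.
final class StopAddressSketch {

    /** Returns the indices of all cells whose trimmed text equals the expected text (ignoring case). */
    static List<Integer> indicesOf(List<String> cellTexts, String expected) {
        List<Integer> hits = new ArrayList<>();
        String wanted = expected.trim();
        for (int i = 0; i < cellTexts.size(); i++) {
            String text = cellTexts.get(i);
            if (text != null && text.trim().equalsIgnoreCase(wanted)) {
                hits.add(i);
            }
        }
        return hits;
    }

    /** Joins two street names into an intersection address; the format is an assumption. */
    static String intersection(String streetA, String streetB, String city) {
        return streetA.trim() + " & " + streetB.trim() + ", " + city;
    }
}
```

Once the index of a known street name has been found on one sample page, the same index can be
reused on the other pages of the route, and the two street cells passed to `intersection` to build
a string for geocoding.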
Overview of the final architecture
The final architecture was based on the 3 components discussed above. The GoogleMapsComponent
and the HTML Parser were used as grey boxes through their available APIs.
The BusServices package essentially adapts the HTML Parser into a parser specific to the STM
website, whose pages are retrieved from a pre-cached folder. One bus route consists of approximately
100 HTML files. The adapted parser provides methods to parse these pages and retrieve bus stop
intersections and time schedules. The adapted BusScheduldeParser provides a Bus Services API.
Architectural trade-offs in the BusTracker application:
During the design and implementation of the application, some design trade-offs had to be made.
Here are a few, grouped by quality attribute:
Usability:
Averaging neighbouring geocodes when a bus stop address can't be geocoded.
Because of the low accuracy of the addresses provided by the bus schedule website, some of them
could not be geocoded (e.g. misspelled street names). The application accommodates such cases by
running an algorithm that repairs the route with an average of the neighbouring geocodes rather
than skipping the stop completely. This decision was a workaround for the component mismatches
discussed in the following section.
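One way to picture the averaging step is shown below. This is a minimal sketch, not the report's
algorithm: it assumes each stop is a {lat, lng} pair or null when geocoding failed, that the
nearest successfully geocoded stops on either side are averaged, and that a stop with only one
such neighbour simply copies it. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; each stop is {lat, lng}, or null when geocoding failed.
final class GeocodeGapFiller {

    /** Returns a copy of the route in which failed stops are replaced by neighbour averages. */
    static List<double[]> fillGaps(List<double[]> stops) {
        List<double[]> out = new ArrayList<>(stops);
        for (int i = 0; i < stops.size(); i++) {
            if (stops.get(i) != null) {
                continue;
            }
            double[] prev = nearestKnown(stops, i, -1);
            double[] next = nearestKnown(stops, i, +1);
            if (prev != null && next != null) {
                out.set(i, new double[] {(prev[0] + next[0]) / 2, (prev[1] + next[1]) / 2});
            } else if (prev != null) {
                out.set(i, prev.clone());   // assumption: reuse the only known neighbour
            } else if (next != null) {
                out.set(i, next.clone());
            }
        }
        return out;
    }

    /** Walks from index 'from' in direction 'step' to the first successfully geocoded stop. */
    private static double[] nearestKnown(List<double[]> stops, int from, int step) {
        for (int i = from + step; i >= 0 && i < stops.size(); i += step) {
            if (stops.get(i) != null) {
                return stops.get(i);
            }
        }
        return null;
    }
}
```

With this simple midpoint, consecutive failed stops all land on the same point; spacing them evenly
between the known neighbours would be a natural refinement.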
Similarly, if the schedule for a certain stop could not be accessed, the stop was given an estimated
time by averaging the times of the preceding and following stops, rather than failing with an
error. This relies on the time differences between other stops on the schedule, which averaged
~2 min. The 'subscription' mode of the bus movements improves the accuracy further: the bus marker
moves from the previous stop to the next stop and passes the stop with the missing schedule in time
to be at the next stop at its scheduled time.
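The schedule estimate amounts to taking the midpoint of the two neighbouring times. A minimal
sketch, assuming times are held as `java.time.LocalTime` values (the report does not say how they
were represented) and using hypothetical names:

```java
import java.time.Duration;
import java.time.LocalTime;

// Hypothetical sketch of the midpoint estimate for a stop with a missing schedule.
final class ScheduleEstimator {

    /** Returns the time halfway between the preceding and the following stop. */
    static LocalTime estimateMissing(LocalTime previousStop, LocalTime nextStop) {
        Duration gap = Duration.between(previousStop, nextStop);
        if (gap.isNegative()) {
            gap = gap.plusHours(24);   // the trip crosses midnight
        }
        return previousStop.plus(gap.dividedBy(2));
    }
}
```

For stops scheduled at 14:10 and 14:14, the estimate for the stop between them is 14:12.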
Performance:
Setting a delay between the sequenced geocoding requests.
The chosen design of setting a delay between successive geocoding requests made to the server
sacrificed some performance for a better chance of a higher rate of successful results, given
that firing several requests at the geocoder in quick succession usually means fewer successes.
This traded performance for accuracy, but the route still loads in ~5-10 seconds.
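The spacing of requests can be sketched in plain Java as below. This is an illustration only: it
uses a standard scheduled executor, whereas a GWT client would more likely use a browser-side
timer, and the `sendRequest` callback stands in for the real geocoder call. All names are
hypothetical.

```java
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical sketch: fires one request per address, delayMs apart, in list order.
final class ThrottledGeocoder {

    /** Blocks until every request has been handed to sendRequest (or the wait is interrupted). */
    static void geocodeAll(List<String> addresses, long delayMs, Consumer<String> sendRequest) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        for (int i = 0; i < addresses.size(); i++) {
            String address = addresses.get(i);
            // The i-th request is delayed by i * delayMs from now.
            timer.schedule(() -> sendRequest.accept(address), i * delayMs, TimeUnit.MILLISECONDS);
        }
        timer.shutdown();   // already scheduled tasks still run by default
        try {
            timer.awaitTermination(addresses.size() * delayMs + 1000, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The delay is the tuning knob for the trade-off described above: a longer delay lengthens the route
load time roughly in proportion to the number of stops.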
Synchronizing the main thread with the asynchronous calls to the service and the GWT
Although asynchronous method calls are intended to avoid the overhead of blocking the caller
until a synchronous request finishes, BusTracker in most scenarios needed the results of many
async calls (geocoding all of a route's stops) before it could complete a given function
(displaying the route). This added some otherwise unnecessary overhead to the synchronizing
adapter, which blocked for an undetermined amount of time, checking every x milliseconds whether
all the async calls had returned their results.
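The polling behaviour described above can be sketched in plain Java. This is not the project's
adapter: it assumes each async callback reports its completion to a shared counter, adds a timeout
that the report does not mention, and uses `Thread.sleep`, which is not available in GWT client
code. All names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a polling wait for a known number of async results.
final class AsyncResultBarrier {
    private final int expected;
    private final AtomicInteger received = new AtomicInteger();

    AsyncResultBarrier(int expected) {
        this.expected = expected;
    }

    /** Called from each async callback, whether the call succeeded or failed. */
    void resultArrived() {
        received.incrementAndGet();
    }

    /** Checks every pollMs whether all results are in; returns false on timeout or interrupt. */
    boolean awaitAll(long pollMs, long timeoutMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (received.get() < expected) {
            if (System.nanoTime() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

The poll interval bounds the extra latency: on average the caller notices completion about half an
interval after the last result arrives, which is the overhead the paragraph above refers to.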