25-08-2017, 09:32 PM
Part of a search engine's work is to analyse the text of HTML pages to find the URLs of further pages to examine. While examining the contents of a page, it is also possible to generate a variety of statistical reports about the HTML. As well as reporting on the size of pages, the software should be able to determine the number and size of included images and to break frame-based pages down into their component frames. The project is to develop software that performs this analysis as part of a larger system.
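
As a rough illustration of the kind of analysis described above, here is a minimal sketch in Python using only the standard library. The names `PageAnalyser` and `analyse_page` and the layout of the report are hypothetical, chosen for this example and not part of the project specification. The sketch only counts images and lists their URLs; measuring the size of each image would mean fetching it separately, which is left out here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageAnalyser(HTMLParser):
    """Collects link, image, and frame URLs from one HTML page (illustrative only)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []   # URLs of further pages to examine
        self.images = []  # URLs of included images
        self.frames = []  # URLs of the pages that make up a frame-based page

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names before calling this.
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(urljoin(self.base_url, attrs["href"]))
        elif tag == "img" and attrs.get("src"):
            self.images.append(urljoin(self.base_url, attrs["src"]))
        elif tag in ("frame", "iframe") and attrs.get("src"):
            self.frames.append(urljoin(self.base_url, attrs["src"]))


def analyse_page(html, base_url):
    """Return a simple report for one page; the report layout is an assumption."""
    parser = PageAnalyser(base_url)
    parser.feed(html)
    parser.close()
    return {
        "size_bytes": len(html.encode("utf-8")),  # page size, assuming UTF-8
        "links": parser.links,
        "images": parser.images,
        "frames": parser.frames,
    }
```

In a larger system, the `links` and `frames` lists would presumably be fed back into the queue of pages to examine, while `size_bytes` and the image count go into the statistical reports.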