27-08-2014, 02:27 PM
This report presents the aspects of web data correlation in Jasper Reports as we see them, based on our own perspective and experience. It is written to bring together the different ways in which the staff approach a problem and solve it with the least investment of time, money and resources. It should help anyone who needs an insight into the company's functioning and the products it manufactures, fabricates or assembles, and it also gives an in-depth view of the company's managerial affairs. The report is kept short and precise and is made as pictorial as possible for better understanding of the concepts discussed.
1.1 Project Definition
A huge amount of information is available on the web, and a user may sometimes wish to save web data for future reference, for example invoices, reports, tickets, e-books or brochures.
Saving a web page can come in handy in the following situations:
1. When an internet connection is not available.
2. When the website's server is down and the site cannot be reached when it is needed.
3. When the data is needed frequently and therefore needs to be stored on the local machine.
Thus, to address these issues, we are developing a web application that will allow the user to save web pages in different formats.
At present, the user can do so in the following ways:
1. A screenshot of the required web page can be taken and saved as an image, but the image may not capture the web page as a whole.
2. A web page can be converted to a PDF (Portable Document Format), Word, Excel or PowerPoint file, but with certain limitations, described below:
· In the case of PDF and Word files, the entire content of the web page is included, both essential and non-essential details such as advertisements and forms.
· In the case of Excel files, web pages can be exported directly to Excel, but the extracted data is not properly formatted and does not serve much purpose.
· In the case of PowerPoint files, no direct conversion mechanism is available; instead, the user has to follow a number of steps to export a web page to a PowerPoint file.
We are developing a web application with which the user will be able to obtain the relevant data directly in four formats: PDF, Word (DOC), Excel (XLS) and PowerPoint.
Because the internet contains so much information, the tool needs to make it easy to store a web page in any of these formats.
We will convert a web page into PDF and DOC files with the same content as presented in the web page. A web page will be converted into an Excel file only if it contains tabular data. A web page will be converted into a PowerPoint file based on the type of data in it, such as headings, lists, images and tables.
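The conversion rules above can be sketched as a small eligibility check. This is only an illustration under stated assumptions: the class name `FormatSelector`, the `Format` enum and the idea of detecting content by looking for opening tags with a regular expression are ours, not part of the project's actual code, and the PowerPoint rule is one possible reading of "based on the type of data".

```java
import java.util.EnumSet;
import java.util.Set;
import java.util.regex.Pattern;

/** Hypothetical helper: decides which export formats apply to a page's HTML. */
final class FormatSelector {

    /** The four output formats named in this report. */
    enum Format { PDF, DOC, XLS, PPT }

    /** Returns true if the markup contains an opening tag matching the given tag-name regex. */
    private static boolean hasTag(String html, String tagRegex) {
        // The lookahead makes sure "<table" is a whole tag name and not, say, "<tablet".
        return Pattern.compile("<(?:" + tagRegex + ")(?=[\\s>/])", Pattern.CASE_INSENSITIVE)
                .matcher(html)
                .find();
    }

    /** PDF and DOC always apply; XLS needs a table; PPT needs headings, lists, images or tables. */
    static Set<Format> applicableFormats(String html) {
        Set<Format> formats = EnumSet.of(Format.PDF, Format.DOC);
        if (hasTag(html, "table")) {
            formats.add(Format.XLS);
        }
        if (hasTag(html, "h[1-6]|ul|ol|img|table")) {
            formats.add(Format.PPT);
        }
        return formats;
    }
}
```

For a page that contains only paragraphs, `FormatSelector.applicableFormats(html)` would offer PDF and DOC alone; a page with a `<table>` element would additionally offer XLS and PPT.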
With this web application, the user will be able to convert a web page to all four formats conveniently. The user gets the option either to open or to save the page in the required format and, depending on the chosen action, the converted file is opened or saved to the local machine. The application allows the user to create high-quality PDF documents quickly, efficiently and cost-effectively, and can be used to create invoices, reports, tickets, e-books, brochures and much more.
1.2 Scope and Objective of Project
This project focuses on the data present on the web. A huge amount of data and information is available on the web; some of it is relevant and some is not. We therefore needed a mechanism that separates the relevant data from the irrelevant and stores it in a presentable form so that it can be used efficiently. With this in mind, we have developed a web application that uses the capabilities of Jasper Reports, the world's most popular open source reporting engine.
The basic objectives of the project are as follows:
1.2.1 Conversion into PDF:
· The relevant data of the webpage will be fetched and converted into a PDF file.
1.2.2 Conversion into Word Document:
· The relevant data of the webpage will be fetched and converted into a Word document.
1.2.3 Conversion into Excel File:
· If a webpage contains tabular data, it will be converted to an Excel file.
1.2.4 Conversion into Presentation File:
· The relevant data of the webpage will be fetched and converted into a Presentation file.
2. SYSTEM STUDY AND PROBLEM FORMULATION
2.1 Existing System
At present, various web applications exist with which the user can save web pages for future reference. These applications convert the web data to different formats as it is. The formats that are convenient to convert to are PDF (Portable Document Format) and Microsoft Word document. Essentially, the web page appears in the converted file as a screenshot of itself. The file includes all the relevant data as well as irrelevant details that the user is not concerned with. Moreover, the advertisements on the page are carried over into the converted file, where they are of no use.
As far as PowerPoint presentations and Microsoft Excel files are concerned, there is no direct mechanism or tool with which the user can obtain such a file from a web page.
Apart from the techniques mentioned above, there is another way to obtain a Microsoft Word, PowerPoint or Microsoft Excel file containing web data: these programs have a built-in feature that imports a web page and converts it into the required format. This method is in use at present, but it does not serve much purpose, since the layout and feel of the web page are completely disrupted when it is saved this way.
One simple way is to take a screenshot of the web page and save it as an image in the desired format. However, only the part of the web page displayed on screen can be captured this way, so the user cannot get all the required data from the page. Moreover, the saved image cannot be exported to formats such as PDF, DOC, XLS or PPT for further use.
These are all the existing systems and methods used to store data from a web page on a local machine for future reference.
2.2 Limitations of Existing System
The following limitations exist with the existing system:
1. A screenshot of the required web page can be taken and saved as an image, but the image may not capture the web page as a whole.
2. The saved images are also not editable, so it is better to convert a web page into some other format such as PDF or DOCX.
3. A web page can be converted to a PDF, Word, Excel or PPT file, but with certain limitations, described below:
· In the case of PDF and Word files, the entire content of the web page is included, both essential and non-essential details such as advertisements and forms.
· In the case of Excel files, web pages can be exported directly to Excel, but the extracted data is not properly formatted and does not serve much purpose.
· In the case of PowerPoint files, no direct conversion mechanism is available; instead, the user has to follow a number of steps to export a web page to PowerPoint.
2.3 Proposed System
We are developing a web application that uses the capabilities of Jasper Reports, the world's most popular open source reporting engine. The user is given the option to convert the required web pages into the desired formats with a single mouse click.
The application takes the HTML code of the web page, parses it to extract the relevant data, and gives the user the option to open or save the file converted to the chosen format.
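As a rough illustration of the extraction step, the sketch below drops elements that usually carry non-essential content and keeps the remaining text. It rests on several assumptions of ours: that `script`, `style`, `form` and `iframe` elements are the "irrelevant" parts, that regular expressions are good enough for a demonstration (a real implementation would more likely use a proper HTML parser), and that the class name `RelevantTextExtractor` is hypothetical. HTML entities are left undecoded.

```java
import java.util.regex.Pattern;

/** Hypothetical helper: strips non-essential elements and markup from a page's HTML. */
final class RelevantTextExtractor {

    // Whole elements treated as non-essential in this sketch (an assumption, not the project's rule set).
    private static final Pattern NOISE = Pattern.compile(
            "<(script|style|form|iframe)\\b[^>]*>.*?</\\1\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Any remaining tag, opening or closing.
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    /** Returns the visible text of the page with noise elements removed and whitespace collapsed. */
    static String extract(String html) {
        String withoutNoise = NOISE.matcher(html).replaceAll(" ");
        String textOnly = TAG.matcher(withoutNoise).replaceAll(" ");
        return WHITESPACE.matcher(textOnly).replaceAll(" ").trim();
    }
}
```

The extracted text could then be handed to the reporting layer as the data to be laid out in the chosen format.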
We will develop an interactive user interface that is easy to work with, even for novice users. The user interface will be a web application developed using Java EE.
For web pages that contain statistical data, there is an option to export that data from the web page into an XLS file. Similarly, web pages containing information on a particular topic can be converted directly into an editable PowerPoint presentation.
The web page will have a text area for the user's input (a URL). Separate buttons, such as Convert to pdf, Convert to Excel, Convert to word and Convert to power point, will be provided to trigger the conversion logic.
On click of a button, the data from the provided URL will be converted to the chosen format.
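One way the button click could be mapped to an output format is sketched below. The enum `ExportFormat`, its method names and the file name passed in are hypothetical; the MIME types are the standard ones for the legacy Office formats, and the `Content-Disposition` value distinguishes opening the file in the browser (`inline`) from saving it (`attachment`).

```java
import java.util.Locale;

/** Hypothetical mapping from the four buttons to file extension and response content type. */
enum ExportFormat {
    PDF("pdf", "application/pdf"),
    WORD("doc", "application/msword"),
    EXCEL("xls", "application/vnd.ms-excel"),
    POWERPOINT("ppt", "application/vnd.ms-powerpoint");

    final String extension;
    final String contentType;

    ExportFormat(String extension, String contentType) {
        this.extension = extension;
        this.contentType = contentType;
    }

    /** Maps a button label such as "Convert to pdf" or "Convert to power point" to a format. */
    static ExportFormat fromButton(String label) {
        String key = label.toLowerCase(Locale.ROOT).replace("convert to", "").replace(" ", "");
        switch (key) {
            case "pdf":        return PDF;
            case "word":       return WORD;
            case "excel":      return EXCEL;
            case "powerpoint": return POWERPOINT;
            default: throw new IllegalArgumentException("Unknown button: " + label);
        }
    }

    /** Builds the Content-Disposition header value: "attachment" to save, "inline" to open. */
    String contentDisposition(String baseName, boolean save) {
        return (save ? "attachment" : "inline")
                + "; filename=\"" + baseName + "." + extension + "\"";
    }
}
```

In a servlet, the handler for the click would set the response content type from `contentType` and the `Content-Disposition` header from `contentDisposition(...)` before writing the converted bytes.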
2.4 Advantages of Proposed System:
· Only relevant data is extracted out of a web page.
· Many conversion options available.
· The converted file can either be just opened or saved as per requirement.
· No installation required on the user's machine.
· Stand-alone application.
· Free of cost and efficient for conversions.
2.5 Hardware and Software Requirements
a) Deployment Environment Requirements
Hardware requirements
Processor/RAM/HDD : 2 GB RAM
Software requirements
OS for Web server : Windows 7
OS for Database Server : NA
DBMS : NA
Third Party S/Ws : NA
b) Development Environment Requirements
IDE : Eclipse IDE
Processor/RAM/HDD : Compatible with Eclipse
2.6 Feasibility Study