28-02-2013, 11:45 AM
Indian Population Statistics
Abstract:
The main objective of our project is to calculate the mean, standard deviation, variation, and correlation of Indian population data using R software. The project is designed to compute these measures on a large data set collected from our sources. We use R to calculate each of the following:
• Mean
• Standard Deviation
• Variation
• Correlation
• Regression
Introduction:
In our project we use R software to calculate all of the objectives. R is well suited to analysing statistical data.
We calculate the mean, standard deviation, variation, correlation, and regression of Indian population statistics covering about 100 years.
We also plot a graph in R of population against year.
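As a minimal sketch of the plot described above, the following uses base R's plot() function. The years and population values are toy placeholders chosen for illustration, not the project's actual data, and the data frame name census is an assumption.

```r
# Toy values for illustration only; these are NOT actual census figures.
census <- data.frame(
  year = c(1901, 1911, 1921, 1931, 1941),
  population = c(10, 12, 11, 14, 18)  # arbitrary units
)

# Draw points joined by lines, with population on the y-axis and year on the x-axis.
plot(census$year, census$population,
     type = "b",
     xlab = "Year",
     ylab = "Population (arbitrary units)",
     main = "Population by year (toy data)")
```

With type = "b" both the individual observations and the trend between them are visible, which suits a short series of yearly values.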
Data Collection:
We collected population statistics of India for around 100 years.
The data is stored in a file with the .csv extension, and this file is then read into R.
All the data is included at the end of the document.
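Reading a .csv file into R might look like the sketch below. The column names year and population and the values are assumptions made for illustration; the inline text stands in for the real file so that the example runs without a file on disk.

```r
# Stand-in for the contents of the .csv file (toy values, not real data).
csv_text <- "year,population
1901,10
1911,12
1921,11
1931,14
1941,18"

# read.csv() normally takes a file path, e.g. read.csv("population.csv")
# (hypothetical file name); the text argument reads from a string instead.
census <- read.csv(text = csv_text)

str(census)   # show the column names and types
head(census)  # show the first rows
```

Checking the result with str() before any calculation helps confirm that both columns were read as numbers rather than text.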
Concepts used in our project:
• Mean
• Standard Deviation
• Variation
• Correlation
• Regression
Definitions of the concepts used:
• Mean: a measure of the central tendency of a collection of numbers, calculated as the sum of the numbers divided by the size of the collection.
• Standard Deviation: the standard deviation (represented by the symbol sigma, σ) shows how much variation or "dispersion" exists from the average (mean, or expected value).
• Variation: the coefficient of variation (CV) is defined as the ratio of the standard deviation to the mean.
• Correlation: a measure of association between two variables, widely used in the sciences to express the strength of their linear dependence. The variables are not designated as dependent or independent.
• Regression: a method for modelling how a dependent variable changes with one or more independent variables. Unlike correlation, regression designates one variable as dependent and the others as independent.
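The definitions above map onto base R functions roughly as follows. This is a sketch on toy values, not the project's data, and the variable names are assumptions. Note that sd() returns the sample standard deviation (dividing by n - 1), and cor() returns the Pearson correlation by default.

```r
# Toy values for illustration only; these are NOT actual census figures.
year <- c(1901, 1911, 1921, 1931, 1941)
population <- c(10, 12, 11, 14, 18)

pop_mean <- mean(population)   # sum of the values divided by their count
pop_sd <- sd(population)       # sample standard deviation (n - 1 denominator)
pop_cv <- pop_sd / pop_mean    # coefficient of variation: SD divided by mean

# Pearson correlation between year and population.
r <- cor(year, population)

# Simple linear regression with population as the dependent variable
# and year as the independent variable.
fit <- lm(population ~ year)

print(c(mean = pop_mean, sd = pop_sd, cv = pop_cv, correlation = r))
print(coef(fit))  # intercept and slope of the fitted line
```

For these toy values the mean is 13, the correlation is 0.9, and the fitted slope is 0.18 units per year.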
Tools:
R Software and .csv file
• About R Software: R is an open source programming language and software environment for statistical computing and graphics.
• The R language is widely used among statisticians and data miners for developing statistical software and data analysis.
• Polls and surveys of data miners show that R's popularity has increased substantially in recent years.
• R provides a wide variety of statistical and graphical techniques, including linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, and others.
• R is easily extensible through functions and extensions, and the R community is noted for its active contributions in terms of packages.
• Among other things, R includes an effective data handling and storage facility,
• a suite of operators for calculations on arrays, in particular matrices,
• a large, coherent, integrated collection of intermediate tools for data analysis,
• graphical facilities for data analysis and display either directly at the computer or on hardcopy, and
• a well developed, simple and effective programming language (called ‘S’) which includes conditionals, loops, user defined recursive functions and input and output facilities. (Indeed most of the system supplied functions are themselves written in the S language.)
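A few of the language features listed above (matrix operators, conditionals, loops, and user-defined recursive functions) can be sketched as follows. The names m, factorial_rec, and total are chosen for illustration only.

```r
# Matrix operators: matrix() fills column by column.
m <- matrix(c(1, 2, 3, 4), nrow = 2)
m_t <- t(m)       # transpose
m_sq <- m %*% m   # matrix product (not element-wise)

# A user-defined recursive function with a conditional.
factorial_rec <- function(n) {
  if (n <= 1) {
    return(1)
  }
  n * factorial_rec(n - 1)
}

# A loop that accumulates the sum of 1 to 5.
total <- 0
for (i in 1:5) {
  total <- total + i
}

# Output facilities: cat() writes to the console.
cat("5! =", factorial_rec(5), "\n")
cat("Sum of 1 to 5 =", total, "\n")
print(m_sq)
```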