03-07-2013, 02:24 PM
Multivariate Data Analysis for STTP of Statistical Analysis in Research
Theme of Presentation
Many non-statisticians face considerable difficulty when analyzing their data: lacking
knowledge of statistics, they cannot analyze it with the proper techniques or with
confidence. Also, there are no books available for the non-statistician that guide the reader
through all possible techniques for data analysis, covering the logic, the theory, and the
computation in different software packages. In this presentation I will try to give a brief
survey of such statistical modeling techniques according to the nature of the data, and to
explain which method is appropriate in which situation. The presentation is organized by the
measurement level of the response and explanatory variables, and each corresponding
technique is demonstrated in SPSS.
Scale of Measurement:
Statistics has its own vocabulary. Many of the terms that make up statistical
nomenclature are familiar; some, such as sample, proportion, and average, are commonly
used in everyday language. Statistics is a science of decision making, and decisions should
be based on the available data. So the first question is: what is data? “Data is a
collection of information”. The information may be quantitative or qualitative. Therefore, we
categorize data into two main types:
i. Qualitative Data
ii. Quantitative Data
EXPLANATORY, RESPONSE VARIABLES
Let's define an explanatory variable and a response variable. Explanatory variables
are called independent variables, or X variables. Response variables are called dependent
variables or Y variables. The explanatory variable is one which is used as a predictor of the
response variable. In statistical data analysis we want to reveal the effect of the explanatory
variable on the response variable. In other words, our approach is to know how response is
influenced by the explanatory variable. For example, one might study how the effectiveness
of a remedial mathematics program (the response variable) is influenced by factors such as
the length of the program, textbook, gender, classroom conditions, the characteristics of the
instructor, and the method of instruction, which are all potential explanatory variables.
THE CHOICE OF STATISTICAL ANALYSIS
Regression methods describe the relationship between the response variable and one or
more explanatory variables. Regression is usually presented for a continuous response
(dependent, y) variable and continuous explanatory (independent, x) variables, and most
statistical methods we have learned depend on continuous data. However, we sometimes have
binary responses such as 'yes or no', 'male or female', or 'success or failure'. When the
responses are binary, they should be treated as categorical data, and the number of
responses in each category should be counted. When explanatory variables are not continuous
but categorical, e.g. dichotomous, dummy variables are used to distinguish between the
groups. This is called the regression approach to ANOVA. On the other hand, when the
response variable is discrete, taking on two (binary) or more categorical values, the
logistic regression model is considered.
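The dummy-variable idea can be illustrated outside SPSS with a small sketch. The Python code below is only an illustration under stated assumptions: the scores and group labels are made-up numbers, and fit_simple_ols is a hypothetical helper, not an SPSS procedure. It shows that regressing a continuous response on a 0/1 dummy variable gives an intercept equal to the mean of the group coded 0 and a slope equal to the difference between the two group means, which is the comparison a one-way ANOVA with two groups makes.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Return the least-squares (intercept, slope) for y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical test scores for two teaching methods (made-up numbers).
scores_a = [62.0, 70.0, 66.0, 74.0]  # group coded 0
scores_b = [78.0, 82.0, 75.0, 85.0]  # group coded 1

# Dummy variable: 0 for the first group, 1 for the second.
dummy = [0] * len(scores_a) + [1] * len(scores_b)
response = scores_a + scores_b

intercept, slope = fit_simple_ols(dummy, response)
mean_a, mean_b = mean(scores_a), mean(scores_b)

print("intercept:", intercept, "mean of group 0:", mean_a)
print("slope:", slope, "difference in means:", mean_b - mean_a)
```

With more than two groups the same approach uses one dummy variable per group except a reference group, and the fit then needs multiple regression rather than this one-predictor helper.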
Compute Variable: (AML Survival.sav)
Compute Variable is a command in the Transform menu. It is a very powerful command used for
manipulating data. For example, if we wish to find the average weight of each student based
on three measurements of weight, we can use the Compute command. Similarly, we can use it to
square any quantitative variable entered in the data, to apply a transformation such as the
log transformation, to generate random numbers from various distributions, or to generate
serial numbers.
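The kinds of derived variables described above can be sketched in a few lines of Python. This is not SPSS syntax and does not use the AML Survival.sav file; the records, the column names (w1, w2, w3, avg_weight and so on), and the seed are all assumptions made for illustration.

```python
import math
import random
from statistics import mean

# Hypothetical records: three weight measurements (kg) per student.
students = [
    {"w1": 52.0, "w2": 53.0, "w3": 51.0},
    {"w1": 60.5, "w2": 61.0, "w3": 60.0},
    {"w1": 47.0, "w2": 46.5, "w3": 47.5},
]

# Fixed seed so the random column is reproducible from run to run.
rng = random.Random(2013)

for serial, row in enumerate(students, start=1):
    row["serial"] = serial                                        # serial number
    row["avg_weight"] = mean([row["w1"], row["w2"], row["w3"]])   # average of three measurements
    row["avg_sq"] = row["avg_weight"] ** 2                        # square of a variable
    row["log_avg"] = math.log(row["avg_weight"])                  # natural log transformation
    row["z_random"] = rng.gauss(0.0, 1.0)                         # draw from a standard normal

for row in students:
    print(row["serial"], row["avg_weight"], row["avg_sq"], round(row["log_avg"], 4))
```

Each new key plays the role of a target variable in the Compute Variable dialog: it is calculated case by case from the existing variables.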
Explore Command: (Serum Cholesterol Changes.sav)
The Explore command is a very useful tool for studying the nature (behavior) of data.
Before applying any statistical technique for testing a hypothesis or fitting a model, such
as the simple linear regression model, multiple linear regression model, nonlinear model, or
logistic regression model, one must first check the basic underlying assumptions of that
technique. For example, to use the one-sample t-test we first have to assess whether the data
come from a normal (symmetric) distribution. For that we usually draw a histogram and a
normal probability plot, and use the Kolmogorov-Smirnov test of goodness of fit. This
command can be understood through the data file Serum Cholesterol Changes.sav.
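As a rough illustration of what such a normality check measures, the Python sketch below computes the one-sample Kolmogorov-Smirnov distance between a sample and a normal distribution fitted to it, together with the sample skewness as a simple symmetry measure. The data are made-up numbers, not the contents of Serum Cholesterol Changes.sav, and the function names are hypothetical. The sketch reports only the test statistic: because the mean and standard deviation are estimated from the same data, the usual Kolmogorov-Smirnov tables do not apply directly and a corrected significance level (such as the Lilliefors correction) is needed.

```python
from statistics import NormalDist, mean, stdev


def ks_statistic_normal(data):
    """Largest gap between the empirical CDF and a normal CDF fitted to the data."""
    xs = sorted(data)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = fitted.cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at x; check both sides.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d


def sample_skewness(data):
    """Moment-based skewness; values near 0 suggest a symmetric distribution."""
    m = mean(data)
    n = len(data)
    m2 = sum((x - m) ** 2 for x in data) / n
    m3 = sum((x - m) ** 3 for x in data) / n
    return m3 / m2 ** 1.5


# Hypothetical changes in serum cholesterol (made-up numbers).
changes = [12.0, -4.0, 8.0, 15.0, 3.0, -1.0, 10.0, 6.0, 9.0, 2.0]

d_stat = ks_statistic_normal(changes)
skew = sample_skewness(changes)
print("K-S distance:", round(d_stat, 4))
print("skewness:", round(skew, 4))
```

A small distance and a skewness close to zero are consistent with normality, but they should be read alongside the histogram and the normal probability plot rather than on their own.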