What is regression? Who first used regression? What is simple regression? What is multiple regression?
Regression is the most popular data analysis technique in econometrics. Let us start from the beginning: what is regression? Simply put, regression analysis deals with describing and evaluating the relationship between a dependent variable and one or more independent variables. We commonly denote the dependent variable by Y and the independent variables by X1, X2, etc.
Who first used regression? The term was first used by Sir Francis Galton (1822-1911). He studied the relationship between the heights of parents and the heights of their children in England, and observed that tall parents tend to have tall children.
Coming back to our discussion: the dependent variable is denoted by Y and the independent variables by X1, X2, etc. Let k be the number of independent variables. If k = 1, there is only one independent variable and we call it simple regression. If k > 1, there is more than one independent variable and we call it multiple regression.
Example of simple regression
Suppose we want to know the impact of advertising on sales. We have only one explanatory variable, so this is a simple regression.
Y = sales
X = advertising
We denote the relationship as
Y = f(X)
where f(X) is a function of X.
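To make Y = f(X) concrete, here is a minimal sketch that assumes f is a straight line, Y = a + bX, fitted by ordinary least squares. The linear form, the function name fit_simple_regression and the data are assumptions made for illustration; the numbers are invented and do not come from any real data set.

```python
# Hypothetical data, invented for illustration only.
advertising = [10.0, 20.0, 30.0, 40.0, 50.0]   # X
sales = [27.0, 43.0, 66.0, 84.0, 105.0]        # Y


def fit_simple_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = sum of cross-deviations divided by sum of squared deviations of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    # The fitted line passes through the point of means.
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_simple_regression(advertising, sales)
print(f"sales = {intercept:.2f} + {slope:.2f} * advertising")
```

The slope is read as the estimated change in sales for one extra unit of advertising, under the assumed linear form.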
Example of Multiple regression
We will discuss multiple regression with the help of an example. Suppose we want to know the relationship between family consumption expenditure and family size, family financial assets and family income.
Here we have
Y = family consumption expenditure
X1 = family size
X2 = family financial assets
X3 = family income
In this example we have more than one independent variable, so we call it multiple regression.
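The following sketch shows one way such a model could be estimated, assuming a linear form Y = b0 + b1 X1 + b2 X2 + b3 X3 and ordinary least squares solved through the normal equations. The helper names and the household figures are hypothetical; the consumption values are generated from known coefficients (with no error term) only so that the result is easy to check.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_multiple_regression(rows, y):
    """Return [b0, b1, ..., bk] for the least-squares fit y = b0 + sum(bj * xj)."""
    design = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(design[0])
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical households: (family size, financial assets, family income)
households = [
    (2, 10.0, 30.0),
    (3, 5.0, 40.0),
    (4, 20.0, 50.0),
    (5, 15.0, 45.0),
    (3, 25.0, 60.0),
    (6, 8.0, 55.0),
]
# Consumption generated as 5 + 2*size + 0.1*assets + 0.5*income (assumed values).
consumption = [5 + 2 * s + 0.1 * a + 0.5 * inc for s, a, inc in households]

coefficients = fit_multiple_regression(households, consumption)
print([round(c, 3) for c in coefficients])
```

Each coefficient is read as the effect of its own X variable on Y while the other X variables are held fixed.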
Classification of variables in regression
The dependent variable and the independent variables go by several names:
Names for Y | Names for X
Predictand | Predictors
Regressand | Regressors
Explained variable | Explanatory variables
Dependent variable | Independent variables
Effect variable | Causal variables
Endogenous variable | Exogenous variables
Target variable | Control variables
Specification of relationships
When we run a regression we find two types of relationships:
1. Deterministic or mathematical relationship
2. Statistical relationship
1. Deterministic relationship
In a deterministic relationship we can find the exact effect of X on Y. The following example shows how a deterministic relationship works, using the equation
Y = 2500 + 100X - X^2
This equation uses Y as the dependent variable (sales) and X as the independent variable (advertising). The following table gives values of X and the corresponding values of Y.
X (independent variable) | Y (dependent variable)
0 | 2500
20 | 4100
50 | 5000
100 | 2500
From the above table we can easily calculate the effect of advertising on sales.
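As a quick check, the table values can be reproduced directly from the equation Y = 2500 + 100X - X^2. The sketch below also computes the marginal effect of advertising, dY/dX = 100 - 2X, which is the derivative of that equation; the function names are hypothetical.

```python
def deterministic_sales(advertising):
    """Deterministic relationship from the text: Y = 2500 + 100X - X^2."""
    return 2500 + 100 * advertising - advertising ** 2


def marginal_effect(advertising):
    """Change in sales per extra unit of advertising: dY/dX = 100 - 2X."""
    return 100 - 2 * advertising


for x in (0, 20, 50, 100):
    print(x, deterministic_sales(x), marginal_effect(x))
```

Every X gives exactly one Y, and the marginal effect falls to zero at X = 50, where sales reach their highest value.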
2. Statistical relationship
First we write an equation for the statistical relationship:
Y = 2500 + 100X - X^2 + u
where we suppose
u = +500 with probability 1/2
u = -500 with probability 1/2
Here we are not sure about the exact effect of advertising on sales: for any given X, sales may be 500 above or 500 below the value given by the equation, whereas in the deterministic relationship we have exact values. u is called the error term or disturbance.
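The statistical relationship can be illustrated with a small simulation. This is only a sketch: the function names and the fixed random seed are assumptions made for illustration, while the equation and the two values of the error term come from the text.

```python
import random


def deterministic_sales(x):
    """Deterministic part of the relationship: 2500 + 100X - X^2."""
    return 2500 + 100 * x - x ** 2


def draw_error(rng):
    """Error term u: +500 or -500, each with probability 1/2."""
    return rng.choice([500, -500])


def observed_sales(x, rng):
    """Statistical relationship: deterministic part plus the error term."""
    return deterministic_sales(x) + draw_error(rng)


rng = random.Random(0)  # fixed seed, an arbitrary choice for reproducibility
for x in (0, 20, 50, 100):
    print(x, deterministic_sales(x), observed_sales(x, rng))

# Over many draws the average of u is close to zero, so average sales
# stay close to the deterministic value.
draws = [draw_error(rng) for _ in range(10_000)]
mean_u = sum(draws) / len(draws)
print(mean_u)
```

For the same X, repeated runs give different values of Y, which is exactly what separates the statistical relationship from the deterministic one.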
WHY DO WE USE AN ERROR TERM?
1. Unpredictable elements of randomness in human responses. For example, suppose the dependent variable is consumption expenditure and the independent variable is the disposable income of a household. Households do not behave like machines: in one month people may go on a spending spree, while in the next they may be tightfisted.
2. The effect of a large number of omitted variables. In our example, disposable income is not the only factor that affects consumption expenditure; family tastes, family size, etc. also matter. The error term is a catch-all for the effect of all these variables.
3. Measurement error in Y.
Assumptions of a good regression
- There should be no multicollinearity in the data.
- The error term should not be correlated with other error terms or with lagged error terms (no autocorrelation).
- The regression line must fit the data well.
- Each independent variable should significantly affect the dependent variable individually.
- All independent variables should also jointly affect the dependent variable.
- The signs of all coefficients must agree with the literature or theory.
- The residuals must be normally distributed.
- For time series analysis, the data must be stationary at level.
What happens if we do not fulfill the above assumptions? We cannot rely on the results, because they will be spurious. For reliable results you must fulfill the above-mentioned assumptions.
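Two of these assumptions can be checked with simple diagnostics. The sketch below is a rough illustration, not a full testing procedure: it uses the pairwise correlation between two independent variables as a first hint of multicollinearity, and the Durbin-Watson statistic as a hint of first-order autocorrelation in the residuals (values near 2 suggest none). The data and residuals are hypothetical and invented for illustration.

```python
def correlation(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared changes over sum of squares."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical independent variables and regression residuals.
family_income = [30.0, 40.0, 50.0, 45.0, 60.0, 55.0]
financial_assets = [10.0, 5.0, 20.0, 15.0, 25.0, 8.0]
residuals = [0.5, -0.3, 0.2, -0.4, 0.1, -0.1]

# A correlation close to +1 or -1 would point to multicollinearity.
print(round(correlation(family_income, financial_assets), 3))
# The statistic lies between 0 and 4.
print(round(durbin_watson(residuals), 3))
```

In practice a statistical package would also report formal tests for the remaining assumptions, such as normality of the residuals and stationarity.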



