Wednesday, 21 October 2015

Regression analysis

What is regression? Who first used the term? What is simple regression? What is multiple regression?
Regression is among the most widely used data analysis techniques in econometrics. Let us start from the beginning: what is regression? The simple answer is that regression analysis deals with describing and evaluating the relationship between a dependent variable and one or more independent variables. We commonly denote the dependent variable by Y and the independent variables by X1, X2, etc.
Who first used the term "regression"? It was used by Sir Francis Galton (1822-1911). He studied the relationship between the heights of parents and the heights of their children in England, and observed that tall parents tend to have tall children.
Coming back to our discussion, let k be the number of independent variables. If k = 1 there is only one independent variable, and we call it simple regression. If k > 1 there is more than one independent variable, and we call it multiple regression.
Example of simple regression
Suppose we want to know the impact of advertising on sales. There is only one explanatory variable, so this is a simple regression.

Y = sales
X = advertisement
We write the relationship as
Y = f(X)
where f(X) is a function of X.
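
To make Y = f(X) concrete, here is a minimal sketch that assumes f is a straight line, Y = a + bX, and estimates a and b by ordinary least squares. The function name and the numbers are invented for illustration only; they are not real advertising or sales data.

```python
def fit_simple_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = sum of cross-deviations divided by sum of squared deviations of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical figures, constructed so that sales = 100 + 2 * advertisement.
advertisement = [10, 20, 30, 40, 50]
sales = [120, 140, 160, 180, 200]

intercept, slope = fit_simple_regression(advertisement, sales)
print(intercept, slope)
```

Because the made-up data lie exactly on a line, the fit recovers the intercept and slope used to build them; with real data the points would scatter around the line.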

Example of multiple regression
We will discuss multiple regression with the help of an example. Suppose we want to know the relationship between family consumption expenditure and family size, family financial assets and family income.
Here we have
Y          = Family consumption expenditure
X1        = Family size
X2        = Family financial assets
X3        = Family income
In this example there is more than one independent variable, so we call it multiple regression.
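
As a rough sketch of how such a model could be estimated, the code below fits Y = b0 + b1*X1 + b2*X2 + b3*X3 by least squares using the normal equations. It assumes a linear form, which the example above does not state, and every function name and number in it is hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy [a | b] so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_multiple_regression(rows, y):
    """Return [b0, b1, ..., bk] solving the normal equations (X'X) b = X'y."""
    design = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical households: (family size, financial assets, income).
households = [(2, 10, 30), (3, 5, 40), (4, 20, 35), (5, 15, 60), (3, 25, 50), (6, 8, 45)]
# Consumption is built as 5 + 2*size + 0.1*assets + 0.5*income so the fit can be checked.
consumption = [5 + 2 * s + 0.1 * a + 0.5 * i for s, a, i in households]

coefficients = fit_multiple_regression(households, consumption)
print([round(c, 4) for c in coefficients])
```

In practice a statistics package would do this step; the hand-rolled solver is only there to keep the sketch self-contained.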
Classification of variables in regression
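The two kinds of variable go by several names. Each row pairs a name for Y (left) with the matching name for the X variables (right).
Predictand              Predictors
Regressand              Regressors
Explained variable      Explanatory variables
Dependent variable      Independent variables
Effect variable         Causal variables
Endogenous variable     Exogenous variables
Target variable         Control variables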

Specification of relationships
The relationship between Y and X can be specified in two ways:
1.      Deterministic or mathematical relationship
2.      Statistical relationship
1. Deterministic relationship
In a deterministic relationship we can find the exact effect of X on Y. The following equation shows how a deterministic relationship works.
Y = 2500 + 100X - X^2
This equation relates the dependent variable Y (sales) to the independent variable X (advertising). The following table gives values of X and Y.
X (independent variable)    Y (dependent variable)
0                           2500
20                          4100
50                          5000
100                         2500
From the table above we can easily calculate the effect of advertising on sales.
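
The table values can be reproduced with a few lines of code, using the quadratic form Y = 2500 + 100X - X^2 that matches every row. The function names are illustrative, and the marginal effect 100 - 2X is simply the derivative of that formula, added here as one possible way to read "the effect of advertising".

```python
def deterministic_sales(x):
    """Sales implied by Y = 2500 + 100X - X^2, with no error term."""
    return 2500 + 100 * x - x ** 2


def marginal_effect(x):
    """Derivative dY/dX = 100 - 2X: change in sales per extra unit of advertising."""
    return 100 - 2 * x


for x in (0, 20, 50, 100):
    print(x, deterministic_sales(x), marginal_effect(x))
```

The marginal effect is positive below X = 50 and negative above it, which is why sales in the table rise to 5000 and then fall back to 2500.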
2. Statistical relationship
For a statistical relationship we add a random term u to the equation:
Y = 2500 + 100X - X^2 + u
where we suppose
u = +500 with probability 1/2
u = -500 with probability 1/2
Here we are not sure about the exact effect of advertising on sales: for any X, sales may be 500 above or 500 below the value given by the deterministic part, whereas in the deterministic relationship we have exact values. The term u is called the error term or disturbance.
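
A small simulation can illustrate the difference. The sketch below draws the error term as +500 or -500 with equal probability and adds it to the deterministic part 2500 + 100X - X^2; the function name, the seed and the choice of X = 20 are arbitrary assumptions made for the example.

```python
import random


def statistical_sales(x, rng):
    """Sales from Y = 2500 + 100X - X^2 + u, where u is +500 or -500 with probability 1/2 each."""
    u = rng.choice([500, -500])
    return 2500 + 100 * x - x ** 2 + u


rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [statistical_sales(20, rng) for _ in range(6)]
print(draws)
```

Every draw is either 3600 or 4600, that is, 500 below or above the deterministic value of 4100 at X = 20.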
WHY DO WE USE AN ERROR TERM?
1.      Unpredictable element of randomness in human responses
e.g. the dependent variable is consumption expenditure and the independent variable is the household's disposable income. Households do not behave like machines: in one month they may go on a spending spree, while in the next they may be tightfisted.
2.      Effect of a large number of omitted variables
In our example, disposable income is not the only factor that affects consumption expenditure; family tastes, family size and so on also matter. The error term is a catchall for the effects of all these variables.
3.      Measurement error in Y.
Assumptions of a good regression
-  There should be no multicollinearity in the data.
-  The error term should not be correlated with other error terms or with lagged error terms (no autocorrelation).
-  The regression line should fit the data well.
-  Each independent variable should significantly affect the dependent variable individually.
-  All the independent variables should also affect the dependent variable jointly.
-  The signs of the coefficients should agree with theory and the literature.
-  The residuals should be normally distributed.
-  For time series analysis, the data should be stationary at level.
What happens if these assumptions are not fulfilled? We cannot rely on the results, because they may be spurious. For reliable results, the assumptions above must be satisfied.
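
Two of these checks can be sketched in a few lines. The code below computes the Durbin-Watson statistic, a common screen for first-order autocorrelation in residuals, and the Pearson correlation between two regressors as a rough screen for multicollinearity. These particular diagnostics are an assumption of this example (the list above does not name specific tests), and all the numbers are hypothetical.

```python
import math


def durbin_watson(residuals):
    """Sum of squared successive differences divided by sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


def pearson_correlation(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical residuals from a fitted regression, in time order.
residuals = [0.5, -0.3, 0.2, -0.4, 0.1]
# Hypothetical regressors: family size and family income.
family_size = [2, 3, 4, 5, 6]
family_income = [30, 40, 35, 60, 45]

print(durbin_watson(residuals))
print(pearson_correlation(family_size, family_income))
```

A Durbin-Watson value near 2 suggests no first-order autocorrelation, and a correlation close to +1 or -1 between two regressors is a warning sign of multicollinearity. Neither number is a complete test on its own.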
