Module III: Statistical Techniques I – Complete Formulas with Clear Explanation - Math-4

1. Measures of Central Tendency

Measure	Formula (Raw Data)	Formula (Grouped Data)	Remarks
Arithmetic Mean (\(\bar{x}\))	\(\bar{x} = \frac{\Sigma x_i}{n}\)	\(\bar{x} = \frac{\Sigma f_i x_i}{\Sigma f_i}\)	Most common average
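Median	For odd n: middle term after arranging in order; for even n: average of the two middle terms	Median = \( l + \left(\frac{N/2 - C}{f}\right) \times h \)	l = lower limit of median class, h = class width, f = frequency of median class, C = cumulative freq. before median class, \(N = \Sigma f_i\)
Mode	Value with highest frequency	Mode = \( l + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times h \)	l = lower limit of modal class, \(f_1\) = frequency of modal class, \(f_0\), \(f_2\) = frequencies of the preceding and following classes, h = class width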
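
The grouped-data formulas in the table above can be checked with a minimal Python sketch. The class limits and frequencies below are hypothetical sample data, and the function names are illustrative only. Equal class widths are assumed.

```python
# Hypothetical grouped data: classes 0-10, 10-20, ..., 40-50 (equal width h = 10).
lower_limits = [0, 10, 20, 30, 40]
freqs = [5, 8, 12, 10, 5]
h = 10

def grouped_mean(lower_limits, freqs, h):
    # x_i is taken as the class midpoint: l + h/2
    mids = [l + h / 2 for l in lower_limits]
    return sum(f * x for f, x in zip(freqs, mids)) / sum(freqs)

def grouped_median(lower_limits, freqs, h):
    N = sum(freqs)
    cum = 0  # C: cumulative frequency before the current class
    for l, f in zip(lower_limits, freqs):
        if cum + f >= N / 2:  # first class whose cumulative freq. reaches N/2
            return l + (N / 2 - cum) / f * h
        cum += f

def grouped_mode(lower_limits, freqs, h):
    i = max(range(len(freqs)), key=lambda k: freqs[k])  # modal class index
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0               # class before modal class
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0  # class after modal class
    # Note: the formula is undefined when 2*f1 - f0 - f2 == 0
    return lower_limits[i] + (f1 - f0) / (2 * f1 - f0 - f2) * h

mean = grouped_mean(lower_limits, freqs, h)
median = grouped_median(lower_limits, freqs, h)
mode = grouped_mode(lower_limits, freqs, h)
print(mean, median, mode)
```

For this sample the mean is 25.5, the median is about 25.83 (median class 20-30, C = 13, f = 12), and the mode is about 26.67.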

2. Moments

Raw moment about origin:
\(\mu'_r = \frac{\Sigma x_i^r}{n}\) (ungrouped)
\(\mu'_r = \frac{\Sigma f_i x_i^r}{N}\) (grouped, with \(N = \Sigma f_i\))
Central moment about mean (\(\mu_r\)):
\(\mu_r = \frac{\Sigma (x_i - \bar{x})^r}{n}\) (ungrouped)
\(\mu_r = \frac{\Sigma f_i (x_i - \bar{x})^r}{N}\) (grouped)

Important central moments:

\(\mu_1 = 0\) (always)
\(\mu_2 =\) Variance \(= \sigma^2\)
\(\mu_3 \to\) used for skewness
\(\mu_4 \to\) used for kurtosis
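
As a quick check of the ungrouped definitions above, here is a small Python sketch on a hypothetical data set. The helper names are illustrative. The last line uses the standard identity \(\mu_2 = \mu'_2 - (\mu'_1)^2\), which is not stated above but follows from expanding the square.

```python
def raw_moment(xs, r):
    # r-th moment about the origin: sum(x^r) / n
    return sum(x ** r for x in xs) / len(xs)

def central_moment(xs, r):
    # r-th moment about the mean: sum((x - mean)^r) / n
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** r for x in xs) / len(xs)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample, mean = 5

mu1 = central_moment(data, 1)  # always 0
mu2 = central_moment(data, 2)  # variance
mu3 = central_moment(data, 3)  # feeds into skewness
mu4 = central_moment(data, 4)  # feeds into kurtosis
print(mu1, mu2, mu3, mu4)

# Cross-check: variance from raw moments
variance_from_raw = raw_moment(data, 2) - raw_moment(data, 1) ** 2
print(variance_from_raw)
```

For this sample, \(\mu_2 = 4\), \(\mu_3 = 5.25\) and \(\mu_4 = 44.5\).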

3. Moment Generating Function (M.G.F.)

Definition:

\[ M(t) = E(e^{tx}) = \Sigma e^{tx} p(x) \quad \text{(discrete)} \]
\[ M(t) = \int e^{tx} f(x)\, dx \quad \text{(continuous)} \]

Properties:

\(M(0) = 1\)
Raw moments: \(\mu'_r = \frac{d^r M(t)}{dt^r} \bigg|_{t=0}\)
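Cumulant generating function (C.G.F.): \(K(t) = \ln M(t)\); its derivatives at \(t = 0\) give the cumulants, with \(\kappa_1 =\) mean, \(\kappa_2 = \mu_2\), \(\kappa_3 = \mu_3\)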
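
Worked example (an illustration, not taken from the syllabus text): let \(x\) take the value 1 with probability \(p\) and 0 with probability \(q = 1 - p\). Then

\[ M(t) = \Sigma e^{tx} p(x) = q + p e^{t}, \qquad M(0) = q + p = 1 \]
\[ \mu'_1 = \frac{dM}{dt}\bigg|_{t=0} = p, \qquad \mu'_2 = \frac{d^2 M}{dt^2}\bigg|_{t=0} = p \]
\[ \mu_2 = \mu'_2 - (\mu'_1)^2 = p - p^2 = pq \]

The same variance follows from the logarithm of the M.G.F., \(\ln M(t) = \ln(q + p e^{t})\), whose second derivative at \(t = 0\) is also \(pq\).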

4. Skewness (Measure of Asymmetry)

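Karl Pearson’s moment coefficients (\(\beta\) and \(\gamma\)):
\(\beta_1 = \frac{\mu_3^2}{\mu_2^3}\)
\(\gamma_1 = \pm\sqrt{\beta_1} = \frac{\mu_3}{\sigma^3}\) (sign taken from \(\mu_3\); the range ≈ –3 to +3 applies to Pearson’s \(3(\text{Mean} - \text{Median})/\sigma\), not to \(\gamma_1\))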
Bowley’s coefficient (quartile based):
Skewness = \(\frac{Q_3 + Q_1 - 2 \text{Median}}{Q_3 - Q_1}\)

Interpretation:

\(\gamma_1 > 0 \to\) positively skewed (tail on right)
\(\gamma_1 < 0 \to\) negatively skewed
\(\gamma_1 = 0 \to\) symmetric
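
Both skewness measures above can be sketched in Python. The data set and the quartile values passed to `bowley_skewness` are hypothetical, and the quartiles are supplied directly because textbooks differ on how to compute them from raw data.

```python
def central_moment(xs, r):
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** r for x in xs) / len(xs)

def moment_skewness(xs):
    # gamma_1 = mu_3 / sigma^3, where sigma^3 = mu_2^(3/2)
    mu2 = central_moment(xs, 2)
    mu3 = central_moment(xs, 3)
    return mu3 / mu2 ** 1.5

def bowley_skewness(q1, median, q3):
    # (Q3 + Q1 - 2*Median) / (Q3 - Q1)
    return (q3 + q1 - 2 * median) / (q3 - q1)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample
gamma1 = moment_skewness(data)
bowley = bowley_skewness(20, 28, 40)  # hypothetical quartiles
print(gamma1, bowley)
```

Here \(\gamma_1 = 5.25 / 8 = 0.65625\) and Bowley’s coefficient is \(0.2\); both are positive, so both indicate a longer right tail.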

5. Kurtosis (Measure of Peakedness)

Coefficient:
\(\beta_2 = \frac{\mu_4}{\mu_2^2}\)
\(\gamma_2 = \beta_2 - 3\)

Interpretation:

\(\gamma_2 > 0 \to\) Leptokurtic (sharper than normal)
\(\gamma_2 < 0 \to\) Platykurtic (flatter)
\(\gamma_2 = 0 \to\) Mesokurtic (normal curve)
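
A short Python sketch of the kurtosis coefficients and the three-way classification above. The data set, the function names, and the tolerance used to decide "equal to zero" are assumptions made for illustration.

```python
def central_moment(xs, r):
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** r for x in xs) / len(xs)

def kurtosis_coefficients(xs):
    # beta_2 = mu_4 / mu_2^2 and gamma_2 = beta_2 - 3
    mu2 = central_moment(xs, 2)
    mu4 = central_moment(xs, 4)
    beta2 = mu4 / mu2 ** 2
    return beta2, beta2 - 3

def classify(gamma2, tol=1e-9):
    # tol is an arbitrary cutoff for treating gamma_2 as zero
    if gamma2 > tol:
        return "leptokurtic"
    if gamma2 < -tol:
        return "platykurtic"
    return "mesokurtic"

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample
beta2, gamma2 = kurtosis_coefficients(data)
kind = classify(gamma2)
print(beta2, gamma2, kind)
```

For this sample \(\beta_2 = 44.5 / 16 = 2.78125\), so \(\gamma_2 = -0.21875\) and the distribution is classed as platykurtic.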

6. Curve Fitting – Method of Least Squares

Principle: Minimize \(\Sigma (y_i - Y_i)^2\) where \(Y_i\) = predicted value.

Curve Type	Normal Equations	Final Equation
Straight line: \(y = a + bx\)	\(\Sigma y = na + b\Sigma x\); \(\Sigma xy = a\Sigma x + b\Sigma x^2\)	\(b = \frac{n\Sigma xy - \Sigma x \Sigma y}{n\Sigma x^2 - (\Sigma x)^2}\); \(a = \bar{y} - b\bar{x}\)
Parabola: \(y = a + bx + cx^2\)	\(\Sigma y = na + b\Sigma x + c\Sigma x^2\); \(\Sigma xy = a\Sigma x + b\Sigma x^2 + c\Sigma x^3\); \(\Sigma x^2 y = a\Sigma x^2 + b\Sigma x^3 + c\Sigma x^4\)	Solve the three equations for a, b, c
Exponential: \(y = a e^{bx}\)	Take ln: \(\ln y = \ln a + bx\); let \(Y = \ln y\), \(A = \ln a\), then fit \(Y = A + bx\)	Same as straight line on transformed data; recover \(a = e^{A}\)
Geometric (power): \(y = ax^b\)	Take ln: \(\ln y = \ln a + b \ln x\); let \(Y = \ln y\), \(X = \ln x\), \(A = \ln a\), then fit \(Y = A + bX\)	Fit a straight line between \(\ln y\) and \(\ln x\); recover \(a = e^{A}\)
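
The straight-line and exponential rows of the table can be sketched in Python as follows. The data points are hypothetical, and `fit_line` / `fit_exponential` are illustrative names. The exponential data are generated from \(y = 2e^{0.5x}\), so the fit should recover roughly those constants.

```python
import math

def fit_line(xs, ys):
    # Closed-form solution of the two normal equations for y = a + b*x
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    a = sy / n - b * sx / n  # a = y_bar - b * x_bar
    return a, b

def fit_exponential(xs, ys):
    # y = a*e^(b*x): fit ln y = A + b*x, then a = e^A (requires all y > 0)
    A, b = fit_line(xs, [math.log(y) for y in ys])
    return math.exp(A), b

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]  # hypothetical observations
a, b = fit_line(xs, ys)
print(a, b)

ys_exp = [2 * math.exp(0.5 * x) for x in xs]  # synthetic exponential data
a_exp, b_exp = fit_exponential(xs, ys_exp)
print(a_exp, b_exp)
```

For the first data set, \(\Sigma x = 15\), \(\Sigma y = 20\), \(\Sigma xy = 66\), \(\Sigma x^2 = 55\), giving \(b = 30/50 = 0.6\) and \(a = 4 - 0.6 \times 3 = 2.2\). Note that fitting on \(\ln y\) minimizes squared errors in the logarithm, not in \(y\) itself.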

7. Correlation Analysis

Karl Pearson’s coefficient of correlation (\(r\)):
\[ r = \frac{\Sigma (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\Sigma (x_i - \bar{x})^2 \Sigma (y_i - \bar{y})^2}} = \frac{n\Sigma xy - \Sigma x \Sigma y}{\sqrt{(n\Sigma x^2 - (\Sigma x)^2)(n\Sigma y^2 - (\Sigma y)^2)}} \]
Properties: \(-1 \leq r \leq 1\)
Rank Correlation (Spearman’s):
\(\rho = 1 - \frac{6 \Sigma d_i^2}{n(n^2 - 1)}\) (valid when there are no tied ranks)
where \(d_i = \text{Rank}(x_i) - \text{Rank}(y_i)\)
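
Both correlation formulas above can be sketched in Python. The data are hypothetical, the function names are illustrative, and the rank helper assumes there are no tied values.

```python
import math

def pearson_r(xs, ys):
    # Computational form: (n*Sxy - Sx*Sy) / sqrt((n*Sxx - Sx^2)(n*Syy - Sy^2))
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))

def ranks(values):
    # Rank 1 = smallest value; assumes no ties
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def spearman_rho(xs, ys):
    n = len(xs)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

r = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
rho = spearman_rho([10, 20, 30, 40, 50], [15, 12, 33, 28, 41])
print(r, rho)
```

In the second data set the y-ranks are 2, 1, 4, 3, 5 against x-ranks 1 to 5, so \(\Sigma d_i^2 = 4\) and \(\rho = 1 - 24/120 = 0.8\). The first data set gives \(r = 30/\sqrt{1500} \approx 0.7746\).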

8. Regression Analysis

Regression equation of y on x: \(y = a + b x\)

\(b = b_{yx} = r \cdot \frac{\sigma_y}{\sigma_x} = \frac{n\Sigma xy - \Sigma x \Sigma y}{n\Sigma x^2 - (\Sigma x)^2}\)
\(a = \bar{y} - b \bar{x}\)

Important Properties:

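\(b_{yx} \cdot b_{xy} = r^2\), where \(b_{xy} = r \cdot \frac{\sigma_x}{\sigma_y}\) is the slope of the regression of x on y
\(r = \pm\sqrt{b_{yx} \cdot b_{xy}}\) (sign same as that of the b’s, which always share a common sign)
In magnitude, the correlation coefficient is the geometric mean of the two regression coefficients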
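
The relation between the regression coefficients and \(r\) can be verified numerically with a short Python sketch. The data are hypothetical and the function name is illustrative.

```python
import math

def regression_coefficients(xs, ys):
    # b_yx: slope of y on x; b_xy: slope of x on y
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    num = n * sxy - sx * sy
    return num / (n * sxx - sx ** 2), num / (n * syy - sy ** 2)

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]  # hypothetical observations

b_yx, b_xy = regression_coefficients(xs, ys)

# r has the magnitude sqrt(b_yx * b_xy) and the common sign of the slopes
r = math.copysign(math.sqrt(b_yx * b_xy), b_yx)

# Intercept of the y-on-x line: a = y_bar - b_yx * x_bar
a = sum(ys) / len(ys) - b_yx * sum(xs) / len(xs)
print(b_yx, b_xy, r, a)
```

For this sample \(b_{yx} = 0.6\) and \(b_{xy} = 1.0\), so \(r^2 = 0.6\) and \(r \approx 0.7746\); the regression line of y on x is \(y = 2.2 + 0.6x\).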

Summary Table of Key Formulas

Concept	Formula
Mean	\(\bar{x} = \Sigma x / n\)
Variance	\(\sigma^2 = \Sigma (x - \bar{x})^2 / n\)
Pearson’s r	\(r = \frac{n\Sigma xy - \Sigma x\Sigma y}{\sqrt{(n\Sigma x^2-(\Sigma x)^2)(n\Sigma y^2-(\Sigma y)^2)}}\)
Rank Correlation \(\rho\)	\(1 - \frac{6\Sigma d^2}{n(n^2-1)}\)
Regression slope (y on x)	\(b = \frac{n\Sigma xy - \Sigma x\Sigma y}{n\Sigma x^2 - (\Sigma x)^2}\)
Skewness (\(\gamma_1\))	\(\mu_3 / \sigma^3\)
Kurtosis (\(\gamma_2\))	\(\mu_4 / \sigma^4 - 3\)
Relation between r and b’s	\(r^2 = b_{yx} \cdot b_{xy}\)

These are the standard formulas and concepts of Module III (Statistical Techniques I) as it appears in many Indian university syllabi (Anna University, Mumbai University, etc.). Practice numerical problems with them to prepare for exams.