📊 Statistics - Class 9

Complete notes on data collection, representation, and measures of central tendency

1. Introduction to Statistics

Statistics is the branch of mathematics that deals with the collection, organization, analysis, interpretation, and presentation of data. It helps us make sense of large amounts of information and draw meaningful conclusions.

📖 What is Statistics?

Statistics is the science of collecting, organizing, analyzing, and interpreting numerical data to make decisions or predictions.

Data: Facts or pieces of information collected for analysis.

2. Collection of Data

2.1 Types of Data

Type            Description                                                     Example
Primary Data    Data collected directly by the investigator for the first time  Survey, Questionnaire, Interview
Secondary Data  Data collected from published or unpublished sources            Census reports, Books, Websites

2.2 Basic Terms

  • Raw Data: Data collected in its original form without any arrangement.
  • Ungrouped Data: Data that has not been organized into groups or classes.
  • Grouped Data: Data organized into groups or class intervals.
  • Frequency: The number of times a particular observation occurs.

3. Presentation of Data

3.1 Frequency Distribution Table

📖 Frequency Distribution

A frequency distribution is a table that shows how data is distributed across different values or class intervals.

📝 Example: Ungrouped Frequency Distribution

Q: Marks obtained by 20 students: 5, 8, 5, 6, 7, 8, 5, 9, 6, 7, 8, 5, 6, 7, 8, 9, 5, 6, 7, 8

Marks  Tally   Frequency
5      ||||/   5
6      ||||    4
7      ||||    4
8      ||||/   5
9      ||      2
Total          20

(||||/ stands for a bundle of five: four strokes crossed by a fifth.)
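
The tally above can be checked with a short Python sketch. It uses `collections.Counter` from the standard library; the names `marks` and `frequency` are illustrative and not part of the original notes.

```python
from collections import Counter

# Marks of the 20 students from the example above
marks = [5, 8, 5, 6, 7, 8, 5, 9, 6, 7, 8, 5, 6, 7, 8, 9, 5, 6, 7, 8]

# Counter maps each distinct mark to the number of times it occurs
frequency = Counter(marks)

print("Marks  Tally   Frequency")
for mark in sorted(frequency):
    count = frequency[mark]
    # One stroke per occurrence (no crossed bundles in this simple version)
    print(f"{mark:<6} {'|' * count:<7} {count}")
print(f"Total          {sum(frequency.values())}")
```

The total of the frequency column should always equal the number of observations, which is a quick way to catch a counting slip.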

3.2 Grouped Frequency Distribution

⚠️ Key Terms

Class Interval: A range of values in which data is grouped. Example: 0-10, 10-20

Class Size/Width: Difference between upper and lower class limits. Class size = Upper limit - Lower limit

Class Mark: Mid-value of a class. Class Mark = (Lower limit + Upper limit) / 2

Class Limits: The end values of a class interval (Lower and Upper limits)

📝 Example: Grouped Frequency Distribution

Q: Marks of 30 students in a test (out of 50):

Class Interval (Marks)  Class Mark  Frequency
0-10                    5           3
10-20                   15          5
20-30                   25          8
30-40                   35          9
40-50                   45          5
Total                               30

Class Size = 10 - 0 = 10 (for first class)
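
As a rough illustration, the class marks and class sizes in the table can be computed directly from the key-term formulas. The helper names `class_mark` and `class_size` are made up for this sketch.

```python
# Class intervals (lower limit, upper limit) and frequencies from the table above
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
frequencies = [3, 5, 8, 9, 5]

def class_mark(lower, upper):
    """Mid-value of a class: (lower limit + upper limit) / 2."""
    return (lower + upper) / 2

def class_size(lower, upper):
    """Width of a class: upper limit - lower limit."""
    return upper - lower

class_marks = [class_mark(lo, hi) for lo, hi in classes]
class_sizes = [class_size(lo, hi) for lo, hi in classes]

for (lo, hi), mark, f in zip(classes, class_marks, frequencies):
    print(f"{lo}-{hi}: class mark = {mark}, frequency = {f}")
print("Total frequency:", sum(frequencies))
```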

4. Graphical Representation of Data

4.1 Bar Graph

  • Used to represent ungrouped data, such as categories or individual values.
  • Bars are drawn with equal width and equal gaps.
  • Height of bar represents the frequency.
  • Bars do not touch each other.

4.2 Histogram

  • Used to represent grouped continuous data.
  • Bars are drawn adjacent to each other (no gaps).
  • Width of bar represents the class interval.
  • Height represents the frequency when all classes have the same size.
  • Area of each bar is proportional to the frequency, so for unequal class sizes the heights must be adjusted.

4.3 Frequency Polygon

  • A graph formed by joining the mid-points of the tops of histogram bars.
  • Can be drawn with or without histogram.
  • The polygon is closed by adding zero-frequency classes at both ends.

⚠️ Steps to Draw Frequency Polygon

1. Calculate class marks for each class interval.

2. Mark class marks on the X-axis and frequencies on the Y-axis, then plot the point (class mark, frequency) for each class.

3. Join the points with straight lines.

4. Close the polygon by joining the end points to the class marks of the zero-frequency classes just before the first class and just after the last class.
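
The steps above can be sketched in Python by computing the vertices of the polygon. This example reuses the grouped table from section 3.2 and assumes all classes have the same size. The variable names are illustrative, and drawing the actual graph is left to graph paper or a plotting tool.

```python
# Class intervals and frequencies from the grouped table in section 3.2
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
frequencies = [3, 5, 8, 9, 5]

# Assumption: every class has the same size, so the first one is representative
width = classes[0][1] - classes[0][0]

# Step 1: class marks
class_marks = [(lo + hi) / 2 for lo, hi in classes]

# Steps 2-3: points (class mark, frequency), to be joined in this order
points = list(zip(class_marks, frequencies))

# Step 4: imagined zero-frequency classes just before and just after the data
start = (class_marks[0] - width, 0)
end = (class_marks[-1] + width, 0)
polygon = [start] + points + [end]

for x, y in polygon:
    print(f"({x}, {y})")
```

The first vertex lands at a class mark of -5, which belongs to the imagined class from -10 to 0. It is only a construction for closing the polygon, not a real mark.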

5. Measures of Central Tendency

📖 Central Tendency

A measure of central tendency is a single value that represents the center or typical value of a data set. The three main measures are Mean, Median, and Mode.

5.1 Mean (Arithmetic Average)

📖 Mean

Mean is the sum of all observations divided by the total number of observations.

Mean (x̄) = Sum of all observations / Number of observations = Σxᵢ / n

📝 Example: Mean of Raw Data

Q: Find the mean of: 4, 6, 8, 10, 12

Solution:

Sum = 4 + 6 + 8 + 10 + 12 = 40

Number of observations (n) = 5

Mean = 40/5 = 8

Therefore, Mean = 8
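
The same calculation as a minimal Python sketch: first straight from the definition, then with `statistics.mean` from the standard library as a cross-check. The variable names are illustrative.

```python
from statistics import mean

observations = [4, 6, 8, 10, 12]

# Definition: sum of all observations / number of observations
manual_mean = sum(observations) / len(observations)

# Cross-check with the standard library
library_mean = mean(observations)

print("Sum =", sum(observations))
print("n =", len(observations))
print("Mean =", manual_mean)
```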

5.2 Mean for Ungrouped Frequency Distribution

📖 Formula

Mean (x̄) = Σfᵢxᵢ / Σfᵢ

Where fᵢ = frequency and xᵢ = observation

📝 Example: Mean of Frequency Distribution

Q: Find the mean from the following data:

x      f   fx
5      4   20
10     6   60
15     8   120
20     2   40
Total  20  240

Solution:

Mean = Σfx / Σf = 240/20 = 12

Therefore, Mean = 12
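
A short Python sketch of the same working, building the fx column and then dividing the two totals. The names `values`, `freqs`, and `fx` are chosen for this illustration only.

```python
# Observations (x) and their frequencies (f) from the table above
values = [5, 10, 15, 20]
freqs = [4, 6, 8, 2]

# The fx column: each observation multiplied by its frequency
fx = [f * x for x, f in zip(values, freqs)]

sum_f = sum(freqs)   # Σf
sum_fx = sum(fx)     # Σfx

mean = sum_fx / sum_f

print("fx column:", fx)
print(f"Mean = {sum_fx}/{sum_f} = {mean}")
```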

5.3 Median

📖 Median

Median is the middle value of a data set when arranged in ascending or descending order.

For odd number of observations (n):

Median = ((n+1)/2)th observation

For even number of observations (n):

Median = [(n/2)th + ((n/2)+1)th observations] / 2

📝 Example: Median (Odd Number)

Q: Find the median of: 3, 7, 2, 9, 5

Solution:

Arrange in ascending order: 2, 3, 5, 7, 9

n = 5 (odd)

Median = ((5+1)/2)th = 3rd observation = 5

Therefore, Median = 5

📝 Example: Median (Even Number)

Q: Find the median of: 4, 8, 2, 10, 6, 12

Solution:

Arrange in ascending order: 2, 4, 6, 8, 10, 12

n = 6 (even)

Median = [(6/2)th + ((6/2)+1)th] / 2

Median = (3rd + 4th observations) / 2 = (6 + 8) / 2 = 7

Therefore, Median = 7
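
Both cases can be combined into one small function. This is a sketch of the rules above, not the only way to write it. The function name `median` is our own choice, and Python's built-in `statistics.median` should give the same results.

```python
def median(observations):
    """Middle value of the data after arranging it in ascending order."""
    ordered = sorted(observations)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    if n % 2 == 1:
        # Odd n: the ((n+1)/2)th observation (list positions start at 0)
        return ordered[(n + 1) // 2 - 1]
    # Even n: average of the (n/2)th and ((n/2)+1)th observations
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2

odd_result = median([3, 7, 2, 9, 5])
even_result = median([4, 8, 2, 10, 6, 12])

print("Median (odd n):", odd_result)
print("Median (even n):", even_result)
```

Sorting happens inside the function, so forgetting to arrange the data first cannot cause a wrong answer here.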

5.4 Mode

📖 Mode

Mode is the value that occurs most frequently in a data set.

A data set can have no mode, one mode (unimodal), two modes (bimodal), or more (multimodal).

📝 Example: Mode

Q: Find the mode of: 2, 4, 4, 6, 4, 8, 6, 4

Solution:

Frequency of 2 = 1

Frequency of 4 = 4 (highest)

Frequency of 6 = 2

Frequency of 8 = 1

Therefore, Mode = 4
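
A possible Python sketch for finding the mode, including the bimodal and no-mode cases. The function name `modes` is made up here. Reporting "no mode" when every distinct value is tied is an assumed convention, since the notes do not define that case precisely.

```python
from collections import Counter

def modes(observations):
    """Return a sorted list of the most frequent values (empty if no mode)."""
    counts = Counter(observations)
    if not counts:
        return []
    highest = max(counts.values())
    most_frequent = sorted(v for v, c in counts.items() if c == highest)
    # Assumed convention: if all distinct values are tied, there is no mode
    if len(counts) > 1 and len(most_frequent) == len(counts):
        return []
    return most_frequent

example = modes([2, 4, 4, 6, 4, 8, 6, 4])
print("Mode(s):", example)
```

With this convention, `modes([1, 1, 2, 2, 3])` returns two values (bimodal) and `modes([1, 2, 3])` returns an empty list (no mode).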

6. Comparison of Mean, Median, and Mode

Measure  Advantages                                            Disadvantages
Mean     Uses all data values; unique value                    Affected by extreme values (outliers)
Median   Not affected by extreme values; good for skewed data  Doesn't use all values; arrangement needed
Mode     Easy to find; shows most common value                 May not exist; may have multiple modes
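
To see the effect of an outlier noted in the comparison above, here is a small Python sketch. The two data sets are invented purely for illustration and do not come from the notes.

```python
from statistics import mean, median

# Hypothetical data sets: the second replaces 18 with an extreme value, 100
normal_data = [10, 12, 14, 16, 18]
with_outlier = [10, 12, 14, 16, 100]

mean_normal = mean(normal_data)
median_normal = median(normal_data)
mean_outlier = mean(with_outlier)
median_outlier = median(with_outlier)

print("Without outlier: mean =", mean_normal, ", median =", median_normal)
print("With outlier:    mean =", mean_outlier, ", median =", median_outlier)
```

One extreme value pulls the mean from 14 up to 30.4, while the median stays at 14.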

⚠️ Empirical Relationship

For a moderately skewed distribution:

Mode = 3 × Median - 2 × Mean

This relationship gives an approximate value of one measure when the other two are known.
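
The empirical relationship can be written as a pair of small helper functions. The function names and the sample values (mean 12, median 13) are hypothetical and chosen only to show the arithmetic. The results are estimates, not exact values.

```python
def estimated_mode(mean, median):
    """Empirical estimate: Mode = 3 × Median - 2 × Mean."""
    return 3 * median - 2 * mean

def estimated_median(mean, mode):
    """The same relation rearranged: Median = (Mode + 2 × Mean) / 3."""
    return (mode + 2 * mean) / 3

# Hypothetical values for a moderately skewed distribution
mode_value = estimated_mode(mean=12, median=13)
median_value = estimated_median(mean=12, mode=mode_value)

print("Estimated mode:", mode_value)
print("Median recovered from mean and mode:", median_value)
```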

📚 Quick Formula Sheet - Statistics

Mean (Average)

Raw Data: x̄ = Σxᵢ / n

Frequency: x̄ = Σfᵢxᵢ / Σfᵢ

Sum divided by count

Median (Middle Value)

Odd n: ((n+1)/2)th term

Even n: Average of (n/2)th and ((n/2)+1)th terms

Middle value when arranged

Mode (Most Frequent)

Value with highest frequency

Empirical relation (moderately skewed data): Mode = 3×Median - 2×Mean

Most common value

Class Interval

Class Mark = (Upper + Lower) / 2

Class Size = Upper - Lower

For grouped data

💡 Study Tips

• Always arrange data in order before finding median.

• Use cumulative frequencies to locate the median in a frequency distribution.

• Estimate the mean before calculating; the answer must lie between the smallest and largest observations.

• Remember: the mean is affected by outliers; the median is not.

• For grouped data, use class marks as representative values.

🔑 Common Mistakes to Avoid

  • Forgetting to arrange data in order before finding median
  • Confusing class limits with class boundaries
  • Using wrong formula for odd/even number of observations
  • Not including all values when calculating mean