Dispersion Measures In Statistics

Dispersion measures are important because they describe the variability found in a given sample or population. In a sample, dispersion matters because it determines the error we incur when making inferences about measures of central tendency, such as the mean.

In a data distribution, dispersion measures play a very important role. These measures complement the central position measures, characterizing the variability of the data.

Measures of central tendency indicate the values around which the data tend to cluster. They are used to describe and infer the behavior of variables in populations and samples. Examples include the arithmetic mean, the mode, and the median (1).

Dispersion measures complement these measures of central tendency because they characterize the variability of the data. Their relevance in statistical training was raised by Wild and Pfannkuch (1999).

The perception of data variability is one of the basic components of statistical thinking. It tells us how the data are dispersed in relation to an average.

The arithmetic mean is widely used in practice, but it can easily be misinterpreted. This happens when the values of the variable are very far apart. In these cases, the mean must be accompanied by dispersion measures (2).
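As a minimal sketch of why the mean alone can mislead, the following Python snippet compares two made-up data sets (the values are hypothetical, chosen only for illustration) that share the same mean but differ greatly in spread:

```python
from statistics import mean

# Hypothetical data: both groups have the same arithmetic mean,
# but the values in the second group are much further apart.
tight = [49, 50, 50, 51]
spread = [0, 10, 90, 100]

print(mean(tight), mean(spread))    # the two means are equal
print(max(tight) - min(tight))      # small gap between the extremes
print(max(spread) - min(spread))    # large gap between the extremes
```

The mean describes the first group well and the second one poorly, which is the situation a dispersion measure is meant to reveal.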

In dispersion measures, there are three important components related to random variability (2):

  • The perception of its ubiquity in the world around us.
  • The ability to explain it.
  • The ability to quantify it (which implies understanding and knowing how to apply the concept of dispersion).

What are dispersion measures for?

In a statistical study, when generalizing from a sample to a population, dispersion measures are very important because they directly determine the error we work with. The greater the dispersion in a sample, the larger the sample size needed to keep the same error.
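The link between dispersion and sample size can be sketched with a common textbook approximation for estimating a mean, n = (z · s / E)², where s is the standard deviation, E the tolerated margin of error and z the critical value (1.96 for roughly 95% confidence under a normal approximation). This formula and the function name `required_sample_size` are assumptions introduced for illustration; they do not come from the article.

```python
import math

def required_sample_size(std_dev, margin_of_error, z=1.96):
    """Smallest n for which z * std_dev / sqrt(n) <= margin_of_error.

    Assumes a normal approximation; z=1.96 corresponds to about 95% confidence.
    """
    return math.ceil((z * std_dev / margin_of_error) ** 2)

# Hypothetical values: same margin of error, doubled standard deviation.
print(required_sample_size(std_dev=5, margin_of_error=1))
print(required_sample_size(std_dev=10, margin_of_error=1))
```

Under these assumptions, doubling the standard deviation roughly quadruples the sample size needed for the same error.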

These measures also help us determine whether our data stray too far from the central value. They therefore tell us whether this central value adequately represents the study population.

These measures are very useful for comparing distributions and understanding risk in decision making (1). The greater the dispersion, the less representative the central value. These are the most used:

  • Amplitude.
  • Average deviation.
  • Variance.
  • Standard deviation.
  • Coefficient of variation.

Functions of each of the dispersion measures in statistics

Amplitude

The amplitude (also called the range) is recommended for a first comparison. It considers only the two extreme observations, so it is recommended only for small samples (1). It is defined as the difference between the last and the first values of the variable once they are ordered, that is, the maximum minus the minimum (3).
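A minimal sketch of this definition in Python, using a hypothetical helper named `amplitude` and made-up data:

```python
def amplitude(values):
    """Difference between the last and the first value of the ordered data."""
    ordered = sorted(values)
    return ordered[-1] - ordered[0]

data = [15, 4, 42, 8, 23, 16]   # hypothetical observations
print(amplitude(data))          # 42 - 4
```

Only the two extremes enter the result, so a single unusual observation changes it completely.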


Average deviation

The average deviation (also called the mean deviation) indicates where the data would be concentrated if all values were at the same distance from the arithmetic mean (1). The deviation of a value is the absolute difference between that value and the arithmetic mean of the series. The average deviation is the arithmetic mean of these deviations (3).
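The definition above can be sketched in Python as follows; the function name `mean_deviation` and the data are hypothetical:

```python
from statistics import mean

def mean_deviation(values):
    """Arithmetic mean of the absolute deviations from the arithmetic mean."""
    center = mean(values)
    deviations = [abs(x - center) for x in values]
    return mean(deviations)

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical observations, mean = 5
print(mean_deviation(data))
```

For this data set the absolute deviations are 3, 1, 1, 1, 0, 0, 2 and 4, which add up to 12 over 8 values.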

Variance

The variance is an algebraic function of all the values, which makes it suitable for inferential statistics (1). It is defined as the mean of the squared deviations from the arithmetic mean.
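As an illustration, the variance can be computed by hand and checked against Python's standard `statistics` module. The data are hypothetical. The article does not say whether it divides by n or by n - 1, so this sketch shows both the population variance (divide by n) and the sample variance (divide by n - 1), which is the usual choice for inference.

```python
from statistics import pvariance, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical observations
center = sum(data) / len(data)

# Squared deviation of each value from the arithmetic mean.
squared_deviations = [(x - center) ** 2 for x in data]

population_variance = sum(squared_deviations) / len(data)
sample_variance = sum(squared_deviations) / (len(data) - 1)

print(population_variance, pvariance(data))   # divide by n
print(sample_variance, variance(data))        # divide by n - 1
```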

Standard deviation

For samples drawn from the same population, the standard deviation is the most widely used measure (1). It is the square root of the variance (3), so it is expressed in the same units as the data.
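A short sketch of this relationship in Python, with hypothetical data and assuming the population form of the variance (the article does not specify which form it uses):

```python
import math
from statistics import pstdev, pvariance

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical observations

# Standard deviation as the square root of the (population) variance.
std_dev = math.sqrt(pvariance(data))

print(std_dev)
print(pstdev(data))   # same quantity computed directly by the library
```

The sample versions, `statistics.variance` and `statistics.stdev`, are related in the same way.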

Coefficient of variation

The coefficient of variation is used primarily to compare the variability of two sets of data measured in different units, for example the height and body weight of the students in a sample. It is used to determine in which distribution the data are more tightly grouped and the mean is more representative (1).


The coefficient of variation is better suited to such comparisons than the previous measures because it is a dimensionless number: it is the standard deviation divided by the arithmetic mean, so it is independent of the units in which the values are expressed. It is usually expressed as a percentage (3).
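The following sketch assumes the usual definition of the coefficient of variation, the standard deviation divided by the arithmetic mean, expressed as a percentage. The helper name, the use of the population standard deviation, and the height and weight figures are all assumptions made for illustration.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Standard deviation as a percentage of the arithmetic mean."""
    return pstdev(values) / mean(values) * 100

# Hypothetical student data in different units.
heights_cm = [160, 165, 170, 175, 180]
weights_kg = [50, 60, 70, 80, 90]

cv_height = coefficient_of_variation(heights_cm)
cv_weight = coefficient_of_variation(weights_kg)

print(f"height: {cv_height:.1f}%")
print(f"weight: {cv_weight:.1f}%")
```

With these made-up figures the weights vary more, relative to their mean, than the heights do, so the mean height is the more representative of the two. The coefficient is only meaningful when the mean is not zero and the variable is measured on a scale with a true zero.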

Thus, these dispersion measures indicate, on the one hand, the degree of variability in the sample. On the other hand, they indicate how representative the central value is: a small value means that the values are concentrated around this center.

In that case there is little variability in the data and the center adequately represents all of them. If the value obtained is large, the values are not concentrated but dispersed, so there is a lot of variability and the center is not very representative. Finally, when making inferences, we will need a larger sample size to reduce the error, which grows precisely because of the increase in variability.
