An interval plot is used to compare groups, much like a box plot or a dot plot, and it applies when the data are continuous. Instead of plotting the individual data points, an interval plot shows the confidence interval for the mean of each group. Typically, a 95% confidence interval is used, but any other confidence level can be specified. This type of plot is useful when you collect a sample from a large population and want to compare different groups. For example, suppose we want to compare the average height of two groups: men and women. We can randomly sample data from each group and then compare the groups using an interval plot. The plot indicates whether the two groups have similar mean values and also lets you compare the amount of variation present in each group.
How to create an interval plot
For each group, determine the sample average (xbar). This is an unbiased estimator of the true population mean (µ). Of course, each time we collect a sample of size n, the sample average is going to be different. Hence, the estimate of the population mean cannot be a single point but has to be a range that reflects the uncertainty involved. We provide a range within which we expect the population mean to lie. This range is called the confidence interval, and it is based on the variation present in the data and the amount of confidence you want in your analysis. If you specify a 95% confidence interval, then there is a 5% chance of error (α): if the sampling were repeated many times, about 95% of the intervals built this way would contain the true mean. Based on the variation in the data and the number of sample data points, we can estimate the standard error of the mean (s/sqrt(n), where s is the sample standard deviation). This value, along with a critical value from the t distribution, is used to determine the confidence interval. The critical t value depends on two things: the degrees of freedom (n-1) and the level of confidence required in the analysis. The confidence interval plotted for each group is:
xbar ± t(α/2, n-1) × s/sqrt(n)
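The calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name mean_confidence_interval, the sample data, and the constant T_95_DF9 are all assumptions made for this example, and the critical t value is taken from a standard t table instead of being computed.

```python
import math
import statistics


def mean_confidence_interval(sample, t_critical):
    """Return (mean, lower, upper) for the confidence interval of the mean.

    t_critical is the two-sided critical value of the t distribution for
    n - 1 degrees of freedom at the chosen confidence level.
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two data points")
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    standard_error = s / math.sqrt(n)
    margin = t_critical * standard_error
    return mean, mean - margin, mean + margin


# Hypothetical measurements for one group (made-up data, 10 points).
repair_times = [12.1, 11.8, 12.5, 12.0, 11.9, 12.3, 12.2, 11.7, 12.4, 12.1]

# Critical t value for 95% confidence and 9 degrees of freedom (t table).
T_95_DF9 = 2.262

mean, lower, upper = mean_confidence_interval(repair_times, T_95_DF9)
print(f"mean = {mean:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

If SciPy is available, the critical value could instead be obtained with scipy.stats.t.ppf(1 - alpha / 2, n - 1), which avoids looking it up in a table for each sample size.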
How to interpret the interval plot
The width of each interval indicates how much variation is present in the data. A narrow interval shows more consistent data and less variation, while a wide interval indicates more variation. Because the standard error also shrinks as the sample size grows, widths are best compared between groups of similar sample size. You can also check whether the intervals for different groups overlap. If the intervals overlap, the population means may be the same for those groups. If they do not overlap, the groups may be different at the level of confidence chosen for the analysis.
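The overlap check described above can be expressed as a small helper. This is only a sketch: the function name intervals_overlap and the three intervals are hypothetical, and comparing intervals by eye or by code is a rough screening rule rather than a formal hypothesis test.

```python
def intervals_overlap(first, second):
    """Return True if two (lower, upper) intervals share at least one point."""
    first_lower, first_upper = first
    second_lower, second_upper = second
    return first_lower <= second_upper and second_lower <= first_upper


# Hypothetical 95% confidence intervals as (lower, upper); values are made up.
group_a = (10.2, 11.4)
group_b = (11.0, 12.6)
group_c = (13.1, 14.0)

print(intervals_overlap(group_a, group_b))  # True: 11.0 to 11.4 is shared
print(intervals_overlap(group_a, group_c))  # False: group_c lies above group_a
```

When the function returns False, the groups may differ at the chosen confidence level. When it returns True, the means may be similar, and a formal test is the safer way to confirm either reading.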
Figure 1 shows the comparison of the time to repair a phone for two departments. From this figure, we can draw the following conclusions.
Department A repairs phones faster than Department B.
The amount of variation in repairing the phones is comparable for both departments.
Since there is no overlap between the confidence intervals, there may be a statistically significant difference between the two groups.
Figure 2 shows the comparison of the time to provide customers with an answer for call centers located in four different regions of the country. The following conclusions can be drawn from the interval plot:
Since the confidence intervals for the North and West regions overlap, we can conclude that the performance of the two regions may be similar.
The amount of time in the East region seems to be significantly higher compared to the other parts of the country.
The amount of variation for the West region seems to be significantly higher compared to the other regions.
The South region seems to be very consistent. Its average performance may be similar to that of the West region because their confidence intervals overlap, but it seems to be doing better than the North and East regions.