A picture, they say, is worth a thousand words! Whenever you collect data to analyze a problem or to quantify a process, you should consider plotting the data to see what it is actually telling you. There are many different charts to choose from, so you need to pick the right one. This module gives a high-level overview of which chart to pick for a given data set.
The chart you pick depends on the type of data you have and on the objective of plotting it: what is the question you are trying to answer? Continuous data and discrete data each have their own set of charts. The following flow charts can help you pick the right chart or charts to analyze your data set.
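The selection logic described in this article can be sketched as a small lookup table in Python. This is only an illustration: the objective labels and the `suggest_charts` function are hypothetical names, and filing the correlation and Pareto cases under continuous and discrete data respectively is an assumption.

```python
# Hypothetical lookup table summarizing the guidance in this article.
# Keys are (data type, objective); the labels are illustrative, not a standard.
CHART_GUIDE = {
    ("continuous", "stability"): ["time series chart", "run chart", "control chart"],
    ("continuous", "compare groups"): ["box plot", "dot plot", "individual value plot"],
    ("continuous", "distribution"): ["probability plot", "histogram"],
    ("continuous", "sources of variation"): ["box plot", "multi-vari chart"],
    ("continuous", "correlation"): ["scatter plot"],   # assumed continuous
    ("discrete", "compare groups"): ["bar chart", "pie chart"],
    ("discrete", "vital few"): ["Pareto chart"],       # assumed discrete
}


def suggest_charts(data_type, objective):
    """Return candidate charts for a data type and objective, or an empty list."""
    return CHART_GUIDE.get((data_type.lower(), objective.lower()), [])


print(suggest_charts("continuous", "stability"))
print(suggest_charts("discrete", "compare groups"))
```

An unknown combination returns an empty list rather than raising an error, so you can treat "no suggestion" as a prompt to revisit the question you are asking.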
If you have continuous data and you want to determine whether a process is stable, you can pick the time series chart, the run chart, or the control chart. A time series chart helps you judge stability visually, while run charts and control charts give you specific markers or indicators (such as run tests or control limits) that tell you whether the process is stable.
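As a minimal sketch of the kind of marker a control chart provides, the Python below computes the center line and control limits of an individuals chart from the average moving range. The choice of an individuals chart, the function names, and the sample readings are assumptions made for illustration. This article does not prescribe a particular control chart.

```python
from statistics import mean


def individuals_limits(values):
    """Return (lower limit, center line, upper limit) for an individuals chart."""
    center = mean(values)
    # Moving range: absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = mean(moving_ranges)
    # 2.66 is 3 / d2, with d2 = 1.128 for moving ranges of two observations.
    return center - 2.66 * mr_bar, center, center + 2.66 * mr_bar


def out_of_control(values):
    """List (index, value) pairs that fall outside the control limits."""
    lower, _, upper = individuals_limits(values)
    return [(i, v) for i, v in enumerate(values) if v < lower or v > upper]


# Made-up readings with one unusually high value.
readings = [10.1, 9.8, 10.0, 10.2, 9.9, 10.1, 12.5, 10.0, 9.9, 10.1]
print(individuals_limits(readings))
print(out_of_control(readings))
```

Points outside the limits are only one signal of instability. Run rules, which look for patterns such as long runs on one side of the center line, are not covered in this sketch.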
If your data was collected in groups, for example sales data collected for different regions of the country (North, South, East, and West), and you want to compare the groups, you could use the box plot, the dot plot, or the individual value plot.
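A box plot is built from a five-number summary of each group. The sketch below computes that summary per region with Python's standard library. The sales figures and the `five_number_summary` helper are made up for illustration, and quartile conventions differ between tools, so the values may not match your charting software exactly.

```python
from statistics import quantiles


def five_number_summary(values):
    """Minimum, quartiles, and maximum: the numbers a box plot is drawn from."""
    # statistics.quantiles uses the "exclusive" method by default.
    q1, median, q3 = quantiles(values, n=4)
    return {"min": min(values), "q1": q1, "median": median, "q3": q3, "max": max(values)}


# Hypothetical monthly sales by region.
sales = {
    "North": [120, 135, 128, 140, 150, 132, 138],
    "South": [95, 110, 102, 99, 115, 108, 104],
    "East": [130, 128, 145, 160, 155, 149, 152],
    "West": [88, 92, 97, 85, 90, 101, 94],
}

for region, values in sales.items():
    print(region, five_number_summary(values))
```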
If you want to determine the nature of your distribution, for example to check whether the data follows a normal distribution, you could use either the probability plot or the histogram. For a more rigorous statistical analysis, you would perform a normality test.
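To show what a normal probability plot actually draws, the sketch below pairs each sorted observation with the standard normal quantile expected at its rank. If the data is roughly normal, these points fall close to a straight line. The plotting position `(i - 0.5) / n` is one common convention among several, and the function name and sample data are assumptions for illustration.

```python
from statistics import NormalDist


def normal_probability_points(values):
    """Return (theoretical quantile, observed value) pairs for a probability plot."""
    ordered = sorted(values)
    n = len(ordered)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting position (i - 0.5) / n is an assumed convention; others exist.
    return [
        (standard_normal.inv_cdf((i - 0.5) / n), x)
        for i, x in enumerate(ordered, start=1)
    ]


# Made-up measurements.
measurements = [9.8, 10.4, 10.1, 9.6, 10.0, 10.3, 9.9, 10.2]
for quantile, observed in normal_probability_points(measurements):
    print(f"{quantile:6.2f}  {observed}")
```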
If you are more interested in where the variability is coming from, for example you have too much variation in the manufacture of a product and you want to know whether it comes from the machine that makes it, the time of day it is made, the operator who makes it, and so on, then you could use either the box plot or the multi-vari chart.
If you have two sets of data and want to determine whether they are correlated, you could use the scatter plot. For example, if you have humidity values and product quality readings, a scatter plot of one against the other shows whether there is any correlation between humidity and product quality.
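A scatter plot shows the relationship visually. A common numeric companion is the Pearson correlation coefficient, sketched below in plain Python. The humidity and quality values are invented for illustration, and the coefficient only measures linear association, so it complements the plot rather than replacing it.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross products and sums of squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)


# Made-up readings: quality drops as humidity rises.
humidity = [30, 40, 50, 60, 70]
quality = [98, 95, 93, 90, 86]
print(round(pearson_r(humidity, quality), 3))
```

A value near -1 or +1 suggests a strong linear relationship, while a value near 0 suggests little linear relationship.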
If your data is discrete and you want to compare various groups, then either a bar chart or a pie chart would be useful. A bar chart focuses on the absolute value of the data, while a pie chart focuses on the relative value or proportion of the data, since the total adds up to one full circle (100%).
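The difference between the two views comes down to which number you plot. In the sketch below, the raw counts are what a bar chart would show and the percentage shares are what a pie chart would show. The category names and counts are hypothetical.

```python
def counts_and_shares(counts):
    """Map each category to (absolute count, percentage share of the total)."""
    total = sum(counts.values())
    return {name: (count, 100 * count / total) for name, count in counts.items()}


# Hypothetical complaint counts by category.
complaints = {"billing": 30, "delivery": 50, "quality": 20}
for name, (count, share) in counts_and_shares(complaints).items():
    print(f"{name}: count={count}, share={share:.1f}%")
```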
If you have a lot of categories in your data and want to separate the vital few from the trivial many (the 80-20 rule), then you could use the Pareto chart to identify the most important factors or causes.
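The calculation behind a Pareto chart can be sketched as follows: sort the categories from largest to smallest, accumulate the percentage of the total, and keep the categories needed to reach a cutoff. The 80% threshold, the function names, and the defect counts are assumptions for illustration.

```python
def pareto_table(counts):
    """Return (category, count, cumulative percent) rows, largest count first."""
    total = sum(counts.values())
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    rows, running = [], 0
    for category, count in ordered:
        running += count
        rows.append((category, count, 100 * running / total))
    return rows


def vital_few(counts, threshold=80.0):
    """Categories needed, in descending order, to reach the cumulative threshold."""
    selected = []
    for category, _, cumulative in pareto_table(counts):
        selected.append(category)
        if cumulative >= threshold:
            break
    return selected


# Made-up defect counts.
defects = {"scratch": 45, "dent": 30, "misalignment": 12, "discoloration": 8, "other": 5}
for row in pareto_table(defects):
    print(row)
print(vital_few(defects))
```

On a Pareto chart, the counts become the bars and the cumulative percentages become the line drawn over them.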
Whenever you collect data, it is recommended that you chart it in order to interpret what the data is telling you. However, you should be aware that interpretations based on the chart alone can be prone to mistakes; to make an actual decision, you will need to perform additional statistical analysis. Use the chart to identify potential trends, which can then be validated with further analysis. Use it also as a check that the analysis you perform is correct and that you are not misinterpreting the results or drawing the wrong conclusions.
To create any of the charts mentioned in this article, refer to https://www.sigmamagic.com/.