
Many statistical analyses assume that the data is normally distributed, for example the 1-sample t test, ANOVA, and regression. This is because the normal distribution has special properties that are used in deriving these tests. If the assumption of normality is not satisfied, the results of the analysis may be incorrect. Hence, when performing a statistical analysis, we should always be aware of any assumptions about the normality of the data and, if there are any, check them before using the analysis. Checking for the normality of the data was covered in a previous article.

Let's say that the data is not normally distributed. What can we do about it? How can we analyze this data using the appropriate statistical tools? If the data is close to normal, we may still assume normality; for mild departures, the analysis may still give good results. However, there may be instances where the departure from normality is significant. We will discuss three approaches to handle this situation.
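As a quick refresher (normality checks were covered in the previous article), a test such as the Shapiro-Wilk test can flag a significant departure from normality. Below is a minimal sketch in Python, assuming SciPy is available; the data is simulated purely for illustration.

```python
# Minimal normality-check sketch (simulated data; SciPy assumed available).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
data = rng.lognormal(mean=0.0, sigma=0.7, size=100)  # skewed, non-normal sample

stat, p_value = stats.shapiro(data)                  # Shapiro-Wilk test
print(f"Shapiro-Wilk p-value = {p_value:.4f}")
# A small p-value (e.g. < 0.05) indicates the data is not normally distributed.
```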
The first approach is to transform the data so that the transformed data follows a normal distribution. The Box-Cox transformation raises the raw data to a power:

y = x^lambda

where x is the raw data, y is the transformed data, and lambda is the transformation constant. If lambda = 1, there is no transformation. If lambda = 2, it is the square transformation, and so on; by convention, lambda = 0 corresponds to the natural log transformation. A computer can run through several values of lambda to find the one that makes the transformed data as close to normal as possible. The following table provides the names of some standard transformations:

Lambda    Transformation
 2        Square (y = x^2)
 0.5      Square root (y = sqrt(x))
 0        Natural log (y = ln(x))
-0.5      Reciprocal square root (y = 1/sqrt(x))
-1        Reciprocal (y = 1/x)
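As a rough illustration, here is a minimal sketch in Python, assuming SciPy is available. Note that scipy.stats.boxcox uses the scaled form y = (x^lambda - 1)/lambda rather than the plain power; the two differ only by a linear rescaling, which does not affect normality. The data is simulated for illustration.

```python
# Minimal Box-Cox sketch (simulated data; SciPy assumed available).
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.lognormal(mean=0.0, sigma=0.7, size=100)   # positive, right-skewed data

# boxcox searches over lambda to maximize the normal log-likelihood
y, lam = stats.boxcox(x)
print(f"estimated lambda          = {lam:.3f}")
print(f"normality p-value (raw)   = {stats.shapiro(x).pvalue:.4f}")
print(f"normality p-value (trans) = {stats.shapiro(y).pvalue:.4f}")
```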
The second approach is the Johnson transformation, which selects one of three families of distributions and transforms the data accordingly. The general form is:

Y = gamma + eta * f((X - epsilon) / lambda)

where f(z) = asinh(z) for the SU (unbounded) family, f(z) = ln(z / (1 - z)) for the SB (bounded) family, and f(z) = ln(z) for the SL (lognormal) family. Here Y is the transformed data, X is the raw data, and gamma, eta, epsilon, and lambda are the Johnson parameters. Decision rules have been formulated for selecting the appropriate Johnson family (SU, SB, or SL). Several algorithms are available to fit the Johnson parameters for a given data set. However, due to the complex nature of these algorithms, the solutions are not straightforward and appropriate software is needed to estimate the parameters. Similar to a Box-Cox transformation, a computer can run through several combinations of these Johnson parameters to determine which set makes the transformed data as close to normal as possible.
Since the Johnson transformation has more parameters to fit, it usually does a better job of transforming the data to a normal distribution than the Box-Cox transformation. However, as with the Box-Cox transformation, there is no guarantee that a Johnson transformation will succeed in transforming the data to normality.
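As an illustrative sketch, SciPy's johnsonsu distribution can fit the SU family directly; this assumes the SU family is appropriate for the data, and the parameter names follow SciPy's convention (the shape parameters a and b play the roles of gamma and eta above, while loc and scale play the roles of epsilon and lambda).

```python
# Minimal Johnson SU sketch (simulated data; SciPy assumed available).
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
x = rng.lognormal(mean=0.0, sigma=0.7, size=200)

# Fit the four SU parameters by maximum likelihood
a, b, loc, scale = stats.johnsonsu.fit(x)

# SU transform: if X ~ Johnson SU, then a + b*asinh((X - loc)/scale) ~ N(0, 1)
z = a + b * np.arcsinh((x - loc) / scale)
print(f"normality p-value (trans) = {stats.shapiro(z).pvalue:.4f}")
```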
It should be pointed out that when you transform the raw data using one of these transformations, the specification limits also need to be transformed if you need to calculate the process capability.
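For example, with a Box-Cox transformation, the fitted lambda must be applied to the specification limits as well, so that the data and the limits are on the same scale before computing Cpk. A minimal sketch follows; the specification limits and the data are hypothetical, used only to illustrate the mechanics.

```python
# Minimal sketch: transform spec limits with the same Box-Cox lambda
# before computing Cpk (hypothetical data and limits; SciPy assumed).
import numpy as np
from scipy import stats
from scipy.special import boxcox  # point-wise Box-Cox for scalar limits

rng = np.random.default_rng(4)
x = rng.lognormal(mean=0.0, sigma=0.3, size=200)
lsl, usl = 0.3, 3.5                      # hypothetical raw-scale spec limits

y, lam = stats.boxcox(x)                 # transform the data, fitting lambda
lsl_t = boxcox(lsl, lam)                 # transform the limits with the SAME lambda
usl_t = boxcox(usl, lam)

mu, sigma = y.mean(), y.std(ddof=1)
cpk = min(usl_t - mu, mu - lsl_t) / (3 * sigma)
print(f"Cpk on the transformed scale = {cpk:.2f}")
```

Because the scaled Box-Cox form is monotonically increasing for any lambda, the upper limit remains the upper limit after transformation, so the usual Cpk formula applies unchanged on the transformed scale.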
A Johnson transformation is also shown in the figure below. The transformed data is clearly normally distributed: the p-value of the normality test on the transformed data is 0.99, so normality is not rejected. The Johnson transformation did an excellent job of transforming the data to a normal distribution. We can now plot the I-MR chart for the transformed data, which shows that the process is in control (control charts are outside the scope of this article).
