Accuracy: How close a sample statistic is to the population parameter. If the mean of your measurements is close to the true value, the measurement system is considered accurate.
Alpha: A Type I error occurs when we reject a null hypothesis when it is true. The probability of committing a Type I error is called the significance level or alpha.
Alternative Hypothesis (Ha): The hypothesis that sample observations are influenced by some non-random cause; that is, there is a real difference between the populations being studied.
Attribute Data: Data with a quality characteristic (or attribute) that can be classified as either meeting or not meeting the product specification. Examples include counts and types of defects.
Benchmarking: The practice of assessing processes and/or performance either within the company or outside the company and identifying best practices.
Black Belt: A full-time resource trained in DMAIC or DMADV whose primary role is to lead projects. Note that in some cases, the resources may also work part-time on projects.
Box Plot: A box plot is a type of graph used to display patterns in quantitative data. A box plot splits the data into quartiles; data between the first and third quartiles are shown inside the box, and data outside that range are indicated by the whiskers or plotted as outliers.
Categorical Data: A variable that can take one of a limited, usually fixed, number of possible values. Examples include blood type, the country a person lives in, etc.
Cause & Effect Diagram: A tool used to graphically organize possible causes for a specific problem or effect. It is also called a fishbone or Ishikawa diagram.
Central Limit Theorem (CLT): The central limit theorem states that the sampling distribution of the mean of any independent random variable will be normal or nearly normal if the sample size is large enough.
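The CLT can be illustrated with a small simulation (a hypothetical sketch using only the Python standard library): sample means drawn from a decidedly non-normal population still cluster in a bell shape around the population mean.

```python
import random
import statistics

random.seed(42)

# Population: uniform on [0, 1], which is not normal. Population mean = 0.5.
# Take 5000 samples of size 50 and record each sample's mean.
sample_means = [
    statistics.mean(random.uniform(0, 1) for _ in range(50))
    for _ in range(5000)
]

# The means cluster tightly and symmetrically around 0.5,
# as the central limit theorem predicts.
print(round(statistics.mean(sample_means), 2))  # close to 0.5
```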
Charter: A document that outlines the purpose and plan for a project. It typically contains the problem, scope, goal, timeline, team members and expected benefits.
Check Sheet: A simple form for tracking events using tally marks to indicate the frequency of occurrence.
Cluster Sampling: It is a sampling method where the population is divided into N groups (or clusters) and we randomly select n clusters to include in the sample. Note that each element of the population can be assigned to only one cluster.
Common Cause Variation: A random distribution of change in a process measurement around the mean value of the data caused by unknown factors.
Confidence Level: Percentage of all possible samples that can be expected to include the true population parameter.
Containment: It is a temporary intervention used to minimize a problem’s impact on the customer while the improvement team works to resolve the issue with a permanent solution.
Continuous Data: Information measured on a continuum that can be meaningfully subdivided into infinitely small increments. Examples include time, temperature, weight, currency.
Control Chart: A time-ordered plot of data that includes a center-line and control limits and it is used to identify if a process is stable.
Correlation: The correlation coefficient measures the strength of association between two variables and lies between -1 and +1. A strong correlation is close to -1 or +1; a weak correlation is close to 0.
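A minimal sketch of computing the correlation coefficient (Pearson's r) for hypothetical paired data, using only the Python standard library:

```python
import math

# Hypothetical paired measurements (illustrative data only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.0, 9.8]

def pearson_r(xs, ys):
    # r = covariance / (stdev_x * stdev_y), computed from deviations.
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sx = math.sqrt(sum((a - mx) ** 2 for a in xs))
    sy = math.sqrt(sum((b - my) ** 2 for b in ys))
    return cov / (sx * sy)

print(pearson_r(x, y))  # close to +1: strong positive association
```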
Critical Customer Requirement (CCR): A key measurable characteristic of a product or process whose performance standards are dictated by the customer.
Dashboard: A display that summarizes key process measurements that directly affect the customers or the business. The objective of a dashboard is to provide a quick snapshot of what is working and what is not.
Data Type: The type of data we are working with can be classified as qualitative or quantitative. We can also classify the data as attribute, categorical, discrete or continuous data.
Discrete Data: Data that can take only certain values and cannot be meaningfully subdivided into smaller units. Example: the number of students in a class (integers).
DMADV: It is a five phase data driven Six Sigma methodology to design new products and services. Its phases are Define, Measure, Analyze, Design, and Validate.
DMAIC: It is a five phase data driven Six Sigma methodology to improve existing processes. Its phases are Define, Measure, Analyze, Improve, and Control.
Dummy Variable: In regression analysis, a dummy variable is a numeric variable that represents categorical data such as gender, race etc.
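A small illustration of dummy coding (hypothetical data): each category except a baseline level gets its own 0/1 column, which can then be used in a regression.

```python
# Hypothetical categorical column and its 0/1 dummy encoding.
# One level ("red") is dropped as the baseline to avoid multicollinearity.
colors = ["red", "green", "blue", "green", "red"]
levels = ["green", "blue"]

dummies = {
    level: [1 if c == level else 0 for c in colors]
    for level in levels
}
print(dummies["green"])  # [0, 1, 0, 1, 0]
```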
Experimental Design: A plan for assigning subjects to treatment conditions in a structured way so as to minimize the number of runs required to draw conclusions. The objective of an experiment may be to make inferences about the relationship between independent and dependent variables, or to determine the combination of inputs required to optimize the output(s).
Factor: In a designed experiment, a factor is an independent variable that is manipulated by the experimenter. The values a factor takes are called levels; a factor may have 2, 3, 4, or more levels. For example, baking temperature could be a factor with levels of 100 degrees and 150 degrees.
FMEA: Failure Modes and Effects Analysis is a disciplined approach used to identify possible failures of a product or service, prioritize the risks using severity, frequency of occurrence and detectability and take corrective actions to reduce the risk.
Gage R&R: Gage Repeatability and Reproducibility is a statistical tool used to measure the amount of variation in the measurement system arising from the measurement device and/or the people using the measurement system.
Green Belt: An individual trained in the DMAIC or DMADV methodologies who typically works part-time on Six Sigma projects within their own functional area.
Histogram: A histogram is a graphical representation of data made up of columns plotted as a graph. The continuous variable is broken up into bins that form the X-axis, and the height of each column represents the number of data points in that bin. It is typically used to identify the shape of the distribution.
Hypothesis Test: A set of statistical tools (such as t Test, F Test, ANOVA, etc.) used to determine whether the observed differences in samples are due to random chance or due to true differences between the samples.
Interaction Plot: An interaction plot is a line graph that reveals the presence or absence of interactions among independent variables.
Interquartile Range (IQR): The interquartile range is the difference between the third quartile and the first quartile (Q3-Q1) and is a measure of the variability in the data set.
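A quick sketch of computing the IQR with the Python standard library (hypothetical data); the same 1.5 × IQR fences are commonly used by box plots to flag outliers.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# method="inclusive" interpolates between data points.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1
print(q1, q3, iqr)  # 3.0 7.0 4.0

# Common box-plot convention: values beyond these fences are outliers.
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
```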
Master Black Belt (MBB): An MBB is an expert in DMAIC and/or DMADV methodology. An MBB typically trains and coaches Six Sigma resources such as Black Belts and Green Belts. MBB participates in project reviews and helps the team obtain resources and/or ensure rigor of the Six Sigma methodology is maintained.
Mean: Average value of the data set.
Measurement Systems Analysis (MSA): An experiment conducted to determine what portion of the total variation in the measured data comes from measurement error.
Median: The central value of a data set whose values are arranged in the increasing or decreasing order. 50% of the data points are below the median and 50% are above the median.
Mistake Proofing: A technique for eliminating or minimizing the impact of errors that cause defects. Also known as Poka-Yoke.
Mode: The most frequently occurring value in a data set. Note that if no value occurs more frequently than the others, a mode may not exist.
Multicollinearity: Multicollinearity refers to the extent to which independent variables are correlated. It can occur if one variable is correlated with another, or if one variable is a linear combination of two or more other independent variables. If multicollinearity is present, the test results can be misleading.
Multi-Generational Project Plan (MGPP): An MGPP is a strategic planning document that shows the various generations of product or processes used to reach a long-term vision.
Nonvalue Added Steps (NVA): Steps in the production and delivery of a product or service that are considered nonessential in meeting customer needs.
Normal Distribution: The normal distribution is a probability distribution that resembles a symmetric bell-shaped curve and is characterized by two parameters: mean and standard deviation.
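A brief check of the normal distribution's well-known 68/95/99.7 rule using the Python standard library:

```python
from statistics import NormalDist

# Standard normal: mean 0, standard deviation 1.
nd = NormalDist(mu=0, sigma=1)

# Probability of falling within 1 and 2 standard deviations of the mean.
within_1_sigma = nd.cdf(1) - nd.cdf(-1)   # about 68%
within_2_sigma = nd.cdf(2) - nd.cdf(-2)   # about 95%
print(round(within_1_sigma, 3))  # 0.683
```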
Null Hypothesis (Ho): The hypothesis that there is no significant difference between the specified populations, and that any observed difference is due to sampling or experimental error.
Operational Definition: A detailed description of a process, measurement or activity written to ensure common understanding.
Opportunity: Any characteristic of a unit that is inspected, measured, or tested and provides a chance of not meeting a customer requirement.
Outlier: A data point located beyond a specified range that represents a statistically unexpected event. It may be due to the process's inability to satisfy customer needs or to a data entry error.
Output Measure: An indicator of process performance captured at the end of a process and linked to the process's ability to satisfy customer requirements.
P Value: A P-value is a measure of the strength of the evidence against the null hypothesis. It is the probability of observing a test statistic at least as extreme as the one observed, assuming the null hypothesis is true. A small P-value (typically below alpha) leads to rejecting the null hypothesis.
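An illustrative sketch (hypothetical numbers) of computing a two-sided P-value for a one-sample z-test with the Python standard library:

```python
from statistics import NormalDist

# Hypothetical one-sample z-test: is the sample mean different from
# the hypothesized population mean of 100? Known sigma = 15, n = 36.
sample_mean, mu0, sigma, n = 106.0, 100.0, 15.0, 36

z = (sample_mean - mu0) / (sigma / n ** 0.5)   # test statistic
p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided P-value

# z = 2.4, p is about 0.016 < 0.05, so reject Ho at alpha = 0.05.
print(round(z, 2), round(p_value, 4))
```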
Pareto Chart: A graphical tool that prioritizes factors in order to separate the vital few causes from the trivial many issues.
Performance Standard: A boundary that defines acceptable values for the measurement of a product or process as determined by customer needs. It may be one-sided or two-sided. It is also called a Specification Limit.
Pilot: A test of a proposed solution on a small scale in a real business environment to verify that the improvement goal has been met and/or identify issues that must be addressed before a full-scale rollout of the solution.
Precision: Precision refers to how close estimates from different samples are to each other, i.e., the amount of variation present in our readings. If the variation is small, the readings are precise.
Probability Distribution: A probability distribution is a table or equation that links each outcome of a statistical experiment with its probability of occurrence. For example, for a coin toss, the probability of heads is 0.5 and tails is 0.5.
Process: A sequence or series of steps / activities that transforms one or more inputs into one or more outputs to meet customer needs.
Process Capability: The ability of a process to produce a defect-free product or service.
Process Map: An illustration of the steps, events, and operations in chronological order that make up a process.
Quality Function Deployment (QFD): A structured methodology used to translate customer requirements into key process deliverables.
Quartiles: Quartiles divide a rank-ordered data set into four equal parts. The values that divide the parts are called the first (Q1), second (Q2), and third (Q3) quartiles respectively.
Random Sample: Data collected in a manner such that each element of the population has an equal chance of being selected for measurement.
Randomization: Randomization refers to the practice of using chance methods to assign subjects to treatments. In this way, the impact of unknown noise factors on the experimental results are minimized.
Regression: A method of analysis that quantifies the relationship between the response (Y) variable and one or more predictors (X). The relationship is represented by a formula that can be used to model process performance. If there is only one X and one Y, it is called simple regression. If there are multiple X’s and one Y, it is called multiple regression.
Replication: Application of a completed Six Sigma project to another area of the company. In a designed experiment, replication refers to assigning each treatment to multiple subjects.
Robust Process: A process that is resistant to defects or minimizes the impact of process, input or environmental variation on process performance.
Root Cause: The underlying condition or factor that has been validated and shown to produce one or more problems or defects.
Run Chart: A graphing tool that shows data plotted in time order for a process, with some statistical tests applied to identify the presence of special causes.
Sampling: The practice of gathering a subset of the total data available from a process or a population to draw conclusions about the process or population.
Scatter Plot: A graphing tool that shows the relationship between two variables typically used to determine if there is a correlation between the two variables.
Segmentation: The division of a group, such as customers or markets, into smaller logical groups for analysis.
Sigma: The statistical term used to represent the amount of variation present in the population. Not to be confused with Sigma level which is a measure of the process performance based on customer requirements.
SIPOC: A high-level process map that shows Suppliers, Inputs, Process, Outputs, and Customers. It is typically a one-page form that helps the team identify the big picture, key customers, requirements etc.
Skewness: A statistical measure of the asymmetry of a distribution. If the distribution is symmetric, the skewness is 0. If one tail is longer than the other, the distribution is said to be skewed to the left or to the right.
Special Cause Variation: Shifts in the process output caused by a specific factor. This shift may be due to non-random assignable causes.
Specification Limits: The Upper Specification Limit (USL) and Lower Specification Limit (LSL) are typically obtained from the customer and define whether the process and/or product performance is acceptable.
Standard Deviation: A statistical measure of the amount of variation present in the data. It is a numerical value that indicates how widely the individual values in a group vary from the group mean. A large standard deviation indicates large variation.
Standard Score: It indicates how many standard deviations an element is from the mean value. The standard score can be calculated by subtracting the mean value and dividing by the standard deviation.
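A small example of computing a standard score (z-score) for a hypothetical data set:

```python
import statistics

# Hypothetical data set; a z-score says how many standard deviations
# a value sits from the mean.
data = [4, 8, 6, 5, 3, 2, 8, 9, 2, 5]
mean = statistics.mean(data)     # 5.2
stdev = statistics.stdev(data)   # sample standard deviation

# Standard score of the value 9: subtract the mean, divide by the stdev.
z = (9 - mean) / stdev           # about 1.5 standard deviations above
```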
Standardization: Ensuring that once an improvement is made, people across the business units or departments follow the process with little or no modification.
Statistic: A statistic is a characteristic of a sample. It is used to estimate the value of the population parameter.
Statistical Process Control (SPC): The application of statistical methods (usually control charts) to analyze and control the variation in a process.
Stratifying Factor: A process characteristic used to divide data into subgroups to look for changes in process performance that might indicate the factor is a key driver of process performance.
Stratified Sampling: Stratified sampling refers to a type of sampling where the population is divided into separate groups (called strata) and a random sample is drawn from each group.
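A quick sketch of proportional stratified sampling (hypothetical strata) using the Python standard library:

```python
import random

random.seed(1)

# Hypothetical population split into strata by region; draw a random
# sample from each stratum instead of from the pooled population.
strata = {
    "north": list(range(0, 60)),     # 60 units
    "south": list(range(60, 100)),   # 40 units
}

# Proportional allocation: sample 10% from each stratum.
sample = {
    name: random.sample(units, k=len(units) // 10)
    for name, units in strata.items()
}
print(len(sample["north"]), len(sample["south"]))  # 6 4
```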
Subgroup: A subgroup is a group of units that are created under the same set of conditions. The measurements within a subgroup should be taken close together in time but still be independent of each other.
Treatment: In an experimental design, each independent variable is called a factor and each factor can take multiple levels. A combination of factor levels is called a treatment. For example, if we have two factors, temperature (100, 150) and pressure (1.5, 2.0), then an experiment that sets temperature at 100 and pressure at 1.5 is one treatment.
Tree Diagram: A graphing tool used to subdivide a broad topic or goal into increasing levels of detail and represent the information in a logical hierarchy.
Type I Error: A Type I error occurs when we reject a null hypothesis when it is true. The probability of committing a Type I error is called the significance level and is often denoted by alpha.
Type II Error: A Type II error occurs when we fail to reject a null hypothesis that is false. The probability of committing a Type II error is called beta. The probability of not committing a Type II error is called the power of the test.
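As an illustrative sketch (hypothetical numbers), beta and power for a one-sided z-test can be computed with the Python standard library:

```python
from statistics import NormalDist

# Hypothetical power calculation for a one-sided z-test:
# Ho: mu = 0 vs Ha: mu = 0.5, known sigma = 1, n = 25, alpha = 0.05.
alpha, effect, sigma, n = 0.05, 0.5, 1.0, 25

z_crit = NormalDist().inv_cdf(1 - alpha)   # reject Ho if z > z_crit
shift = effect / (sigma / n ** 0.5)        # Ha mean in z units
beta = NormalDist().cdf(z_crit - shift)    # P(Type II error)
power = 1 - beta                           # about 0.80 for these inputs
```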
Unit: Any item that is produced or processed.
Value Added Steps (VA): The steps in the production and delivery of a product or service that are essential to meet customer needs and requirements. Value added steps are considered steps that the customer is willing to pay for.
Variance Inflation Factor (VIF): The VIF measures the amount of multicollinearity in a set of independent variables. A VIF of 1.0 means the variable is not correlated with any other independent variable; a VIF greater than 10 indicates multicollinearity is a concern.
Variation: The fluctuation in a process measurement that can be quantified using range, standard deviation etc.
Vital Few: The small number of variables (X) most likely responsible for the majority of the variation in a product or process.
Voice of the Business (VOB): Information from internal company stakeholders about the expected and/or actual requirements related to the performance of a product or process.
Voice of the Customer (VOC): Information from customers obtained from a variety of sources about the expected and/or actual requirements related to the performance of a product or process.
Z: A statistical measure of process performance, also called the Sigma level, that is the basis for determining process capability.