Help Manual

Sigma Magic Help Version 17

Histogram

Overview

Histograms can be used to compare different sets of data. Histograms are similar to Dot Plots, except that data points are displayed in groups rather than as individual points. If you have fewer than 50 data points, a Dot Plot may be a better choice than a Histogram. Histograms can help visualize which data values occur most often, give an idea of the minimum and maximum values, and help determine the nature of the data distribution.

This tool can be added to your active workbook by clicking on Graph and then selecting Histogram.

Inputs

Click on Analysis Setup to open the menu options for this tool.

Setup

A sample screenshot of the setup menu is shown below.
1
Data Type: Specify the type of data for this analysis.
  • Continuous: Use this option for continuous data. Continuous data can take almost any numeric value and can be meaningfully subdivided into finer and finer increments, depending on the precision of the measurement system.
  • Discrete: Use this option if your data is discrete. Discrete data takes a finite set of values, for example binary or categorical data.
2
Multiple Graphs: When your data contains multiple groups, specify how you want to plot the histograms. The available options are:
  • Different: Plot each histogram on a separate chart. Note that if you specify the bins as Auto, each histogram may have a different scale. Specify the bins manually if you want to use the same scale for all charts.
  • Same: Plot all the histograms with the same grouping on the same chart. All charts, by default, use the same scale.
3
Graph Type: Specify the type of histogram you want to create. The available options are:
  • Fit: Only plot the distribution fit on the plot. Do not plot the histogram of the data points.
  • Histogram: Only plot the histogram on the plot. Do not superimpose the distribution fit on this plot.
  • Histogram and Fit: Plot both the histogram and the best-fit distribution on the plot.
4
Fit Distribution: Specify the name of the distribution you want to superimpose on the histogram. The available options are:
  • None: No distribution has been specified for this histogram.
  • Beta: Plot a Beta distribution on the histogram.
  • Cauchy: Plot a Cauchy distribution on the histogram.
  • Chi-Squared: Plot a Chi-Squared distribution on the histogram.
  • Erlang: Plot an Erlang distribution on the histogram.
  • Extreme Value: Plot an Extreme Value distribution on the histogram.
  • F: Plot an F distribution on the histogram.
  • Gamma: Plot a Gamma distribution on the histogram.
  • Laplace: Plot a Laplace distribution on the histogram.
  • Log Normal: Plot a Log Normal distribution on the histogram.
  • Logistic: Plot a Logistic distribution on the histogram.
  • Log Logistic: Plot a Log Logistic distribution on the histogram.
  • Normal: Plot a Normal distribution on the histogram.
  • Pareto: Plot a Pareto distribution on the histogram.
  • Pert: Plot a Pert distribution on the histogram.
  • Power: Plot a Power distribution on the histogram.
  • Rayleigh: Plot a Rayleigh distribution on the histogram.
  • T: Plot a T distribution on the histogram.
  • Triangular: Plot a Triangular distribution on the histogram.
  • Uniform: Plot a Uniform distribution on the histogram.
  • Weibull: Plot a Weibull distribution on the histogram.
5
Help Button: Click on this button to open the help file for this topic.
6
Cancel Button: Click on this button to discard any changes and close the dialog box.
7
OK Button: Click on this button to save any changes and, if possible, compute the analysis outputs.

Data

You will see the following dialog box if you click the Data button. Here, you can specify the data required for this analysis.
1
Search Data: The Available Data box lists all the columns of data that are available for analysis. You can use the search bar to filter this list and find the right data more quickly. Enter a few characters in the search field, and the software will display only the matching columns in the Available Data box.
2
Available Data: The available data box contains the list of data available for analysis. If your workbook has no data in tabular format, this box will display "No Data Found." The information displayed in this box includes the row number, whether the data is Numeric (N) or Text (T), and the name of the column variable. Note that the software displays data from all the tables in the current workbook. Even though data within the same table have unique column names, columns across different tables can have similar names. Hence, you must specify the column name and the table name.
3
Add or View Data: Click on this button to add more data to your workbook for analysis or to view more details about the data listed in the available data box. When you click on this button, it opens the Data Editor dialog box, where you can import more data into your workbook. You can also switch from the list view to a table view to see the individual data values for each column.
4
Required Data: The code for the required data specifies what data can be specified for that box. An example code is N: 2-4. If the code starts with an N, you must select only numeric columns. If the code begins with a T, you can select numeric and text columns. The numbers to the right of the colon specify the min-max values. For example, if the min-max values are 2-4, you must select a minimum of 2 columns of data and a maximum of 4 columns in this box. If the minimum value is 0, then no data is required to be specified for this box.
5
Select Button: Click on this button to select the data for analysis. Any data you choose for the analysis is moved to the right. To select a column, click on the columns in the Available Data box to highlight them and then click on the Select Button. A second method is to double-click on the columns in the list of Available Data. Finally, you can drag and drop columns: hold down the left mouse button on the highlighted columns, drag them to one of the boxes on the right, and release.
6
Selected Data: The list box header is displayed in black if the right number of data columns has been specified. If sufficient data has not been specified, the list box header is displayed in red. Note that you can double-click on any of the columns in this box to remove them from the box.
6a
Analysis Variables: This list box contains the data used to create the histogram plot. This list box is mandatory; at least one column must be specified. Note that the values specified in this column must be numeric. If multiple columns are specified, then the histogram plot is created using the data in each column.
6b
Categorical Variables: This list box contains the categories to use to create the histogram plot. It is not mandatory and can contain either numeric or text data. Note that we can specify up to two categories. If data has been selected for this list box, then the groups specified here are used to split the analysis variables into multiple data sets, and a histogram plot is created for each group. All the groups identified here will be plotted on the same chart.
6c
By Variable: This list box contains the categories to use to create the histogram plot. It is not mandatory and can contain either numeric or text data. Note that we can specify up to one column here. If data has been selected for this list box, then the groups specified here are used to split the analysis variables into multiple data sets, and a histogram plot is created for each group. All the groups identified here will be plotted on separate charts.
7
View Selection: Click on this button to view the data specified for this analysis. The data can be viewed in a tabular format or a graphical summary.
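
The Required Data code described above (for example, N: 2-4) can be read as a small validation rule. The following Python sketch illustrates that rule only; it is not the software's own implementation. The names RequiredData, parse_required_code, and selection_is_valid are hypothetical, and the exact spacing the software accepts in the code is an assumption.

```python
import re
from typing import List, NamedTuple


class RequiredData(NamedTuple):
    numeric_only: bool  # True for an "N" code, False for a "T" code
    min_cols: int       # minimum number of columns that must be selected
    max_cols: int       # maximum number of columns that may be selected


def parse_required_code(code: str) -> RequiredData:
    """Parse a code such as 'N: 2-4' into a type flag and column limits."""
    match = re.fullmatch(r"\s*([NT])\s*:\s*(\d+)\s*-\s*(\d+)\s*", code)
    if match is None:
        raise ValueError(f"Unrecognized required-data code: {code!r}")
    kind = match.group(1)
    return RequiredData(kind == "N", int(match.group(2)), int(match.group(3)))


def selection_is_valid(code: str, column_types: List[str]) -> bool:
    """column_types holds 'N' (numeric) or 'T' (text) for each selected column."""
    rule = parse_required_code(code)
    if not rule.min_cols <= len(column_types) <= rule.max_cols:
        return False
    # An N code accepts numeric columns only; a T code accepts both kinds.
    return (not rule.numeric_only) or all(t == "N" for t in column_types)
```

For instance, under these assumptions, selecting two numeric columns satisfies N: 2-4, while selecting one numeric column, or one numeric and one text column, does not.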

Bins

You will see the following dialog box if you click the Bins button.
1
Bin Algorithm: Specify the algorithm used to determine the bin sizes for the histogram. The shape of a histogram depends on the number of bins, so it is important to choose an appropriate number of bins for your graph.
  • Auto: The software will determine the most appropriate bin size to use for this chart. You cannot override the minimum and maximum values for this setting. The values are determined during execution to select the best values suitable for the given data set.
  • Freedman-Diaconis: The software will determine the bin width based on the Freedman-Diaconis algorithm. You can override the minimum and maximum values if required.
  • Manual: Specify the minimum, maximum, and bin width. Here, you have maximum control over the settings to plot the histogram. The number of bins is auto-calculated based on the values you enter.
  • Shimazaki-Shinomoto: The software will determine the bin width based on the Shimazaki-Shinomoto algorithm. You can override the minimum and maximum values if required.
  • Square Root: The software will determine the bin width based on the Square Root algorithm. You can override the minimum and maximum values if required.
  • Sturges: The software will determine the bin width based on the Sturges algorithm. You can override the minimum and maximum values if required.
  • Terrell-Scott: The software will determine the bin width based on the Terrell-Scott algorithm. You can override the minimum and maximum values if required.
2
Bin Width: Specify the width of each bin. This can only be specified if the Bin Algorithm is set to Manual; otherwise, it will automatically be determined for you.
3
Min Value: Specify the minimum value for the histogram.
4
Max Value: Specify the maximum value for the histogram.
5
Import Data: If you have changed your data and want to reload the values to compute the bin sizes, click the Import Data button. This will read your input data values and recompute the min, max, and bin width values.
6
Bin Summary: Based on the settings you have specified above, the number of bins is displayed in the bin summary. Ensure the number of bins is reasonable - neither too large nor too small. The ideal number of bins is between 10 and 50.
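
Several of the bin rules named above have well-known textbook forms. The following Python sketch shows those forms for the Sturges, Square Root, Terrell-Scott, and Freedman-Diaconis rules, together with a bin count for the Manual setting and the 10 to 50 guideline from the Bin Summary. All function names are hypothetical, the quartile method and the rounding are assumptions, and the software's own calculations may differ in detail. The Shimazaki-Shinomoto method, which searches over candidate bin widths, is not shown.

```python
import math
import statistics
from typing import List


def sturges_bins(n: int) -> int:
    """Sturges rule: number of bins = ceil(log2(n)) + 1."""
    return math.ceil(math.log2(n)) + 1


def square_root_bins(n: int) -> int:
    """Square Root rule: number of bins = ceil(sqrt(n))."""
    return math.ceil(math.sqrt(n))


def terrell_scott_bins(n: int) -> int:
    """Terrell-Scott rule: number of bins = ceil((2n)^(1/3))."""
    return math.ceil((2 * n) ** (1 / 3))


def freedman_diaconis_width(data: List[float]) -> float:
    """Freedman-Diaconis rule: bin width = 2 * IQR / n^(1/3)."""
    # Assumption: quartiles use Python's "inclusive" interpolation method.
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return 2 * (q3 - q1) / len(data) ** (1 / 3)


def manual_bin_count(min_value: float, max_value: float, bin_width: float) -> int:
    """Bins needed to cover [min_value, max_value] (assumed to round up)."""
    if bin_width <= 0 or max_value <= min_value:
        raise ValueError("Need bin_width > 0 and max_value > min_value")
    return math.ceil((max_value - min_value) / bin_width)


def bin_count_is_reasonable(count: int) -> bool:
    """Apply the guideline that the ideal bin count lies between 10 and 50."""
    return 10 <= count <= 50
```

For 100 data points, these forms give 8 bins (Sturges), 10 bins (Square Root), and 6 bins (Terrell-Scott), which shows how strongly the choice of rule can affect the shape of the histogram.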

Charts

You will see the following dialog box if you click the Charts button.
1
Title: The system will automatically pick a title for your chart. However, if you want to override that with your title, you can specify a title for your chart here. Note that this input is optional.
2
Sub Title: The system will automatically pick a subtitle for your chart. However, if you want to override that with your subtitle, specify a subtitle for your chart here. Note that this input is optional.
3
X Label: The system will automatically pick a label for the x-axis. However, if you would like to override that with your label for the x-axis, you can specify a different label here. Note that this input is optional.
4
Y Label: The system will automatically pick a label for the y-axis. However, if you would like to override that with your label for the y-axis, you can specify a different label here. Note that this input is optional.
5
X Axis: The system will automatically pick a scale for the x-axis. However, if you would like to override that with your values for the x-axis, you can specify them here. Enter the minimum and maximum values, optionally with an increment between them, separated by semicolons. For example, if you specify 10;20, the minimum x-axis scale is set at 10, and the maximum x-axis scale is set at 20. If you specify 10;2;20, then, in addition to the minimum and maximum values, the x-axis increment is set at 2. Note that this input is currently disabled, and you cannot change this setting.
6
Y Axis: The system will automatically pick a scale for the y-axis. However, if you would like to override that with your values for the y-axis, you can specify them here. Enter the minimum and maximum values, optionally with an increment between them, separated by semicolons. For example, if you specify 10;20, the minimum y-axis scale is set at 10, and the maximum y-axis scale is set at 20. If you specify 10;2;20, then, in addition to the minimum and maximum values, the y-axis increment is set at 2. Note that this input is optional.
7
Horizontal Lines: If you want to add extra horizontal reference lines to your chart, specify their values here. The format for this input is numeric values separated by semicolons. For example, if you specify 12;15, two horizontal lines are plotted at Y = 12 and Y = 15, respectively. Note that this input is optional.
8
Vertical Lines: If you want to add extra vertical reference lines to your chart, specify their values here. The format for this input is numeric values separated by semicolons. For example, if you specify 2;5, two vertical lines are plotted at X = 2 and X = 5, respectively. Note that this input is optional.
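
The semicolon-separated formats described above for the axis scales and reference lines can be illustrated with a short Python sketch. It shows one way to interpret inputs such as 10;20, 10;2;20, and 12;15. The function names parse_axis_spec and parse_reference_lines are hypothetical, and the validation shown is an assumption rather than the software's documented behavior.

```python
from typing import List, Optional, Tuple


def parse_axis_spec(text: str) -> Tuple[float, Optional[float], float]:
    """Parse 'min;max' or 'min;increment;max' into (min, increment, max)."""
    parts = [float(p) for p in text.split(";")]
    if len(parts) == 2:
        minimum, maximum = parts
        increment = None  # no increment given; left for the chart to choose
    elif len(parts) == 3:
        minimum, increment, maximum = parts
    else:
        raise ValueError("Expected 'min;max' or 'min;increment;max'")
    if maximum <= minimum:
        raise ValueError("Maximum must be greater than minimum")
    return minimum, increment, maximum


def parse_reference_lines(text: str) -> List[float]:
    """Parse reference-line positions such as '12;15' into [12.0, 15.0]."""
    return [float(p) for p in text.split(";") if p.strip()]
```

With this reading, 10;2;20 yields a minimum of 10, an increment of 2, and a maximum of 20, and 12;15 yields two reference lines at 12 and 15.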

Verify

If you click the Verify button, the software will perform some checks on the data you entered. A sample screenshot of the dialog box is shown in the figure below. The software checks whether you have correctly specified the input options and entered the required data on the worksheet. The results of the analysis checks are listed on the right. Checks that pass are shown as green checkmarks. Checks that fail are shown as red crosses. Checks that result in a warning are shown as orange exclamation marks. Finally, any checks that must be performed by the user are shown as blue info icons.
1
Item: The left-hand side shows the major tabs and the items checked within each section.
2
Status: The right-hand side shows the status of the checks.
3
Overall Status: The overall status of all the checks for the given analysis is shown here. The overall status check shows a green thumbs-up sign if everything is okay and a red thumbs-down sign if any checks have not passed. Note that you cannot proceed with generating analysis results for some analyses if the overall status is not okay.

Outputs

Click on Compute Outputs to update the output calculations. A sample screenshot of the worksheet is shown below.
Notes: The text output of the analysis contains a summary of the inputs - specifically the type of data (continuous or discrete), the type of fit to display on the histogram, and whether the bins are calculated automatically or manually. If the bins are calculated manually, the parameters to determine the bins are also listed in this section. The analysis results contain the names of each group, the mean and median values, the min and max values, the first and third quartiles, the interquartile range, and the range and standard deviation of the data points.

Graphs: The graph section shows each group's histogram plot, along with a superimposed curve fit if one was requested. You can compare different groups of data to see if the histograms are relatively similar for each group. If a curve fit is superimposed on the histogram, you can compare the histogram to the curve fit to see if there is a close match between the two.
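
The summary statistics listed in the Notes output above can be cross-checked outside the tool. The following Python sketch computes the same set of quantities for one group of data using only the standard library. The function name summarize is hypothetical, and the quartile interpolation method and the use of the sample standard deviation are assumptions, so small differences from the values reported by the software are possible.

```python
import statistics
from typing import Dict, List


def summarize(values: List[float]) -> Dict[str, float]:
    """Return the summary statistics reported for one group of data."""
    # Assumption: quartiles use Python's "inclusive" interpolation method.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,                     # interquartile range
        "range": max(values) - min(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }
```

To summarize several groups, call the function once per group, for example once for each category used to split the analysis variable.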

Notes

Here are a few pointers regarding this analysis:
  • The histograms are plotted using multiple axes. If you change only one of the axes, your histogram may not appear properly on the graph. To change the axis limits, you must scale both histogram axes proportionately.

Examples

The following examples are in the Examples folder.
  • Create a histogram for the strength of a component provided by two suppliers (Company Weekly Revenue.xlsx)
  • Create a histogram for the revenue generated by a company (Company Weekly Revenue.xlsx)
  • Create a histogram for the unemployment rate of a developed nation collected from 1980 to 2013. Is the data normally distributed? What conclusions can you draw from this data set? (Unemployment Rate.xlsx)
  • Create a histogram of the city's high temperatures - Refer to Q1 on the problems tab. What conclusions can you draw from this data? (City Weather.xlsx)



© Rapid Sigma Solutions LLP. All rights reserved.


