Box plots, also known as box-and-whisker plots, are used to compare the distributions of different groups. The box spans the interquartile range of the data (first to third quartile) and the central line marks the median; together with the whiskers, the plot conveys a five-number-style summary of each group.
Technical details
Data/Inputs
Variable(s) or a table. If you use a table, each row is treated as a separate observation when calculating the median, percentiles, etc. of the distribution. A table used as the input must therefore contain the raw data needed to calculate the box, not values that have already been summarized.
The following is an explanation of the options available in the Object Inspector for this specific visualization. Refer to Visualization Options for general chart formatting options.
Chart
APPEARANCE
Plot vertically Rotates the box plot by 90 degrees.
Data value color Sets the color of the dashes in the rug.
Box points Treatment of outliers:
All plots all of the values, next to the boxes.
Suspected outliers plots all the unique values that appear outside the whiskers, using hollow circles for points less than 3 times the interquartile range from the 1st and 3rd quartiles.
Outliers plots all the unique values that appear outside the whiskers.
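The outlier options above can be illustrated with a minimal Python sketch. It is not the code used by the visualization: the function name classify_points is hypothetical, and the 1.5 x IQR whisker fence is an assumption (the common Tukey convention), since this page does not state how far the whiskers extend. Only the 3 x IQR threshold comes from the description above.

```python
def classify_points(values, q1, q3, whisker_factor=1.5, far_factor=3.0):
    """Split the unique values into points within the fences and outliers.

    whisker_factor=1.5 is an ASSUMPTION (Tukey convention), not taken from
    the documentation. far_factor=3.0 matches the 3 x IQR threshold that the
    "Suspected outliers" option uses to decide which points are hollow.
    """
    iqr = q3 - q1
    inner_lo, inner_hi = q1 - whisker_factor * iqr, q3 + whisker_factor * iqr
    outer_lo, outer_hi = q1 - far_factor * iqr, q3 + far_factor * iqr

    result = {"within_fences": [], "mild": [], "extreme": []}
    for v in sorted(set(values)):  # only unique values are plotted
        if inner_lo <= v <= inner_hi:
            result["within_fences"].append(v)
        elif outer_lo <= v <= outer_hi:
            # Outside the whiskers but less than 3 x IQR from the quartiles:
            # drawn as hollow circles under "Suspected outliers".
            result["mild"].append(v)
        else:
            result["extreme"].append(v)
    return result


groups = classify_points([1, 10, 11, 12, 13, 14, 20, 40], q1=10, q3=14)
print(groups)
```

With "Outliers" selected, both the mild and extreme groups would be plotted in the same way; "Suspected outliers" distinguishes between them.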
Note that the quantiles and median shown in the box plot can differ from those shown in the violin plot. Quantiles in the box plot are computed by interpolating linearly between knots placed at the midpoints of the steps of the empirical CDF. This is equivalent to quantile(x, type = 5) in R, or Method 10 in http://jse.amstat.org/v14n3/langford.html.
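As a rough illustration of this quantile definition, the following Python sketch reproduces what quantile(x, type = 5) computes in R. The function name quantile_type5 is hypothetical, and this is a standalone reimplementation for explanation only, not the code the visualization runs.

```python
import math


def quantile_type5(values, p):
    """Quantile with knots at the midpoints of the empirical CDF steps.

    The k-th smallest observation is assigned probability (k - 0.5) / n and
    quantiles are linearly interpolated between these knots, which is the
    definition R uses for quantile(x, type = 5).
    """
    xs = sorted(values)
    n = len(xs)
    if n == 0:
        raise ValueError("at least one observation is required")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")

    # 1-based fractional position; clamp so that probabilities below the
    # first knot or above the last knot return the minimum or maximum.
    h = min(max(n * p + 0.5, 1.0), float(n))
    lower = math.floor(h)
    upper = math.ceil(h)
    frac = h - lower
    return xs[lower - 1] + frac * (xs[upper - 1] - xs[lower - 1])


data = [1, 2, 3, 4]
print(quantile_type5(data, 0.25))  # first quartile
print(quantile_type5(data, 0.50))  # median
print(quantile_type5(data, 0.75))  # third quartile
```

For the four values above, this definition gives a first quartile of 1.5, whereas R's default (type = 7) gives 1.75. Differences of this kind are why quartiles from other tools may not match the box exactly.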
Output
The example below uses data from a fast-food tracking study. The plot shows the distribution of fast food consumed for different age groups.
Acknowledgements
The density is computed using the base R density function and the plot is created using plotly.
Method
- In Displayr: How to Create a Box Plot
- In Q: How to Create a Box Plot in Q