An outlier is a piece of data that lies an abnormal distance from other points. In other words, it’s data that lies outside the other values in the set. If you had Pinocchio in a class of children, the length of his nose compared to the other children would be an outlier.

In this set of random numbers, 1 and 201 are outliers:

1, 99, 100, 101, 103, 109, 110, 201

“1” is a particularly low value and “201” is a particularly high value.

Outliers aren’t always that obvious. Let’s say you received the following paychecks last month:

$225, $250, $25, $235.

Your average paycheck is $183.75. But that small paycheck ($25) could be because you went on vacation, so a weekly paycheck average of $183.75 isn’t a true reflection of how much you earned. Your average is really closer to $237 if you take the outlier ($25) out of the set.
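As a quick check of the arithmetic, here is a minimal Python sketch. The variable names are illustrative, and treating the $25 paycheck as the outlier is simply the assumption made in the example above.

```python
from statistics import mean

# Weekly paychecks from the example, in dollars.
paychecks = [225, 250, 25, 235]

# Mean of all four paychecks, including the unusually small one.
mean_all = mean(paychecks)

# Mean after dropping the $25 paycheck (assumed here to be the outlier).
mean_without_outlier = mean([p for p in paychecks if p != 25])

print(f"With the $25 paycheck:    ${mean_all:.2f}")
print(f"Without the $25 paycheck: ${mean_without_outlier:.2f}")
```

A single extreme value can pull the mean a long way, which is why it helps to identify outliers before summarizing the data.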

Of course, trying to find outliers isn’t always that simple. Your data set may look like this:

61, 10, 32, 19, 22, 29, 36, 14, 49, 3.

You could take a guess that 3 could be an outlier and maybe 61. But you’d be wrong: using the 1.5 IQR rule described below (Q1 = 14 and Q3 = 36, so the limits are -19 and 69), neither value is an outlier in this data set.

A box and whiskers chart (boxplot) often shows outliers as points that lie outside of the box and whiskers. However, this isn’t always the case: some box and whiskers charts include the outliers in the whiskers.

Therefore, don’t rely on finding outliers from a box and whiskers chart. That said, box and whiskers charts can be a useful tool to display them after you’ve calculated what your outliers actually are. The most effective way to find all of your outliers is by using the interquartile range (IQR). The IQR contains the middle bulk of your data, so outliers can be easily found once you know the IQR.

How to Find Outliers Using the Interquartile Range (IQR)

Frequency chart with boxplot at the top. The outliers are shown as dots outside the range of the whiskers.

An outlier is defined as any data point that lies more than 1.5 IQRs below the first quartile (Q1) or above the third quartile (Q3) in a data set.

High = (Q3) + 1.5 IQR

Low = (Q1) – 1.5 IQR

Sample Question: Find the outliers for the following data set: 3, 10, 14, 22, 19, 29, 70, 49, 36, 32.

Step 1: Find the IQR, Q1 (25th percentile) and Q3 (75th percentile). Use our online interquartile range calculator to find the IQR or, if you want to calculate it by hand, follow the steps in this article: Interquartile Range in Statistics: How to Find It.

IQR = 22

Q1 = 14

Q3 = 36

Step 2: Multiply the IQR you found in Step 1 by 1.5:

IQR * 1.5 = 22 * 1.5 = 33.

Step 3: Add the amount you found in Step 2 to Q3 from Step 1:

33 + 36 = 69.

This is your upper limit. Set this number aside for a moment.

Step 4: Subtract the amount you found in Step 2 from Q1 from Step 1:

14 – 33 = -19.

This is your lower limit. Set this number aside for a moment.

Step 5: Put the numbers from your data set in order:

3, 10, 14, 19, 22, 29, 32, 36, 49, 70

Step 6: Insert your low and high values into your data set, in order:

-19, 3, 10, 14, 19, 22, 29, 32, 36, 49, 69, 70

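Step 7: Highlight any number below or above the numbers you inserted in Step 6:

-19, 3, 10, 14, 19, 22, 29, 32, 36, 49, 69, 70

Here 70 is above the upper limit of 69, so 70 is the only outlier. No values fall below the lower limit of -19.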
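The procedure above can be sketched in a few lines of Python. This is only an illustration: the quartiles are typed in from the worked example rather than computed, and the variable names are made up for this sketch.

```python
data = [3, 10, 14, 22, 19, 29, 70, 49, 36, 32]

# Quartiles taken from the worked example (not computed here).
q1, q3 = 14, 36
iqr = q3 - q1                  # 22

# Multiply the IQR by 1.5.
step = 1.5 * iqr               # 33.0

# Upper and lower limits.
upper_limit = q3 + step        # 69.0
lower_limit = q1 - step        # -19.0

# Sort the data and keep anything that falls outside the limits.
outliers = [x for x in sorted(data) if x < lower_limit or x > upper_limit]
print(outliers)  # [70]
```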

How to Find Outliers with the Tukey Method

The Tukey method for finding outliers uses the interquartile range to filter out very large or very small numbers. It’s practically the same as the procedure above, but you might see the formulas written slightly differently, and the terminology is a little different as well. For example, the Tukey method uses the concept of “fences”.

The formulas are:

Low outliers = Q1 – 1.5(Q3 – Q1) = Q1 – 1.5(IQR)

High outliers = Q3 + 1.5(Q3 – Q1) = Q3 + 1.5(IQR)

Where:

Q1 = first quartile

Q3 = third quartile

IQR = Interquartile range

These equations give you two values, or “fences”. You can think of them as a fence that cordons off the outliers from all of the values that are contained in the bulk of the data.
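As a rough illustration, the two fence formulas can be wrapped in a small Python helper. The function names `tukey_fences` and `is_outlier`, and the multiplier parameter `k`, are hypothetical names chosen for this sketch; the quartiles passed in are the ones from the IQR example earlier in the article.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower_fence, upper_fence) for the given quartiles.

    k is the fence multiplier; 1.5 is the value used in this article.
    """
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def is_outlier(value, lower_fence, upper_fence):
    """Return True if value falls outside the fences."""
    return value < lower_fence or value > upper_fence


# Quartiles from the IQR example earlier in the article.
lower, upper = tukey_fences(14, 36)
print(lower, upper)                   # -19.0 69.0
print(is_outlier(70, lower, upper))   # True
print(is_outlier(49, lower, upper))   # False
```

The fences come out the same as the low and high limits found with the IQR procedure, which shows that the two methods differ only in how they are written.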

Sample question: Use Tukey’s method to find outliers for the following set of data: 1,2,5,6,7,9,12,15,18,19,38.

Step 1: Find the Interquartile range:

Find the median: 1,2,5,6,7,9,12,15,18,19,38.

Place parentheses around the numbers above and below the median; it makes Q1 and Q3 easier to find.

(1,2,5,6,7),9,(12,15,18,19,38)

Find Q1 and Q3. Q1 can be thought of as a median in the lower half of the data. Q3 can be thought of as a median for the upper half of the data.

(1,2,5,6,7), 9, ( 12,15,18,19,38). Q1=5 and Q3=18.

Subtract Q1 from Q3. 18-5=13.

Step 2: Calculate 1.5 * IQR:

1.5 * IQR = 1.5 * 13 = 19.5

Step 3: Subtract from Q1 to get your lower fence:

5 – 19.5 = -14.5

Step 4: Add to Q3 to get your upper fence:

18 + 19.5 = 37.5.

Step 5: Add your fences to your data to identify outliers:

(-14.5) 1,2,5,6,7,9,12,15,18,19,(37.5),38.

Anything outside of the fences is an outlier. For this data set, 38 is the only outlier.
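The whole Tukey example can be reproduced with a short Python sketch. It assumes the "median of each half" way of finding Q1 and Q3 used above, and the helper name `quartiles_by_halves` is hypothetical.

```python
from statistics import median


def quartiles_by_halves(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    For an odd number of values the overall median is left out of both
    halves, matching the parentheses in the worked example.
    Assumes at least two values.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower_half = ordered[:half]
    upper_half = ordered[-half:]
    return median(lower_half), median(upper_half)


data = [1, 2, 5, 6, 7, 9, 12, 15, 18, 19, 38]

q1, q3 = quartiles_by_halves(data)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Anything outside the fences is flagged as an outlier.
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, q3, iqr)               # 5 18 13
print(lower_fence, upper_fence)  # -14.5 37.5
print(outliers)                  # [38]
```

Other quartile conventions (for example, the defaults in some spreadsheets and statistics libraries) can give slightly different values for Q1 and Q3, so the fences may shift a little depending on the tool you use.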