## What is the formula to determine if data have outliers?

Question

What is the formula to determine if data have outliers?

2 weeks 2021-09-10T19:23:53+00:00 1 Answer 0

All of this information can be found on wikiHow.

Arrange all data points from lowest to highest. The first step when calculating outliers in a data set is to find the median (middle) value of the data set. This task is much simpler if the values are arranged from least to greatest, so sort the values in your data set in this fashion before continuing.

Calculate the median of the data set. The median of a data set is the data point above which half of the data sits and below which half of the data sits; essentially, it’s the “middle” point in a data set.[3] If the data set contains an odd number of points, this is easy to find: the median is the point which has the same number of points above it as below it. However, if there is an even number of points, there is no single middle point, so the 2 middle points should be averaged to find the median. Note that, when calculating outliers, the median is usually assigned the variable Q2, because it lies between Q1 and Q3, the lower and upper quartiles, which are defined in the next steps.

Don’t be confused by data sets with an even number of points. The average of the two middle points will often be a number that doesn’t appear in the data set itself, and this is OK. If the two middle points are the same number, the average will be that number as well, which is also OK.

Calculate the lower quartile. This point, to which we will assign the variable Q1, is the data point below which 25 percent (or one quarter) of the observations sit. In other words, it is the halfway point of the points in your data set below the median. If there is an even number of values below the median, you once again must average the two middle values to find Q1, much like you may have had to do to find the median itself.

Calculate the upper quartile. This point, which is assigned the variable Q3, is the data point above which 25 percent of the data sits. Finding Q3 is almost identical to finding Q1, except that, in this case, the points above the median, rather than below it, are taken into account.

Find the interquartile range. Now that we’ve defined Q1 and Q3, we need to calculate the distance between these two variables. The distance from Q1 to Q3 is found by subtracting Q1 from Q3, that is, IQR = Q3 - Q1. The value you obtain for the interquartile range is vital for determining the boundaries for non-outlier points in your data set.

Find the “inner fences” for the data set. Outliers are identified by assessing whether or not they fall within a set of numerical boundaries called “inner fences” and “outer fences”.[4] A point that falls outside the data set’s inner fences is classified as a minor outlier, while one that falls outside the outer fences is classified as a major outlier. To find the inner fences for your data set, first multiply the interquartile range by 1.5. Then add the result to Q3 and subtract it from Q1. The two resulting values, Q1 - 1.5 × IQR and Q3 + 1.5 × IQR, are the boundaries of your data set’s inner fences.
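
The steps above can be sketched in a few lines of Python. This is only an illustration, not an official implementation: the function names `quartiles`, `inner_fences` and `find_outliers`, the parameter `k`, and the sample readings are all made up for this example. It assumes the quartile rule described above (the median of the points below and above the overall median, leaving the median itself out when the number of points is odd). Other quartile conventions exist and can give slightly different fences.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves rule described above."""
    data = sorted(values)              # arrange from lowest to highest
    n = len(data)
    if n < 2:
        raise ValueError("need at least two data points")
    half = n // 2
    lower = data[:half]                # points below the median
    # For an odd count, skip the middle point itself (assumption stated above).
    upper = data[half + 1:] if n % 2 else data[half:]
    return median(lower), median(data), median(upper)


def inner_fences(values, k=1.5):
    """Return (lower fence, upper fence) = (Q1 - k*IQR, Q3 + k*IQR)."""
    q1, _, q3 = quartiles(values)
    iqr = q3 - q1                      # interquartile range
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, k=1.5):
    """Return the points that fall outside the fences, in input order."""
    low, high = inner_fences(values, k)
    return [x for x in values if x < low or x > high]


# Hypothetical sample: twelve readings, one of them clearly off.
sample = [71, 70, 73, 70, 70, 69, 70, 72, 71, 300, 71, 69]
print(quartiles(sample))      # (70.0, 70.5, 71.5)
print(inner_fences(sample))   # (67.75, 73.75)
print(find_outliers(sample))  # [300]
```

With the default `k=1.5` the function flags everything outside the inner fences. The multiplier is left as a parameter so that a wider boundary can be tried without changing the code.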