Original link: tecdat.cn/?p=8502

Original source: Tuoduan Data Tribe (tecdat) WeChat official account

There are many methods for identifying outliers, and many of them are available in R.

Articles on outlier methods combine theory with practice. The theory is fine, but outliers are outliers because they don’t follow the theory. An approach can be considered good if it finds outliers that we all agree on.

Overview of Outliers (O3) plots are designed to help compare and understand the results of outlier methods.

O3 plot of the stackloss dataset. There is one row for each variable combination (defined by the columns on the left) for which outliers were found, and one column for each case identified as an outlier (the columns on the right).
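To make "one row for each variable combination" concrete, the sketch below lists every non-empty combination of the variables in R's built-in stackloss data using base R only. It does not use OutliersO3 and is not how the package enumerates combinations internally; the names `vars` and `combos` are illustrative. An O3 plot draws a row only for combinations in which outliers were found, so it can have fewer rows than this list.

```r
# Variable names of the built-in stackloss data (4 variables).
vars <- names(stackloss)

# All non-empty subsets of the variables, from single variables up to all four.
combos <- unlist(
  lapply(seq_along(vars), function(k) combn(vars, k, simplify = FALSE)),
  recursive = FALSE
)

# Number of candidate rows: 2^4 - 1 = 15.
length(combos)

# Print each combination as a readable label, e.g. "Air.Flow+Water.Temp".
print(vapply(combos, paste, character(1), collapse = "+"))
```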

Wilkinson’s algorithm finds six outliers for the dataset as a whole (the bottom row of the plot). Overall, 14 of the dataset's 21 cases are flagged as potential outliers for at least one combination of variables.
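A plot of this kind might be produced along the following lines. This is a minimal sketch, not the article's own code: it assumes that OutliersO3 exports `O3prep()` and `O3plotT()`, that `method = "HDo"` selects Wilkinson's algorithm, and that the plot is returned in the `gO3` element. The package's default settings are used, which may differ from those behind the figure described above. The helper `draw_o3_hdo` is hypothetical and only wraps the calls so that the script still runs when the package is not installed.

```r
# Hypothetical helper (not part of OutliersO3): build an O3 plot for one data
# frame with Wilkinson's method, or return NULL if the package is missing.
draw_o3_hdo <- function(data) {
  if (!requireNamespace("OutliersO3", quietly = TRUE)) {
    message("OutliersO3 is not installed; skipping the plot.")
    return(NULL)
  }
  # Assumed API: O3prep() searches the variable combinations for outliers.
  prep <- OutliersO3::O3prep(data, method = "HDo")
  # Assumed API: O3plotT() builds the plot for a single method and returns
  # it in the gO3 element of its result.
  OutliersO3::O3plotT(prep)$gO3
}

o3 <- draw_o3_hdo(stackloss)
if (!is.null(o3)) print(o3)
```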

An O3 plot can also be used to compare the outliers identified by different methods.

 

There are four other methods available in OutliersO3. The number of outliers identified by each method:

##    HDo    PCS    BAC adjOut    DDC    MCD
##     14      4      5      0      6      5

R offers other outlier detection methods as well, and they can give still more varied results. Caution is needed: outliers can be interesting in their own right, but cases can also be wrongly judged to be outliers.