| Type: | Package | 
| Title: | Refined Modified Stahel-Donoho (MSD) Estimators for Outlier Detection (Parallel Version) | 
| Version: | 0.1.1 | 
| Suggests: | testthat (≥ 3.0.0) | 
| Depends: | R (≥ 2.10), stats | 
| Imports: | parallel, doParallel, foreach | 
| Description: | This package provides a parallel implementation of the modified Stahel-Donoho (MSD) estimators for multivariate outlier detection. The function RMSDp() is intended for elliptically distributed datasets and identifies outliers based on Mahalanobis distance. It targets higher-dimensional datasets that cannot be handled by the single-core function RMSD() included in the 'RMSD' package. See Wada and Tsubaki (2013) <doi:10.1109/CLOUDCOM-ASIA.2013.86> for details of the algorithm. | 
| License: | GPL (≥ 3) | 
| Encoding: | UTF-8 | 
| Language: | en-US | 
| RoxygenNote: | 7.3.1 | 
| Config/testthat/edition: | 3 | 
| LazyData: | true | 
| NeedsCompilation: | no | 
| Packaged: | 2024-06-10 13:48:49 UTC; wada | 
| Author: | Kazumi Wada  | 
| Maintainer: | Kazumi Wada <kazwd2008@gmail.com> | 
| Repository: | CRAN | 
| Date/Publication: | 2024-06-12 21:00:21 UTC | 
Modified Stahel-Donoho Estimators (parallel version)
Description
This function is for multivariate outlier detection.
version 0.0.1 2013/06/15 Related paper: DOI: 10.1109/CLOUDCOM-ASIA.2013.86
version 0.0.2 2021/11/15 Outlier detection step added
version 0.0.3 2022/08/12 Fixed a bug in random seed setting
Usage
RMSDp(inp, cores = 0, nb = 0, sd = 0, pt = 0.999, dv = 10000)
Arguments
| inp | input data (a numeric matrix) | 
| cores | number of cores used for this function | 
| nb | number of bases | 
| sd | seed (for reproducibility) | 
| pt | threshold for outlier detection (probability) | 
| dv | maximum number of elements processed together on the same core | 
Value
A list with the following components:
u final mean vector
V final covariance matrix
wt final weights
mah squared Mahalanobis distances
cf threshold to detect outlier (percentile point)
ot outlier flag (1:normal observation, 2:outlier)
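The relationship between mah, cf, ot and the pt argument can be illustrated with a minimal base R sketch. It is not the package's implementation: classical mean and covariance stand in for the robust MSD estimates, and the chi-square cutoff is an assumed convention that RMSDp() may not share. Classical estimates are themselves distorted by outliers, which is the motivation for robust estimators such as MSD.

```r
# Sketch only: classical estimates replace the robust MSD estimates, and the
# chi-square cutoff is an assumption, not confirmed behaviour of RMSDp().
set.seed(1)
x <- matrix(rnorm(200 * 3), ncol = 3)  # 200 observations, 3 variables
x[1:5, ] <- x[1:5, ] + 10              # plant 5 gross outliers

pt  <- 0.999                           # probability threshold (cf. the pt argument)
u   <- colMeans(x)                     # stand-in for the final mean vector
V   <- cov(x)                          # stand-in for the final covariance matrix
mah <- mahalanobis(x, center = u, cov = V)  # squared Mahalanobis distances
cf  <- qchisq(pt, df = ncol(x))        # assumed cutoff: chi-square percentile point
ot  <- ifelse(mah > cf, 2L, 1L)        # 1: normal observation, 2: outlier
table(ot)
```

An observation is flagged as an outlier when its squared distance exceeds the cutoff, so raising pt makes the rule more conservative.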
Wine dataset in UCI Machine Learning Repository
Description
The Wine dataset from the UCI Machine Learning Repository.
Usage
wine
Format
A data frame with 178 rows and 13 columns:
- Alcohol
- Malic acid
- Ash
- Alcalinity of ash
- Magnesium
- Total phenols
- Flavonoids
- Nonflavanoid phenols
- Proanthocyanins
- Color intensity
- Hue
- OD280/OD315 of diluted wines
- Proline
Source
<https://archive.ics.uci.edu/dataset/109/wine>
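A possible way to pass this dataset to RMSDp() is sketched below. It assumes the package is installed under the name "RMSDp"; the values chosen for cores and sd are arbitrary, and count_flags() is a hypothetical helper that is not part of the package. The call is guarded so that the code still runs where the package is unavailable.

```r
# Hypothetical helper (not part of the package): count the flags in a result
# list shaped like the documented return value of RMSDp().
count_flags <- function(res) {
  stopifnot(is.list(res), "ot" %in% names(res))
  c(normal = sum(res$ot == 1), outlier = sum(res$ot == 2))
}

# Assumption: the package is installed under the name "RMSDp".
if (requireNamespace("RMSDp", quietly = TRUE)) {
  x <- as.matrix(RMSDp::wine)  # RMSDp() takes a numeric matrix, not a data frame
  res <- RMSDp::RMSDp(x, cores = 2, sd = 1, pt = 0.999)  # arbitrary cores and seed
  print(count_flags(res))
}
```

Converting the data frame with as.matrix() matches the documented input type of the inp argument.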