# MLForMultivariateData

Copyright 2019-2020 Luxembourg Institute of Science and Technology (LIST - http://www.list.lu/). 

Any use of this software constitutes full acceptance of all terms of the [software's license](./LICENSE.txt).


## Overview

This project contains an R script that builds regression and classification models for multivariate data (e.g. Random Forest, XGBoost, and KNN, plus statistical models such as GLM, LDA, and QDA).
The models are evaluated by cross-validation, using accuracy and other metrics derived from the confusion matrix, as well as the Root Mean Square Error (RMSE).
Boxplots summarize the evaluation metrics measured during cross-validation and make it possible to compare different model configurations (methods and parameters), different datasets, and/or different input variables.
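
As a rough illustration of this evaluation loop (not the project's actual script), the sketch below runs a 5-fold cross-validation on the built-in `iris` dataset. The dataset, the number of folds, the choice of LDA, QDA, and GLM as example models, and all variable and helper names are assumptions made for this example.

```r
library(MASS)  # lda() and qda(); MASS is listed under Dependencies

set.seed(42)   # reproducible fold assignment
k <- 5         # number of folds (assumed value)
# Randomly assign each row to one of k folds of equal size
folds <- sample(rep(seq_len(k), length.out = nrow(iris)))

# Accuracy = correctly classified / total, read from the confusion matrix
accuracy_from_cm <- function(pred, actual) {
  cm <- table(predicted = pred, actual = actual)
  sum(diag(cm)) / sum(cm)
}

# Root Mean Square Error for a regression model
rmse <- function(actual, pred) sqrt(mean((actual - pred)^2))

acc_lda  <- numeric(k)
acc_qda  <- numeric(k)
rmse_glm <- numeric(k)

for (i in seq_len(k)) {
  train <- iris[folds != i, ]
  test  <- iris[folds == i, ]

  # Classification: predict Species from the four measurements
  fit_lda <- lda(Species ~ ., data = train)
  fit_qda <- qda(Species ~ ., data = train)
  acc_lda[i] <- accuracy_from_cm(predict(fit_lda, newdata = test)$class, test$Species)
  acc_qda[i] <- accuracy_from_cm(predict(fit_qda, newdata = test)$class, test$Species)

  # Regression: predict Sepal.Length from the other measurements
  fit_glm <- glm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width,
                 data = train)
  rmse_glm[i] <- rmse(test$Sepal.Length, predict(fit_glm, newdata = test))
}

# One box per model configuration, one point per fold
boxplot(list(LDA = acc_lda, QDA = acc_qda),
        ylab = "Accuracy", main = "5-fold cross-validation (iris)")
```

Each vector holds one value per fold, so the spread of a box reflects how stable a configuration is across folds, not only its average performance.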

Variable importance is also measured during cross-validation and shown in boxplots for every model configuration.
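
A minimal sketch of how per-fold variable importance could be collected and plotted, assuming a Random Forest classifier on the built-in `iris` dataset. The dataset, fold count, `ntree` value, and the use of mean decrease in accuracy as the importance measure are assumptions for this example; the project's script may measure importance differently.

```r
library(randomForest)  # listed under Dependencies

set.seed(42)
k <- 5  # number of folds (assumed value)
folds <- sample(rep(seq_len(k), length.out = nrow(iris)))
predictors <- setdiff(names(iris), "Species")

# One row per fold, one column per input variable
imp <- matrix(NA_real_, nrow = k, ncol = length(predictors),
              dimnames = list(NULL, predictors))

for (i in seq_len(k)) {
  train <- iris[folds != i, ]
  fit <- randomForest(Species ~ ., data = train,
                      ntree = 200, importance = TRUE)
  # type = 1 selects the permutation-based mean decrease in accuracy
  imp[i, ] <- importance(fit, type = 1)[predictors, 1]
}

# One box per variable, one point per fold
boxplot(as.data.frame(imp),
        ylab = "Mean decrease in accuracy",
        main = "Variable importance across folds (Random Forest)")
```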


## Dependencies

The script needs the following packages:

* dplyr http://cran.r-project.org/web/packages/dplyr/index.html
* MASS http://cran.r-project.org/web/packages/MASS/index.html
* randomForest http://cran.r-project.org/web/packages/randomForest/index.html
* xgboost http://cran.r-project.org/web/packages/xgboost/index.html
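
One possible way to install whichever of these packages are missing is sketched below. The CRAN mirror URL and the variable names are assumptions; any mirror can be substituted.

```r
# Packages listed above
pkgs <- c("dplyr", "MASS", "randomForest", "xgboost")

# Keep only the packages that are not yet installed
missing_pkgs <- pkgs[!vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]

# Install them from CRAN (mirror URL is an assumption)
if (length(missing_pkgs) > 0) {
  install.packages(missing_pkgs, repos = "https://cloud.r-project.org")
}
```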


## Contact

Any questions? Please contact [Nicolas Médoc](mailto:nicolas.medoc@list.lu) or visit the [LIST website](https://www.list.lu/en/contact/).