|
|
We have developed dprep,
an R package for data preprocessing that includes normalization, discretization,
handling of missing values, outlier detection,
feature selection, and visualization for large datasets. Algorithms
for instance selection and a GUI will be added
soon.
People: Edgar Acuna, Caroline Rodriguez, Luis
Daza, and Sindy Diaz.
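As a rough illustration of two of the preprocessing steps listed above, the following Python sketch applies min-max normalization and equal-width discretization to a single numeric feature. It is not dprep's interface (dprep is an R package); the function names `min_max_normalize` and `equal_width_bins` are hypothetical and exist only for this example.

```python
def min_max_normalize(values):
    """Rescale values linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature has no spread; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins intervals of equal width."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum falls on the right edge, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


feature = [2.0, 4.0, 6.0, 10.0]
normalized = min_max_normalize(feature)   # values rescaled to [0, 1]
bins = equal_width_bins(feature, 2)       # bin index per value
```

Equal-width binning is only one of several discretization strategies; it is used here because it needs no class labels and fits in a few lines.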
We are implementing parallel algorithms for several knowledge
discovery and data mining tasks, among them outlier detection, visualization,
computation of nonparametric classifiers, computation of metaclassifiers,
and computation of clustering methods.
People: Roxana Aparicio and Edgar Acuna.
|
|
We are extending several methodologies for improving the classification
of gene expression data. These methodologies include: i) Generalizations
of Partial Least Squares (PLS) that use a nonparametric classifier instead
of linear regression in the outer step, aiming at dimensionality
reduction in supervised classification. This method is intended to improve on
PLS discriminant and PLS Logistic.
ii) Extensions of supervised principal components for regression
to classification problems.
iii) Building classifiers by applying shrinkage estimators, such as the lasso and
the garrote, to logistic regression.
People: Edgar Acuna and Luz Marina Muniz.
Collaborators: Ana Patricia Ortiz (Puerto Rico
Cancer Center), Jose Vega (University of Puerto Rico School of Medicine),
and Idhaliz Flores (Ponce Medical School, Puerto Rico).
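The idea in item i) can be sketched as follows: project the samples onto a PLS direction, then classify in the reduced space with a nonparametric rule instead of a linear regression. This is a minimal sketch under strong assumptions, not the group's actual method: it uses a single PLS component (weight vector proportional to the covariance between each centered feature and the centered 0/1 label), a 1-nearest-neighbor rule, and toy data. All function names are hypothetical.

```python
import math


def first_pls_direction(X, y):
    """Return the column means and the unit weight vector w proportional to Xc^T yc."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    w = [sum((X[i][j] - means[j]) * (y[i] - y_mean) for i in range(n))
         for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    if norm == 0.0:
        raise ValueError("every feature has zero covariance with the labels")
    return means, [v / norm for v in w]


def project(row, means, w):
    """Score of one sample on the PLS direction."""
    return sum((row[j] - means[j]) * w[j] for j in range(len(w)))


def nearest_neighbor_label(score, train_scores, train_labels):
    """1-NN rule applied to the one-dimensional PLS scores."""
    best = min(range(len(train_scores)),
               key=lambda i: abs(train_scores[i] - score))
    return train_labels[best]


# Toy data: feature 0 carries most of the class separation.
X_train = [[1.0, 5.0], [1.2, 3.0], [3.0, 4.0], [3.3, 6.0]]
y_train = [0, 0, 1, 1]

means, w = first_pls_direction(X_train, y_train)
scores = [project(row, means, w) for row in X_train]
predicted = nearest_neighbor_label(project([2.9, 4.5], means, w), scores, y_train)
```

With gene expression data the number of features far exceeds the number of samples, which is why the classifier is applied to a handful of PLS scores and not to the raw features.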
|
|
We are investigating several applications of rough set theory
to knowledge discovery tasks, including discretization, imputation,
and feature selection in supervised classification. Emphasis will
be given to datasets from bioinformatics.
People: Edgar Acuna and Frida Coaquira.
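For readers unfamiliar with rough sets, the sketch below computes the lower and upper approximations of a concept and the dependency degree that rough-set feature selection typically uses to score attribute subsets. It is an illustrative example on a made-up decision table; the attribute names and helper functions are hypothetical and not taken from the project.

```python
from collections import defaultdict


def equivalence_classes(rows, attributes):
    """Group row indices that are indiscernible on the given attributes."""
    classes = defaultdict(set)
    for idx, row in enumerate(rows):
        classes[tuple(row[a] for a in attributes)].add(idx)
    return list(classes.values())


def lower_upper(rows, attributes, target):
    """Lower and upper approximations of the index set `target`."""
    lower, upper = set(), set()
    for block in equivalence_classes(rows, attributes):
        if block <= target:
            lower |= block      # block lies entirely inside the concept
        if block & target:
            upper |= block      # block overlaps the concept
    return lower, upper


def dependency_degree(rows, attributes, decision):
    """Size of the positive region divided by the number of rows."""
    positive = set()
    for concept in equivalence_classes(rows, [decision]):
        positive |= lower_upper(rows, attributes, concept)[0]
    return len(positive) / len(rows)


table = [
    {"fever": "yes", "cough": "yes", "flu": "yes"},
    {"fever": "yes", "cough": "no",  "flu": "yes"},
    {"fever": "no",  "cough": "yes", "flu": "no"},
    {"fever": "no",  "cough": "yes", "flu": "yes"},
    {"fever": "no",  "cough": "no",  "flu": "no"},
]
flu_cases = {i for i, r in enumerate(table) if r["flu"] == "yes"}
lower, upper = lower_upper(table, ["fever", "cough"], flu_cases)
gamma_both = dependency_degree(table, ["fever", "cough"], "flu")
gamma_cough = dependency_degree(table, ["cough"], "flu")
```

Rows 2 and 3 agree on both condition attributes but differ in the decision, so they fall in the boundary region (upper minus lower approximation). An attribute subset with a higher dependency degree determines the decision for more rows, which is the usual criterion for keeping it.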
We are searching for efficient cluster validation methods.
People: Edgar Acuna and Roxana Aparicio.
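One widely used internal validation index is the silhouette width, which compares each point's mean distance to its own cluster with its mean distance to the nearest other cluster. The sketch below is a straightforward quadratic-time version given only as an example of what a cluster validation method computes; it is not claimed to be one of the methods under study, and the function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_silhouette(points, labels):
    """Average silhouette width over all points, in the range [-1, 1]."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            continue  # singleton cluster contributes a silhouette of 0
        # a: mean distance to the other members of the point's own cluster
        a = sum(euclidean(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(euclidean(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items() if other != lab
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
good = mean_silhouette(points, [0, 0, 1, 1])  # groups nearby points together
bad = mean_silhouette(points, [0, 1, 0, 1])   # pairs distant points
```

The natural grouping scores close to 1 and the mismatched grouping scores below 0. The pairwise distance computation is also the part that becomes expensive on large datasets, which is one reason efficiency matters for validation methods.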
|
|
We are working on extensions of Bayesian networks for datasets
containing features of mixed types. We are also looking for applications
of Bayesian networks in multi-relational data mining.
People: Edgar Acuna.
Data mining algorithms look for patterns in data.
While most existing data mining approaches look for patterns in
a single data table, multi-relational data mining approaches look
for patterns that involve multiple tables from a relational database.
We are looking for extensions of data mining tasks to the multi-relational
case.
People: Trilce Encarnacion, Karen Aparicio, and
Edgar Acuna.
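To make the single-table versus multi-table distinction concrete, the sketch below flattens a one-to-many relation into per-row aggregate features, a common workaround often called propositionalization. It is shown only as a point of comparison: the tables, column names, and the `propositionalize` helper are hypothetical, and genuinely multi-relational methods search for patterns across the tables directly without flattening them first.

```python
from collections import defaultdict

# Parent table: one row per customer.
customers = [
    {"customer_id": 1, "segment": "retail"},
    {"customer_id": 2, "segment": "wholesale"},
    {"customer_id": 3, "segment": "retail"},
]
# Child table: zero or more orders per customer.
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 20.0},
    {"order_id": 11, "customer_id": 1, "amount": 30.0},
    {"order_id": 12, "customer_id": 2, "amount": 500.0},
]


def propositionalize(parents, children, key, value):
    """Attach the per-parent count and sum of one child-table column."""
    counts, sums = defaultdict(int), defaultdict(float)
    for child in children:
        counts[child[key]] += 1
        sums[child[key]] += child[value]
    return [
        {**parent,
         "n_" + value: counts[parent[key]],
         "total_" + value: sums[parent[key]]}
        for parent in parents
    ]


flat = propositionalize(customers, orders, "customer_id", "amount")
```

After this step a single-table algorithm can be run on `flat`, but the aggregation discards information, such as which individual orders co-occur, that a multi-relational approach could still exploit.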
|
|