Feature Generation and Extraction
Next, I extracted bottleneck features for each phenotype and used those features as inputs for machine learning. Bottleneck features are the activations of the layer just before a network's final classification layer. They were obtained with Inception v3, a pre-trained neural network available for TensorFlow.
Feature Reduction
Principal Component Analysis
Approximately 2050 features are extracted from the pre-trained neural network. Not all of these features will be useful for building a classifier, and carrying all of them slows training. Reducing the number of bottleneck features fed into the algorithm is therefore essential. In principal component analysis (PCA), features are transformed into a new feature space whose axes are ordered by the amount of variance they capture: the axes are the eigenvectors of the data's covariance matrix, and the corresponding eigenvalues give the variance along each axis. One can use PCA to reduce the original features (independent variables) to a smaller set of explanatory variables for algorithm training and classification.
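To make the eigenvector and eigenvalue picture concrete, here is a minimal NumPy sketch of PCA by eigen-decomposition of the covariance matrix. It is an illustration under stated assumptions, not the code used for this project: the function name `pca_fit_transform` is hypothetical, and the input is a small synthetic matrix standing in for the real bottleneck features.

```python
import numpy as np


def pca_fit_transform(features, n_components):
    """Project the rows of `features` onto the top principal components.

    Hypothetical helper for illustration only.
    """
    # Center each feature so the covariance matrix describes spread around the mean.
    centered = features - features.mean(axis=0)
    # The covariance matrix is symmetric, so eigh is the appropriate solver.
    cov = np.cov(centered, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    # eigh returns eigenvalues in ascending order; put the largest variance first.
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    # Keep the leading eigenvectors and project the centered data onto them.
    components = eigenvectors[:, :n_components]
    reduced = centered @ components
    # Fraction of total variance captured by each kept component.
    explained_ratio = eigenvalues[:n_components] / eigenvalues.sum()
    return reduced, explained_ratio


# Synthetic stand-in (assumption): 200 samples, 64 features driven by 5 latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 5))
mixing = rng.normal(size=(5, 64))
features = latent @ mixing + 0.01 * rng.normal(size=(200, 64))

reduced, explained_ratio = pca_fit_transform(features, n_components=5)
print(reduced.shape)
print(explained_ratio.sum())
```

With real bottleneck features the input would have roughly 2050 columns instead of 64, and the rows of `reduced` would be the shorter feature vectors passed to the classifier.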
The chart Explained Variance of Principal Component indicates that most of the variance is contained in fewer than 50 principal components. Therefore, using 50 or fewer components should speed up training without substantially hurting algorithm performance.
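The same reading of an explained-variance chart can be done numerically. The sketch below computes the cumulative explained variance and returns the smallest number of components that reaches a chosen threshold. The function name `components_for_variance`, the 0.95 threshold, and the synthetic data are all assumptions for illustration; none of them come from the original analysis.

```python
import numpy as np


def components_for_variance(features, threshold=0.95):
    """Return the smallest component count whose cumulative explained
    variance reaches `threshold`, plus the full cumulative curve.

    Hypothetical helper; the 0.95 default is an arbitrary illustration.
    """
    centered = features - features.mean(axis=0)
    # Squared singular values of the centered data are proportional to
    # the variance along each principal component (largest first).
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2 / (features.shape[0] - 1)
    cumulative = np.cumsum(variances) / variances.sum()
    # Index of the first entry that reaches the threshold, converted to a count.
    k = int(np.searchsorted(cumulative, threshold)) + 1
    # Guard against floating-point round-off when threshold is close to 1.
    k = min(k, len(cumulative))
    return k, cumulative


# Synthetic stand-in (assumption): 300 samples, 100 features, 10 latent factors.
rng = np.random.default_rng(1)
latent = rng.normal(size=(300, 10))
mixing = rng.normal(size=(10, 100))
features = latent @ mixing + 0.01 * rng.normal(size=(300, 100))

k, cumulative = components_for_variance(features, threshold=0.95)
print(k)
```

Plotting `cumulative` against the component index gives the kind of curve the chart shows; where it flattens is where additional components stop adding much information.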
