Eigenfaces versus Fisherfaces on the Faces94 Database with Scikit-Learn

In this post, two basic facial recognition techniques are compared on the Faces94 database. Images from the Faces94 database are 180 by 200 pixels in resolution and were taken while the subjects were speaking, which induces variation between the images. To train a classifier with the images, the raw pixel information is extracted, converted to grayscale, and flattened into vectors of dimension 180 \times 200 = 36000. For this experiment, 12 subjects from the database are used, with 20 image files per subject. Each subject is confined to a unique directory that contains only its 20 image files.
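The grayscale-and-flatten step described above can be sketched as follows. This is a minimal illustration, not the post's actual loading code: the helper name `to_feature_vector` is hypothetical, the pixels are random stand-ins rather than real Faces94 images, and the BT.601 luma weights are an assumed grayscale conversion, since the excerpt does not say which one is used.

```python
import numpy as np

HEIGHT, WIDTH = 200, 180  # Faces94 images are 180 pixels wide by 200 pixels tall


def to_feature_vector(rgb_image):
    """Convert an (H, W, 3) RGB array into a flat grayscale vector of length H * W."""
    rgb = np.asarray(rgb_image, dtype=np.float64)
    # Assumption: ITU-R BT.601 luma weights for the RGB-to-grayscale conversion.
    gray = rgb @ np.array([0.299, 0.587, 0.114])
    return gray.ravel()  # row-major flattening


# Synthetic stand-in for 12 subjects with 20 images each (random pixels, not real faces).
rng = np.random.default_rng(0)
n_subjects, n_per_subject = 12, 20
images = rng.integers(
    0, 256, size=(n_subjects * n_per_subject, HEIGHT, WIDTH, 3), dtype=np.uint8
)

# One row per image, one column per pixel.
X = np.stack([to_feature_vector(img) for img in images])
# Integer label per image: subject 0 for the first 20 rows, subject 1 for the next 20, etc.
labels = np.repeat(np.arange(n_subjects), n_per_subject)

print(X.shape, labels.shape)
```

The resulting matrix has one 36000-dimensional row per image, which is the layout scikit-learn estimators expect for their `X` argument.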

Wine Classification Using Linear Discriminant Analysis with Python and SciKit-Learn

In this post, a classifier is constructed which determines to which cultivar a specific wine sample belongs. Each sample consists of a vector \textbf{v} of 13 attributes of the wine, that is, \textbf{v} \in \mathbb{R}^{13}. The attributes are as follows:

  1. Alcohol
  2. Malic acid
  3. Ash
  4. Alcalinity of ash
  5. Magnesium
  6. Total phenols
  7. Flavanoids
  8. Nonflavanoid phenols
  9. Proanthocyanins
  10. Color intensity
  11. Hue
  12. OD280/OD315 of diluted wines
  13. Proline

Based on these attributes, the goal is to identify from which of three cultivars the data originated. The data set is available at the UCI Machine Learning Repository. Below are shown three sample rows from the data set.
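As a rough sketch of the classification task, the snippet below fits a linear discriminant analysis model with scikit-learn. It assumes that the copy of the wine data bundled with scikit-learn (`load_wine`, 178 samples with the same 13 attributes and 3 cultivars) is an acceptable substitute for the file downloaded from UCI; the 70/30 split and the random seed are arbitrary choices, not taken from the post.

```python
from sklearn.datasets import load_wine
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

# Assumption: scikit-learn's bundled wine data stands in for the UCI download.
X, y = load_wine(return_X_y=True)

# Print the first three samples (13 attributes each) with their cultivar labels.
for row, cultivar in zip(X[:3], y[:3]):
    print(cultivar, row)

# Hold out 30% of the samples, keeping the cultivar proportions in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

lda = LinearDiscriminantAnalysis()
lda.fit(X_train, y_train)

# Fraction of held-out samples assigned to the correct cultivar.
accuracy = lda.score(X_test, y_test)
print("held-out accuracy:", accuracy)
```

Note that `load_wine` labels the cultivars 0, 1, and 2, whereas other copies of the data may number them differently.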

Breast Cancer Malignancy Classification using PCA and Least Squares with Scikit-Learn

In this post, a linear regression classifier is constructed for the purpose of offering a medical diagnosis regarding breast cytology. The classifier receives a vector \textbf{v} \in \mathbb{R}^9 whose 9 components correspond to the following measurements:

  1. Clump Thickness: 1 – 10
  2. Uniformity of Cell Size: 1 – 10
  3. Uniformity of Cell Shape: 1 – 10
  4. Marginal Adhesion: 1 – 10
  5. Single Epithelial Cell Size: 1 – 10
  6. Bare Nuclei: 1 – 10
  7. Bland Chromatin: 1 – 10
  8. Normal Nucleoli: 1 – 10
  9. Mitoses: 1 – 10

Given a vector of measurements, the classifier determines whether the cells are benign or malignant. The data used in this post is courtesy of the UCI Machine Learning Repository.
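One way to combine PCA with a least-squares classifier is sketched below. Everything specific here is an assumption made for illustration: the data are synthetic integers in the range 1 to 10 rather than the UCI measurements, the classes are encoded as -1 (benign) and +1 (malignant), three principal components are kept, and a sample is called malignant when the regression output is at least zero. The post itself may make different choices.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 200  # samples per class

# Synthetic stand-in data: benign scores skew low (1-4), malignant skew high (5-10).
benign = rng.integers(1, 5, size=(n, 9))
malignant = rng.integers(5, 11, size=(n, 9))
X = np.vstack([benign, malignant]).astype(float)

# Assumed label encoding: -1 for benign, +1 for malignant.
y = np.concatenate([-np.ones(n), np.ones(n)])

# Project the 9 measurements onto 3 principal components (assumed count).
pca = PCA(n_components=3)
Z = pca.fit_transform(X)

# Ordinary least squares fit of the +/-1 labels on the projected data.
reg = LinearRegression().fit(Z, y)

# Classify by the sign of the regression output.
pred = np.where(reg.predict(Z) >= 0.0, 1, -1)
accuracy = np.mean(pred == y)
print("training accuracy on synthetic data:", accuracy)
```

Because the synthetic classes are well separated by construction, the accuracy printed here says nothing about performance on real cytology data; it only shows that the pipeline runs end to end.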