Using Random Forests and Wordclouds to Visualize Feature Importance in Document Classification

What characteristics do the works of famous authors have that make them unique? This post uses ensemble methods and wordclouds to explore just that.

Project Gutenberg offers a large number of freely available works by many famous authors. The dataset for this post consists of books from Project Gutenberg written by each of the following authors:

  • Austen
  • Dickens
  • Dostoyevsky
  • Doyle
  • Dumas
  • Stevenson
  • Stoker
  • Tolstoy
  • Twain
  • Wells
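
As a rough sketch of the kind of pipeline involved (the placeholder texts, labels, and vectorizer settings below are illustrative assumptions, not the post's actual configuration), a random forest can be trained on bag-of-words features and its per-word importance scores fed to a wordcloud:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.ensemble import RandomForestClassifier
from wordcloud import WordCloud

# Placeholder documents and author labels standing in for the Gutenberg books.
texts = ["...full text of an Austen novel...", "...full text of a Dickens novel..."]
labels = ["Austen", "Dickens"]

# Bag-of-words features: one column per word in the vocabulary.
vectorizer = CountVectorizer(stop_words="english", max_features=5000)
X = vectorizer.fit_transform(texts)

# A random forest assigns an importance score to every word column.
forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X, labels)

# Scale each word by its importance and render the cloud.
importances = dict(zip(vectorizer.get_feature_names_out(), forest.feature_importances_))
WordCloud(width=800, height=400).generate_from_frequencies(importances).to_file("importance_cloud.png")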

Continue reading “Using Random Forests and Wordclouds to Visualize Feature Importance in Document Classification”

A Deep Learning Based AI for Path of Exile: A Series

This post is the first in a series on creating an AI for the game Path of Exile based on deep learning and other machine learning techniques. The goals of the project are to create an AI that operates on visual input, navigates levels successfully, and can defend itself, and, of course, to have fun and learn something in the process.

Continue reading “A Deep Learning Based AI for Path of Exile: A Series”

Stock Market Prediction in Python Part 2

This post is part of a series on artificial neural networks (ANNs) in TensorFlow and Python.

  1. Stock Market Prediction Using Multi-Layer Perceptrons With TensorFlow
  2. Stock Market Prediction in Python Part 2
  3. Visualizing Neural Network Performance on High-Dimensional Data
  4. Image Classification Using Convolutional Neural Networks in TensorFlow

This post revisits the problem, explored in a previous post, of predicting stock prices from historical stock data using TensorFlow. In the previous post, the stock price was predicted solely from the date: the date was first converted to a numerical value in LibreOffice, and the resulting integer value was then read into a matrix using NumPy. As stated in that post, this method was not meant to be indicative of how actual stock prediction is done. This post aims to improve slightly upon the previous model and explore new features in TensorFlow and Anaconda Python. The corresponding source code is available here.
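
As a minimal sketch of that input step (the file name "prices.csv" and its two-column layout are assumptions for illustration, not the post's actual files), the integer date serial and closing price could be read into NumPy like so:

import numpy as np

# Assumed two-column CSV: integer date serial (as exported from LibreOffice), closing price.
data = np.genfromtxt("prices.csv", delimiter=",", skip_header=1)
dates = data[:, 0].reshape(-1, 1)    # input feature for the network
prices = data[:, 1].reshape(-1, 1)   # target values to predict

# Scale the date feature to roughly [0, 1] before feeding it to the network.
dates = (dates - dates.min()) / (dates.max() - dates.min())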

Note: See the later post Visualizing Neural Network Performance on High-Dimensional Data for code that helps visualize neural network learning and performance.

Continue reading “Stock Market Prediction in Python Part 2”

Multi-Layer Perceptrons and Back-Propagation; a Derivation and Implementation in Python

Artificial neural networks have regained popularity in machine learning circles with recent advances in deep learning. Deep learning techniques trace their origins back to the concept of back-propagation in multi-layer perceptron (MLP) networks, the topic of this post.

Multi-Layer Perceptron Networks for Regression

An MLP network consists of layers of artificial neurons connected by weighted edges. Neurons are denoted n_{ij} for the j-th neuron in the i-th layer of the MLP, numbered from left to right and top to bottom. Inputs are fed into the leftmost layer and propagate through the network along weighted edges until reaching the final, or output, layer. An example of an MLP network can be seen below in Figure 1.

Continue reading “Multi-Layer Perceptrons and Back-Propagation; a Derivation and Implementation in Python”
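
To make the forward pass described above concrete, here is a minimal NumPy sketch of one hidden layer feeding an output layer (the layer sizes and the sigmoid activation are illustrative assumptions, not the exact setup used in the derivation):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 1))          # one 3-dimensional input sample

W1 = rng.normal(size=(4, 3))         # weights on edges from the input layer to the hidden layer
b1 = np.zeros((4, 1))
W2 = rng.normal(size=(1, 4))         # weights on edges from the hidden layer to the output layer
b2 = np.zeros((1, 1))

hidden = sigmoid(W1 @ x + b1)        # activations of the hidden-layer neurons
output = W2 @ hidden + b2            # linear output neuron, suitable for regression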

Eigenfaces versus Fisherfaces on the Faces94 Database with Scikit-Learn

In this post, two basic facial recognition techniques will be compared on the Faces94 database. Images from the Faces94 database are 180 by 200 pixels in resolution and were taken while the subjects were speaking, in order to induce variation in the images. To train a classifier with the images, the raw pixel information is extracted, converted to grayscale, and flattened into vectors of dimension 180 \times 200 = 36000. For this experiment, 12 subjects from the database will be used, with 20 image files per subject. Each subject is confined to a unique directory that contains only those 20 image files.

Continue reading “Eigenfaces versus Fisherfaces on the Faces94 Database with Scikit-Learn”
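
As a rough sketch of that preprocessing step (the directory layout "faces94/<subject>/<file>.jpg" and the use of Pillow for image loading are assumptions, not necessarily how the post reads the data), each image can be converted to grayscale and flattened into a 36000-dimensional row:

import os
import numpy as np
from PIL import Image

X, y = [], []
root = "faces94"  # assumed root directory with one subdirectory per subject
for subject in sorted(os.listdir(root)):
    subject_dir = os.path.join(root, subject)
    for filename in sorted(os.listdir(subject_dir)):
        img = Image.open(os.path.join(subject_dir, filename)).convert("L")  # grayscale
        X.append(np.asarray(img, dtype=float).ravel())  # 180 x 200 pixels -> 36000 values
        y.append(subject)

X = np.array(X)   # one row per image, one column per pixel
y = np.array(y)   # subject label for each row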