Applying Correlation as a Criterion in Hierarchical Decision Trees

Decision trees are a simple yet powerful method of machine learning. A binary tree is constructed in which the leaf nodes represent predictions and the internal nodes are decision points. Thus, each path from the root to a leaf represents a sequence of decisions that results in a final prediction.

Decision trees can also be used in hierarchical models. For instance, the leaves can instead represent subordinate models. Thus, a path from the root to a leaf node is a sequence of decisions that results in a prediction made by a subordinate model. Each subordinate model is only responsible for predicting samples that fall within its leaf.

This post presents an approach for a hierarchical decision tree model with subordinate linear regression models.
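As a rough illustration of the idea, the sketch below builds the smallest possible hierarchical tree: one decision node with a linear regression model in each leaf. The excerpt does not state the exact split criterion, so the sketch assumes one: a threshold is scored by the size-weighted absolute Pearson correlation between the feature and the target inside the two children. The function names, the `min_leaf` parameter, and the toy data are all made up for this example.

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either side has no variance."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5

def fit_line(xs, ys):
    """Least-squares (slope, intercept) for a single feature."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = 0.0 if sxx == 0 else sxy / sxx
    return slope, my - slope * mx

def fit_stump(xs, ys, min_leaf=3):
    """One decision node; each leaf holds its own linear model.

    Assumed criterion: maximize the size-weighted |correlation| in the children.
    """
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(min_leaf, len(pairs) - min_leaf + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot place a threshold between equal feature values
        left, right = pairs[:i], pairs[i:]
        score = (len(left) * abs(pearson(*zip(*left)))
                 + len(right) * abs(pearson(*zip(*right)))) / len(pairs)
        if best is None or score > best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (score, threshold,
                    fit_line(*zip(*left)), fit_line(*zip(*right)))
    if best is None:
        raise ValueError("not enough distinct samples to split")
    return best[1:]  # (threshold, left_model, right_model)

def predict(model, x):
    """Route x through the decision node, then apply that leaf's line."""
    threshold, left_model, right_model = model
    slope, intercept = left_model if x <= threshold else right_model
    return slope * x + intercept

# Made-up piecewise-linear data: y = 2x for x < 5, y = 30 - 3x otherwise.
xs = list(range(10))
ys = [2 * x if x < 5 else 30 - 3 * x for x in xs]
model = fit_stump(xs, ys)
print(model[0], predict(model, 2), predict(model, 8))
```

A single global line fits this data poorly, while the two leaf models fit their own regions exactly. A deeper tree would apply the same split search recursively inside each child.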


Malware Detection and Classification using Logistic Regression

In this post, an approach to detecting malware using machine learning is presented. System call activity is processed and analyzed by a classification model to detect the presence of malicious applications.
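The general shape of such a classifier can be sketched in a few lines. The example below is not the post's pipeline: it assumes each application is summarized as a vector of system call frequencies, uses three hypothetical call types, and trains on a handful of invented values purely to show the mechanics of logistic regression with batch gradient descent.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.5, epochs=2000):
    """Batch gradient descent on the mean log-loss; returns (weights, bias)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def predict_proba(model, x):
    """Estimated probability that the sample is malicious (label 1)."""
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical features: fraction of calls that are read, write, connect.
# All values are invented for illustration, not measured data.
X = [
    [0.60, 0.35, 0.05], [0.55, 0.40, 0.05], [0.50, 0.45, 0.05],  # benign
    [0.20, 0.20, 0.60], [0.25, 0.15, 0.60], [0.30, 0.20, 0.50],  # malicious
]
y = [0, 0, 0, 1, 1, 1]

clf = train_logistic(X, y)
print(predict_proba(clf, [0.58, 0.37, 0.05]))  # resembles the benign rows
print(predict_proba(clf, [0.22, 0.18, 0.60]))  # resembles the malicious rows
```

Thresholding the returned probability at 0.5 turns the model into a detector. Real traces would need a much richer feature set than three call types.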


A Statistical Analysis of Facial Attractiveness

An intermediate activation volume produced by a convolutional neural network predicting the attractiveness of a person.

Does beauty truly lie in the eye of its beholder? This chapter explores the complex array of factors that influence facial attractiveness to answer that question or at least to understand it better.


Binary Classification with Artificial Neural Networks using Python and TensorFlow

This post is an introduction to using the TFANN module for classification problems. The TFANN module is available here on GitHub. The name TFANN is an abbreviation for TensorFlow Artificial Neural Network. TensorFlow is an open-source library for data flow programming. Due to the nature of computational graphs, using TensorFlow can be challenging at times. The TFANN module provides several classes that allow for interaction with the TensorFlow API using familiar object-oriented programming paradigms.


Deep Learning OCR using TensorFlow and Python

In this post, deep learning neural networks are applied to the problem of optical character recognition (OCR) using Python and TensorFlow. This post makes use of TensorFlow and the convolutional neural network class available in the TFANN module. The full source code from this post is available here.


PoE AI Part 5: Real-Time Obstacle and Enemy Detection using CNNs in TensorFlow

This post is the fifth part of a series on creating an AI for the game Path of Exile © (PoE).

  1. A Deep Learning Based AI for Path of Exile: A Series
  2. Calibrating a Projection Matrix for Path of Exile
  3. PoE AI Part 3: Movement and Navigation
  4. PoE AI Part 4: Real-Time Screen Capture and Plumbing
  5. AI Plays Path of Exile Part 5: Real-Time Obstacle and Enemy Detection using CNNs in TensorFlow

As discussed in the first post of this series, the AI program takes a screenshot of the game and uses it to form predictions that are then used to update its internal state. In this post, methods for classifying and organizing information from visual input of the game screen are discussed. I have made the source code for this project available on my GitHub. Enjoy!


Using Random Forests and Wordclouds to Visualize Feature Importance in Document Classification

What characteristics do the works of famous authors have that make them unique? This post uses ensemble methods and wordclouds to explore just that.

Project Gutenberg offers a large number of freely available works from many famous authors. The dataset for this post consists of books, taken from Project Gutenberg, written by each of the following authors:

  • Austen
  • Dickens
  • Dostoyevsky
  • Doyle
  • Dumas
  • Stevenson
  • Stoker
  • Tolstoy
  • Twain
  • Wells
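To give a feel for how word-level feature importance can come out of a tree ensemble, here is a deliberately small stand-in. It is not a full random forest and not the post's code: it grows many one-split trees, each on a bootstrap sample with a random subset of words, and credits the chosen word with the Gini impurity decrease it achieves. The two "authors" and their sentences are invented placeholders, not Project Gutenberg text.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_decrease(docs, labels, word):
    """Impurity decrease from splitting on whether a document contains word."""
    present = [l for d, l in zip(docs, labels) if word in d]
    absent = [l for d, l in zip(docs, labels) if word not in d]
    n = len(labels)
    children = len(present) / n * gini(present) + len(absent) / n * gini(absent)
    return gini(labels) - children

def word_importance(docs, labels, n_trees=200, seed=0):
    """Normalized importance per word from an ensemble of one-split trees."""
    rng = random.Random(seed)
    vocab = sorted(set().union(*docs))
    k = max(1, int(len(vocab) ** 0.5))  # words considered per tree
    totals = dict.fromkeys(vocab, 0.0)
    for _ in range(n_trees):
        idx = [rng.randrange(len(docs)) for _ in docs]  # bootstrap sample
        s_docs = [docs[i] for i in idx]
        s_labels = [labels[i] for i in idx]
        candidates = rng.sample(vocab, k)
        best = max(candidates, key=lambda w: gini_decrease(s_docs, s_labels, w))
        totals[best] += gini_decrease(s_docs, s_labels, best)
    norm = sum(totals.values())
    return {w: (v / norm if norm else 0.0) for w, v in totals.items()}

# Invented placeholder sentences for two hypothetical authors, A and B.
texts = [
    ("the ship sailed the sea", "A"),
    ("the sea and the ship", "A"),
    ("the ball and the dance", "B"),
    ("the dance at the ball", "B"),
]
docs = [set(text.split()) for text, _ in texts]
labels = [author for _, author in texts]

importance = word_importance(docs, labels)
for word, score in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{word:8s}{score:.3f}")
```

Words that appear in every document, such as "the", can never reduce impurity and so receive zero importance. The resulting scores are the kind of weights that could drive font sizes in a wordcloud.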
