Decorrelating Features using the Gram-Schmidt Process

A problem that frequently arises when applying linear models is multicollinearity: one or more features in the data matrix can be accurately predicted by a linear model of the other features. Its consequences include numerical instability due to ill-conditioning and difficulty in interpreting the regression coefficients. This post presents an approach to decorrelating features using the Gram-Schmidt process.
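As a rough illustration of the idea, the sketch below applies Gram-Schmidt orthogonalization to the centered columns of a small synthetic data matrix. It is only a minimal example under stated assumptions, not necessarily the method used in the full post: the function name `gram_schmidt_decorrelate`, the tolerance `tol`, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def gram_schmidt_decorrelate(X, tol=1e-10):
    """Return centered, mutually orthogonal versions of the columns of X.

    Hypothetical helper for illustration. Each column has its projection
    onto every earlier column subtracted from it, so the result depends
    on the column order.
    """
    X = np.asarray(X, dtype=float)
    # Centering makes orthogonal columns uncorrelated as well.
    Q = X - X.mean(axis=0)
    for j in range(Q.shape[1]):
        for i in range(j):
            denom = Q[:, i] @ Q[:, i]
            # Skip columns that are (numerically) zero to avoid dividing by zero.
            if denom > tol:
                Q[:, j] -= (Q[:, i] @ Q[:, j]) / denom * Q[:, i]
    return Q

# Synthetic example (assumed data): the second feature is nearly a
# linear function of the first, so the two are strongly correlated.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = 0.9 * a + 0.1 * rng.normal(size=200)
X = np.column_stack([a, b])

Z = gram_schmidt_decorrelate(X)
corr_before = np.corrcoef(X, rowvar=False)[0, 1]
corr_after = np.corrcoef(Z, rowvar=False)[0, 1]
```

In this sketch the first column is only centered, while the second keeps just the part that the first cannot explain linearly, so `corr_after` is zero up to floating-point error. Because the result depends on column order, features placed earlier retain more of their original meaning.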



On the Analysis and Prediction of Recessions in the USA

This chapter explores recessions in the United States of America. Datasets are collected from a variety of sources, including Federal Reserve Economic Data (FRED) and the website of Yale professor and Nobel laureate Dr. Robert J. Shiller. A classifier model is constructed to predict recessions, and the model is then analyzed for useful insights.


A Brief Analysis of Survey Data from a Speed Dating Event

In this post, survey data collected from several speed dating events is analyzed. The events were conducted between 2002 and 2004 by two professors from Columbia University: Ray Fisman and Sheena Iyengar. In addition to questions about personal interests, the survey includes academic and occupational questions.


Mortality in the United States and Its Causes

In this chapter, vital statistics for the United States of America are explored. The Centers for Disease Control and Prevention (CDC) maintains several datasets containing vital statistics for the nation. These datasets contain records of deaths organized by year. Each record includes age, gender, race, cause of death, and other details. This chapter explores data for the year 2016.


A Statistical Analysis of Facial Attractiveness

An intermediate activation volume produced by a convolutional neural network predicting the attractiveness of a person.


Does beauty truly lie in the eye of its beholder? This chapter explores the complex array of factors that influence facial attractiveness to answer that question or at least to understand it better.
