Visualizing the Spread of Ideas in Science • FEB 2019

We created this visualization in D3 to show how the placement of faculty can affect the spread of research ideas in science. Kenzie Weller did all the development and won a Data Visualization Contest!

Paid Parental Leave Policies at US and Canadian Universities • APR 2018

We released an open-source dataset of the paid parental leave policies for 205 US and Canadian universities. The accompanying website features a neat visualization in D3 and some preliminary analyses for those interested. This was joint work with Sam F. Way, Aaron Clauset, Dan Larremore, and Mirta Galesic.

Salsa Competition • MAR 2018

Building off of our baking competition, I organized a salsa contest this semester. Sixteen dips were entered, and two won: best mild and best hot. To prepare for the competition, I attempted to make a data-driven salsa by mining salsa recipes online and looking for ingredients that are predictive of highly rated recipes. Details of this analysis and the supplementary code can be found here.

“The Great Boulder Bakeoff” • DEC 2017

Eight participants contributed a batch of cookies. Each cookie was judged by thirteen critics according to four different categories: most creative, best looking, best tasting, and best texture. While there are clear standouts shown in the histograms below, they were all good cookies. Code to generate these figures can be found here.

Prediction in Projection Using Google Search Trends • MAY 2017

Can Google search trends be predicted using state-space reconstruction on a scalar time series? Check out the GitHub repository or the project write-up.
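For readers unfamiliar with state-space reconstruction, here is a minimal sketch of the general idea, not the code from the repository. It builds delay vectors from a scalar series and forecasts the next value from the nearest past neighbor. The function names, the embedding dimension `dim`, the delay `tau`, and the nearest-neighbor forecasting rule are all assumptions made for illustration.

```python
def delay_embed(series, dim, tau):
    """Return delay vectors [x[t], x[t - tau], ..., x[t - (dim - 1) * tau]]."""
    start = (dim - 1) * tau
    return [
        [series[t - k * tau] for k in range(dim)]
        for t in range(start, len(series))
    ]


def predict_next(series, dim=3, tau=1):
    """Forecast the next value from the nearest neighbor in the embedded space."""
    start = (dim - 1) * tau
    vectors = delay_embed(series, dim, tau)
    query = vectors[-1]        # the most recent state
    candidates = vectors[:-1]  # every earlier state has a known successor

    def squared_distance(i):
        return sum((a - b) ** 2 for a, b in zip(candidates[i], query))

    nearest = min(range(len(candidates)), key=squared_distance)
    # vectors[i] describes time start + i, so its successor is one step later.
    return series[start + nearest + 1]


# Hypothetical data: a perfectly periodic signal.
history = [0, 1, 2, 3] * 5
forecast = predict_next(history, dim=3, tau=1)
```

On real search-trend data, `dim` and `tau` would need to be chosen with care, and averaging over several neighbors is usually more robust than using a single one.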

Flappy BI/O • NOV-DEC 2016

Inspired by Seth Bling’s (super awesome) MarI/O, we implemented Kenneth Stanley and Risto Miikkulainen’s NeuroEvolution of Augmenting Topologies (NEAT) algorithm to try to win Flappy Bird. NEAT is a genetic algorithm for evolving neural networks, and it relies on three key principles: (1) tracking the history of genes so that the genes of two networks can be lined up when they mate (called crossover), (2) protecting new structural innovations by grouping similar networks so that they compete mainly with each other (called speciation), and (3) starting from the simplest neural networks possible and complexifying only out of necessity.
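As a rough illustration of the first principle, the sketch below lines up two parents' connection genes by innovation number. It is not our project code: the `ConnectionGene` fields and the `crossover` signature are hypothetical, and the rule from the NEAT paper for re-enabling or disabling inherited genes is left out for brevity.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectionGene:
    innovation: int  # historical marking shared by genes with the same origin
    source: int
    target: int
    weight: float
    enabled: bool = True


def crossover(fitter, weaker, rng=random):
    """Build a child genome from two parents, given as lists of genes.

    Genes with matching innovation numbers are inherited from either
    parent at random. Genes found only in the fitter parent (disjoint
    and excess genes) are always inherited from the fitter parent.
    """
    weaker_by_innovation = {gene.innovation: gene for gene in weaker}
    child = []
    for gene in fitter:
        match = weaker_by_innovation.get(gene.innovation)
        if match is not None and rng.random() < 0.5:
            child.append(match)
        else:
            child.append(gene)
    return child


# Hypothetical parents: innovation 1 is shared, the rest are not.
parent_a = [
    ConnectionGene(1, 0, 2, 0.5),
    ConnectionGene(2, 1, 2, -0.3),
    ConnectionGene(4, 0, 3, 0.9),
]
parent_b = [
    ConnectionGene(1, 0, 2, -0.7),
    ConnectionGene(3, 1, 3, 0.2),
]
child = crossover(parent_a, parent_b, rng=random.Random(0))
```

Because genes are matched by their history rather than by position, networks with different topologies can be recombined without any structural analysis.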

Unlike Super Mario World, whose levels are fixed, Flappy Bird generates its playing field randomly, posing an interesting challenge for NEAT. Our code was built on top of an existing Flappy Bird pygame implementation and nothing else; we implemented our own neural networks. If you’d like to learn more, here is a video of our final presentation.

Anomalyzer • SEP-NOV 2014

Probabilistic anomaly detection for time series, written in Go. A blog post about the work was featured on the front page of Hacker News on August 13th.

Multiclass Naive Bayesian Classification • NOV-DEC 2014

Often in document classification, a document may have more than one relevant classification: a question on Stack Overflow might have the tags go, map, and interface. While multinomial Bayesian classification offers a one-of-many classification, multibayes offers tools for many-of-many classification.
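One common way to get many-of-many classification is to treat each label as its own yes/no naive Bayes problem, so that several labels can clear a threshold at once. The sketch below illustrates that idea in Python. It is not the multibayes API, and it is only an assumption that the library works along these lines; the class name, method names, and training data are hypothetical.

```python
import math
from collections import Counter, defaultdict


class MultiLabelNaiveBayes:
    """One binary naive Bayes model per label, using token presence only."""

    def __init__(self):
        self.n_docs = 0
        self.label_docs = Counter()             # label -> number of docs
        self.token_total = Counter()            # token -> number of docs
        self.token_docs = defaultdict(Counter)  # label -> token -> docs

    def add(self, tokens, labels):
        """Record one training document and its set of labels."""
        self.n_docs += 1
        unique = set(tokens)
        for token in unique:
            self.token_total[token] += 1
        for label in labels:
            self.label_docs[label] += 1
            for token in unique:
                self.token_docs[label][token] += 1

    def posterior(self, tokens):
        """Return P(label | tokens) for each known label independently."""
        result = {}
        for label, n_pos in self.label_docs.items():
            n_neg = self.n_docs - n_pos
            # Laplace-smoothed priors for "has label" and "lacks label".
            log_pos = math.log((n_pos + 1) / (self.n_docs + 2))
            log_neg = math.log((n_neg + 1) / (self.n_docs + 2))
            for token in set(tokens):
                pos = self.token_docs[label][token]
                neg = self.token_total[token] - pos
                log_pos += math.log((pos + 1) / (n_pos + 2))
                log_neg += math.log((neg + 1) / (n_neg + 2))
            result[label] = 1 / (1 + math.exp(log_neg - log_pos))
        return result


# Hypothetical training data in the spirit of tagged questions.
model = MultiLabelNaiveBayes()
model.add(["go", "map", "range"], {"go", "map"})
model.add(["go", "interface", "method"], {"go", "interface"})
model.add(["python", "dict", "keys"], {"python"})
model.add(["python", "list"], {"python"})

scores = model.posterior(["go", "map"])
tags = sorted(label for label, p in scores.items() if p > 0.5)
```

Here each label gets its own posterior, so the predicted tags are simply every label whose probability exceeds the chosen threshold, rather than the single most likely one.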

A simple use case for our naive Bayesian classifier is described in “Catching Clickbait: Using a Naive Bayesian Classifier in Go”. Inspired by Paul Graham’s “A Plan for Spam”, I scraped 10,000 headlines to train our classifier to recognize clickbait (e.g. “17 Facts You Won’t Believe Are True”, “18 Pugs Who Demand To Be Taken Seriously”, etc.). At the end of this article is a fun interactive classifier where you can find the posterior probability of a new headline being clickbait.

Musical Staircase • APR-MAY 2013

Built a musical staircase using an Arduino Uno and 16 pairs of lasers and photoresistors. Featured in Reed Magazine. Here is a video of the staircase in action.