We created this visualization in D3 to show how the placement of faculty can affect the spread of research ideas in science. Kenzie Weller did all the development and won a Data Visualization Contest!
We released an open-source dataset of the paid parental leave policies for 205 US and Canadian universities. The accompanying website features a neat visualization in D3 and some preliminary analyses for those interested. This was joint work with Sam F. Way, Aaron Clauset, Dan Larremore, and Mirta Galesic.
Building off of our baking competition, I organized a salsa contest this semester. Sixteen dips were entered, and two prizes were awarded: best mild and best hot. To prepare for the competition, I attempted to make a data-driven salsa by mining salsa recipes online and looking for ingredients that are predictive of highly rated recipes. Details of this analysis and the supplementary code can be found here.
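The ingredient-mining step could be sketched like this — a minimal version assuming a simple smoothed log-odds score, with a toy recipe list standing in for the scraped data (the post doesn't specify which method was actually used):

```python
import math
from collections import Counter

# Hypothetical toy data standing in for the scraped recipes: each entry is
# (set of ingredients, 1 if the recipe was highly rated, else 0).
recipes = [
    ({"tomato", "cilantro", "lime", "jalapeno"}, 1),
    ({"tomato", "onion", "lime"}, 1),
    ({"tomato", "onion", "garlic"}, 0),
    ({"mango", "cilantro", "lime"}, 1),
    ({"tomato", "vinegar", "onion"}, 0),
    ({"tomatillo", "cilantro", "garlic"}, 1),
]

def predictive_ingredients(recipes, smoothing=1.0):
    """Rank ingredients by smoothed log-odds of appearing in a highly
    rated recipe versus a low-rated one."""
    hi, lo = Counter(), Counter()
    n_hi = sum(1 for _, rated in recipes if rated == 1)
    n_lo = len(recipes) - n_hi
    for ingredients, rated in recipes:
        (hi if rated == 1 else lo).update(ingredients)
    scores = {
        ing: math.log((hi[ing] + smoothing) / (n_hi + 2 * smoothing))
           - math.log((lo[ing] + smoothing) / (n_lo + 2 * smoothing))
        for ing in set(hi) | set(lo)
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Positive scores mark ingredients over-represented in highly rated salsas.
for ingredient, score in predictive_ingredients(recipes)[:3]:
    print(ingredient, round(score, 2))
```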
Eight participants contributed a batch of cookies. Each cookie was judged by thirteen critics in four categories: most creative, best looking, best tasting, and best texture. While there are clear standouts shown in the histograms below, they were all good cookies. Code to generate these figures can be found here.
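The tallying behind figures like these could be sketched as follows — a minimal version with hypothetical ballots (three shown for brevity) standing in for the real judging data, and text bars in place of plotted histograms:

```python
from collections import Counter

# Hypothetical ballots: each critic names a favorite baker per category.
categories = ["most creative", "best looking", "best tasting", "best texture"]
ballots = [
    {"most creative": "A", "best looking": "B", "best tasting": "A", "best texture": "C"},
    {"most creative": "A", "best looking": "A", "best tasting": "B", "best texture": "C"},
    {"most creative": "D", "best looking": "B", "best tasting": "A", "best texture": "A"},
]

def tally(ballots, category):
    """Count votes per baker in one category, ready to plot as a histogram."""
    return Counter(ballot[category] for ballot in ballots)

for category in categories:
    counts = tally(ballots, category)
    bars = "  ".join(f"{baker}:{'#' * n}" for baker, n in sorted(counts.items()))
    print(f"{category:>14}  {bars}")
```

With real data, each `tally` result would feed straight into a bar chart, one panel per category.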
Inspired by Seth Bling’s (super awesome) MarI/O, we implemented Kenneth Stanley and Risto Miikkulainen’s NeuroEvolution of Augmenting Topologies (NEAT) algorithm to try to win Flappy Bird. NEAT is a genetic algorithm for evolving neural networks, and it relies on three key principles: (1) tracking the history of genes to determine suitable networks to mate (called crossover), (2) protecting structural innovations by grouping similar networks into species so they have time to improve (called speciation), and (3) starting from the simplest neural networks possible and complexifying only out of necessity.
Unlike Super Mario World, Flappy Bird presents a randomly generated playing field, posing an interesting challenge for NEAT. Our code was built on top of an existing Flappy Bird pygame, but that’s about it: we implemented our own neural networks. If you’d like to learn more, here is a video of our final presentation.
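The speciation step described above could be sketched roughly as follows — a simplified version assuming genomes are just dicts from innovation number (the historical marking) to connection weight, with made-up coefficients and threshold; real NEAT genomes also carry node genes, enabled flags, and excess/disjoint distinctions:

```python
def compatibility(g1, g2, c_mismatch=1.0, c_weight=0.4):
    """Simplified NEAT compatibility distance: count of non-matching genes
    (normalized by genome size) plus mean weight difference of matching genes."""
    matching = set(g1) & set(g2)
    mismatched = len(set(g1) ^ set(g2))
    n = max(len(g1), len(g2), 1)
    w_diff = (sum(abs(g1[i] - g2[i]) for i in matching) / len(matching)) if matching else 0.0
    return c_mismatch * mismatched / n + c_weight * w_diff

def speciate(population, threshold=1.0):
    """Group genomes into species by distance to each species' representative,
    so novel topologies compete within their own niche first."""
    species = []  # list of member lists; the first member is the representative
    for genome in population:
        for members in species:
            if compatibility(genome, members[0]) < threshold:
                members.append(genome)
                break
        else:
            species.append([genome])
    return species
```

Because mismatched genes are identified by their innovation numbers, two genomes that evolved the same connection independently still count as matching — that is the historical-marking idea from principle (1) doing double duty in speciation.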
Often in document classification, a document may have more than one relevant classification – a question on Stack Overflow might have the tags go, map, and interface. While multinomial Bayesian classification offers a one-of-many classification, multibayes offers tools for many-of-many classification.
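One common way to get many-of-many behavior is to make an independent binary naive Bayes decision per tag and return every tag whose posterior clears a threshold. A minimal sketch of that idea (multibayes itself is a Go library; this Python version only illustrates the technique, and the class name and toy documents are made up):

```python
import math
from collections import Counter

class MultiLabelNaiveBayes:
    """Many-of-many classifier: one binary naive Bayes decision per tag."""

    def __init__(self):
        self.docs = []  # list of (word list, tag set) training pairs

    def train(self, words, tags):
        self.docs.append((list(words), set(tags)))

    def posterior(self, tag, words):
        """P(tag | words) from a binary naive Bayes model with Laplace smoothing."""
        pos = [w for ws, ts in self.docs if tag in ts for w in ws]
        neg = [w for ws, ts in self.docs if tag not in ts for w in ws]
        vocab = {w for ws, _ in self.docs for w in ws} | set(words)
        prior = sum(1 for _, ts in self.docs if tag in ts) / len(self.docs)
        cp, cn = Counter(pos), Counter(neg)
        lp = math.log(prior) if prior > 0 else -math.inf
        ln = math.log(1 - prior) if prior < 1 else -math.inf
        for w in words:
            lp += math.log((cp[w] + 1) / (len(pos) + len(vocab)))
            ln += math.log((cn[w] + 1) / (len(neg) + len(vocab)))
        m = max(lp, ln)  # normalize in log space for numerical stability
        return math.exp(lp - m) / (math.exp(lp - m) + math.exp(ln - m))

    def classify(self, words, threshold=0.5):
        """Return every tag whose posterior clears the threshold (many-of-many)."""
        tags = {t for _, ts in self.docs for t in ts}
        return {t for t in tags if self.posterior(t, words) > threshold}

clf = MultiLabelNaiveBayes()
clf.train(["go", "map", "channel"], {"go"})
clf.train(["go", "interface"], {"go"})
clf.train(["python", "list"], {"python"})
print(clf.classify(["go", "map"]))
```

A document can clear the threshold for zero, one, or several tags, which is exactly the many-of-many behavior a one-of-many multinomial classifier can't express.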
A simple use case for our naive Bayesian classifier is described in “Catching Clickbait: Using a Naive Bayesian Classifier in Go”. Inspired by Paul Graham’s “Plan for Spam”, I scraped 10,000 headlines to train our classifier to recognize clickbait (e.g. “17 Facts You Won’t Believe Are True”, “18 Pugs Who Demand To Be Taken Seriously”, etc). At the end of this article is a fun interactive classifier where you can compute the posterior probability of a new headline being clickbait.
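The posterior for a new headline can be sketched in the Graham style — combine per-word probabilities as P = Πp / (Πp + Π(1−p)). The word probabilities below are invented for illustration; the real values would come from the scraped headlines:

```python
import math

# Hypothetical per-word probabilities P(clickbait | word); the real model
# would estimate these from the ~10,000 scraped headlines.
p_clickbait = {
    "17": 0.92, "facts": 0.81, "believe": 0.88, "pugs": 0.90,
    "researchers": 0.12, "algorithm": 0.08, "study": 0.15,
}

def posterior(headline, prior=0.5, default=0.4):
    """Naive-Bayes-style combination of per-word probabilities, computed in
    log space; unseen words fall back to a neutral-ish default."""
    log_p = math.log(prior)        # evidence for "clickbait"
    log_q = math.log(1 - prior)    # evidence for "not clickbait"
    for word in headline.lower().split():
        p = p_clickbait.get(word, default)
        log_p += math.log(p)
        log_q += math.log(1 - p)
    return math.exp(log_p) / (math.exp(log_p) + math.exp(log_q))

print(round(posterior("17 facts you won't believe"), 3))
print(round(posterior("new study from researchers"), 3))
```

A headline full of high-probability words drives the posterior toward 1, which is what the interactive classifier at the end of the article reports.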