"Data Science"


  • Natural Language Processing / Speech Technology

Here's a speaker recognition challenge that I did for ava.me, a startup working on a captioning app for the hearing impaired.

For the Coursera Data Science Specialization's Capstone Project, I used text corpora to build a model that predicts a word based on sentential context. Here's the app, a brief presentation and the R code.



  • Machine Learning

I have thoroughly enjoyed the soup-to-nuts approach taken by the Coursera Machine Learning Specialization so far, especially the math behind algorithms like AdaBoost and stochastic gradient ascent.

In addition to my day job, I am a part-time Kaggler and am thinking about trying this UN Millennium Goals challenge next.



  • Demos and interests

Here's a brief tutorial on the R package data.table, including advanced munging and modeling on a table with ~13 million rows, all run on my MacBook Air!

I'm reading up on algorithmic ethnic bias in recidivism predictions, machine learning's nefarious impact on inequality, and big data and social justice. I'm looking for a way to get involved, so watch this space!






For more examples of data analysis, please see my