The Data Den

Projects on Math + Visualization + Machine Learning
by Alexandru Papiu

Cross Validation Error Pitfalls

Let’s say you have 10 models that you want to compare, and all of them have roughly the same cross-validation error distribution: the cross-validation mean squared error (CV MSE) is normally distributed with mean 3 and standard deviation 0.2. Since CV error is an average of a bunch... [Read More]
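The teaser is cut off, so the following is only a guess at the kind of pitfall this setup leads to: if you pick the model with the lowest CV error, that lowest score tends to look better than the true mean of 3. Here is a minimal simulation sketch of that idea. It assumes the 10 CV errors are independent draws from the same normal distribution, which the excerpt does not state; the function name and the number of trials are made up for illustration.

```python
import random
import statistics


def simulate_best_cv_error(n_models=10, mean=3.0, sd=0.2, n_trials=20000, seed=0):
    """Average of the lowest CV MSE among n_models, over many simulated runs.

    Assumption (not from the post): each model's CV MSE is an independent
    draw from Normal(mean, sd).
    """
    rng = random.Random(seed)
    best_scores = []
    for _ in range(n_trials):
        # One simulated experiment: a CV MSE for each of the models.
        errors = [rng.gauss(mean, sd) for _ in range(n_models)]
        # "Model selection": keep only the lowest error.
        best_scores.append(min(errors))
    return statistics.mean(best_scores)


avg_best = simulate_best_cv_error()
print(f"average best-of-10 CV MSE: {avg_best:.3f} (every model's true mean is 3.0)")
```

Under these assumptions the winning score averages around 2.7 even though no model is actually better than 3, so the selected model's CV error is an optimistic estimate of how it will perform.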

Singlehood in America - a look at the ACS Census

I will be looking at data from the American Community Survey and trying to find patterns in the way Americans date, marry, and divorce. This is still a work in progress, but I figured I’d add some plots and maps to begin with. ... [Read More]