Do do-do do do-do, do-do do-do do-do da-do.
Let’s do the time warp again.
Where in Ireland can you see the most counties at once?
Optimise a function with probabilities.
A fun game to play in the car.
Drinking your way around Dublin.
A fun game for all the family.
Are songs getting more repetitive?
A big map of many bands.
Or maybe, more accurately, Functional Programming by a data scientist.
Teanga ríomhchlárúcháin as Gaeilge.
What’s in a norm?
Sometimes you’ve gotta go where everybody knows your name.
Summertime is coming so I’ve built the best road-trip of Ireland.
In this post I want to wrap up the short series on GPs by discussing the common GP packages in Python, and showing how to model some real data.
In this post I want to take a deeper look at what a Gaussian Process is doing under the hood.
Gaussian Processes (GPs) are a really flexible, really powerful class of models that have seen a lot of use in the last fifteen or twenty years. In this post I’d like to go through the background to these models, what they are, and how they work, primarily so I can try to straighten out some of these ideas in my own head.
With the 2018 Premier League season coming to an end I thought it would be fun to take a quick look at some historic PL data.
When I first moved to London I spent a lot of my time being lost. It’s a big place, and since I was on the Tube a lot I found it tough to get a good intuition for where things are overground. So I did what I normally do when I don’t understand something: I got some data and I did some analysis.
Joy Plots are a really handy way to get a quick, qualitative impression of a data set. They also look really cool. So I decided to put together a Python package to make them.
In classification problems there are a lot of metrics that you can look at, but they all have blind-spots. It can be really easy to build a model that scores well on some metric but ends up not being useful to the problem you’re trying to solve. When you see a metric for some model it’s useful to bear in mind what this metric measures, and equally importantly, what it doesn’t.
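As a minimal sketch of one such blind spot, consider accuracy on an imbalanced data set. The labels below are made up for illustration (5 positives out of 100), and the helper functions `accuracy` and `recall` are hypothetical, written here rather than taken from any library. A model that always predicts the majority class scores well on accuracy while finding none of the positives.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred):
    """Fraction of true positives (label 1) that the model predicted as 1."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == 1]
    if not preds_for_positives:
        return 0.0
    return sum(preds_for_positives) / len(preds_for_positives)


# Illustrative, made-up data: 5 positive cases and 95 negative cases.
y_true = [1] * 5 + [0] * 95
# A "model" that ignores its input and always predicts the negative class.
y_pred = [0] * 100

print("accuracy:", accuracy(y_true, y_pred))
print("recall:  ", recall(y_true, y_pred))
```

Accuracy only measures how often the prediction matches the label, so it says nothing about which class the errors fall in. Recall measures exactly that for the positive class, but has blind spots of its own, since a model that predicts positive for everything gets perfect recall.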
They say not to judge a book by its cover, but what about an album? What does the cover tell you about the contents? Can you reliably figure out the genre of an album just by looking at it? My feeling is that you could, to some extent. If you showed me an album cover, out of context, I feel like I could take a decent stab at the kind of music that’s on it. So I wrote some code to try to quantify this.
Making a histogram out of some data is one of those things that sounds really straightforward at first and gets progressively more complicated the more you think about it. To define things clearly, let’s say we have a vector, $x$, of $n$ observations, and we want to break up the number line into a series of buckets and count up how many $x$’s fall in each one. How would we do that?
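One simple answer, sketched below, is to use equal-width buckets spanning the range of the data. This is only an illustration of the basic idea: the function name `histogram` and the parameter `n_bins` are assumptions made for this example, and the choice of equal-width buckets (and how many of them) is exactly the kind of decision that gets complicated on closer inspection.

```python
def histogram(x, n_bins):
    """Count how many values of x fall into each of n_bins equal-width buckets.

    Returns (counts, edges), where edges has n_bins + 1 entries.
    """
    lo, hi = min(x), max(x)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for value in x:
        if width > 0:
            # The maximum value would land at index n_bins, so clamp it
            # into the last bucket instead.
            idx = min(int((value - lo) / width), n_bins - 1)
        else:
            # All observations are identical: put everything in bucket 0.
            idx = 0
        counts[idx] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return counts, edges


counts, edges = histogram([0, 1, 2, 3, 4], n_bins=2)
print(counts, edges)
```

Each bucket here is closed on the left and open on the right, except the last one, which also includes the maximum. Even this small sketch has to take a position on where the edges go and what happens to values that sit exactly on them.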