The MNIST dataset is the bread and butter of deep learning. Featuring 70,000 handwritten digits partitioned into a training set and a testing set, the dataset is the go-to candidate for a large proportion of introductory tutorials, benchmarking tests, and data science showcases. This post questions the suitability of the dataset for such uses, arguing that the challenge it presents is excessively simple when tackled with modern machine learning tools. Additionally, we look at alternatives to the dataset that offer a more appropriate challenge without fundamentally changing the learning problem.
Microsoft Excel is a powerful tool for solving problems in many disciplines. It does, however, have its limitations; there are simply some problems that should not be solved in Excel. This is the first post in a new series in which we ignore these shortcomings and push Excel to its limits to solve problems that it was never meant for.
Timezones are strange things. Be it Chatham Island's 45-minute offset or the West Bank's ethnically divided use of daylight saving, it almost seems like the timezones of the world were chosen to baffle. In this post we ask which capital city has a timezone that differs the most from what would be expected given its longitude. Any guesses?
2020 is here, and one of my goals for the coming year is to finally get caught up on the XKCD comic series. Starting from the beginning is a dull way of doing things, so instead I've taken advantage of Google Cloud Platform's Cloud Scheduler to set up a Python script to email me a random selection of new comics each day. In this post I will share how you can do the same.
After yesterday's post drawing Christmas trees with Python, it's time to give R a chance to shine. In this post, I use the shiny and ggvis packages to build a webapp for generating parametric snowflakes.
Christmas is here but that's no excuse to stop coding. In the second installment of the bank holiday bodge series, there will be a major change in format but the principle will stay the same: showcasing a rough piece of work brought to fruition in a single day. This post concerns the use of parametric equations and the animation module from matplotlib to generate your own ornamented Christmas tree animation.
Don't waste your tin foil wrapping up your turkey when fashioning it into a hat is far more necessary this Christmas. In this post I will reveal how advent calendar makers the world over have been lying to you. So launch your VPN and delete your cookies; we're about to take down Big Advent.
Reinforcement learning is currently a hot topic in the world of data science. In this post, we look at how concepts from this area, in particular effective policies for the multi-armed bandit problem, can be applied to a job application assessment run by pymetrics.
ggplot2 is an amazing tool for building beautiful visualisations using a simple and coherent grammar, at least when it wants to play nice. Sadly, this is not always the case, and one can find oneself developing strange workarounds to overcome the limitations of the package. This post discusses one of these approaches, used to facilitate the correct ordering of factors within a faceted plot.