Link Post

Lessons Learned Reproducing a Deep Reinforcement Learning Paper


Matthew Rahtz:

I’ve seen a few recommendations that reproducing papers is a good way of levelling up machine learning skills, and I decided this could be an interesting one to try with. It was indeed a super fun project, and I’m happy to have tackled it – but looking back, I realise it wasn’t exactly the experience I thought it would be.

This whole post got me really excited, despite the difficulties mentioned. It also took me back to the mindset I needed during my PhD. When it takes a couple of hours to figure out if something worked, you need to spend a lot more time considering what that something should be. Reimplementing someone else’s paper can be a really great exercise. You gain intimate knowledge of how the system works. You level up your skills. Most importantly: you get ideas for what to try next.

Reading the post, I suspect the first thing I would do differently is aim lower than he did:

The first surprise was in terms of calendar time. My original estimate was that as a side project it would take about 3 months. It actually took around 8 months. (And the original estimate was supposed to be pessimistic!) Some of that was down to underestimating how many hours each stage would take, but a big chunk of the underestimate was failing to anticipate other things coming up outside the project. It’s hard to say how well this generalises, but for side projects, taking your original (already pessimistic) time estimates and doubling them might not be a bad rule-of-thumb.

That’s more or less the standard rule of thumb for all engineering projects, in my experience. Even so, eight months is a long time to spend on a side project. Like I said above: I’ll be aiming lower. My plan of attack starts with the 2018 edition of fast.ai’s “Cutting Edge Deep Learning for Coders” when it’s released later this year.