We spend a lot of time training models that can barely fit 1-4 samples per GPU, yet SGD usually needs more than a few samples per batch for decent results. Here is a post gathering the practical tips we use, from simple tricks (a sketch of the simplest one follows) to multi-GPU code and distributed setups.
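As a taste, here is a minimal sketch of gradient accumulation, one of the simple tricks covered in the post: run several small forward/backward passes before each optimizer step, so the effective batch size is a multiple of what fits in GPU memory. The model, data, and `accumulation_steps` value below are toy stand-ins, not the post's exact code.

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()
accumulation_steps = 8  # effective batch = per-step batch * 8

optimizer.zero_grad()
for i in range(64):
    # Tiny mini-batch of 2 samples, standing in for what fits on one GPU
    inputs, labels = torch.randn(2, 10), torch.randint(0, 2, (2,))
    # Normalize so the accumulated gradient averages over the large batch
    loss = loss_fn(model(inputs), labels) / accumulation_steps
    loss.backward()                # gradients accumulate in param.grad
    if (i + 1) % accumulation_steps == 0:
        optimizer.step()           # one update per 8 mini-batches
        optimizer.zero_grad()
```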
How you can make your Python NLP module 50-100 times faster by using spaCy's internals and a bit of Cython magic! Comes with a Jupyter notebook with examples processing over 80 million words per second!
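To give the flavor of the trick, here is a sketch in the spirit of the post, assuming spaCy 2.x internals: iterate over the `TokenC` C array behind a `Doc` and compare integer hashes instead of Python strings, releasing the GIL inside the loop. It's written as a notebook `%%cython` cell; `count_word` and `fast_count` are illustrative names, not the post's exact code.

```python
%%cython
from spacy.tokens.doc cimport Doc
from spacy.typedefs cimport hash_t
from spacy.structs cimport TokenC

cdef int fast_count(TokenC* tokens, int n, hash_t word) nogil:
    # Pure-C loop over spaCy's internal token array: no Python objects, no GIL
    cdef int i, count = 0
    for i in range(n):
        if tokens[i].lex.lower == word:
            count += 1
    return count

def count_word(Doc doc, str word):
    # Hash the query once, then the hot loop only compares 64-bit integers
    cdef hash_t h = doc.vocab.strings.add(word)
    return fast_count(doc.c, doc.length, h)
```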
A post summarizing recent developments in Universal Word/Sentence Embeddings over 2017 and early 2018, along with future trends: ELMo, InferSent, Google's Universal Sentence Encoder, learning by multi-tasking...
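For reference, the simplest baseline the post measures these models against can be stated in a few lines: a sentence embedding as the average of pretrained word vectors. The `vectors` lookup below is a hypothetical `{word: np.ndarray}` dictionary (e.g. loaded from GloVe), not an API from any of the libraries above.

```python
import numpy as np

def average_embedding(sentence, vectors, dim=300):
    # Bag-of-words baseline: mean of the pretrained vectors of known words
    words = [vectors[w] for w in sentence.lower().split() if w in vectors]
    if not words:
        return np.zeros(dim)  # no known word: fall back to the zero vector
    return np.mean(words, axis=0)
```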
To introduce the work we presented at ICLR 2018, we drafted a visual and intuitive introduction to Meta-Learning. In this post, we start by explaining what meta-learning is, then code a meta-learning model in PyTorch and share some of the lessons learned on this project.
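To make the inner/outer-loop structure concrete, here is a self-contained first-order sketch on a toy regression task. This is a MAML-style simplification chosen for brevity, not the model from the post or the ICLR 2018 paper; the task family and step sizes are hypothetical.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(1, 1)                                 # the learner's initial weights
meta_opt = torch.optim.Adam(model.parameters(), lr=1e-3)
inner_lr = 0.01

def sample_task():
    # Hypothetical task family: regress y = a*x with a random slope per task
    a = torch.randn(1)
    x = torch.randn(16, 1)
    return (x[:8], a * x[:8]), (x[8:], a * x[8:])       # (support set, query set)

for step in range(1000):
    (xs, ys), (xq, yq) = sample_task()
    # Inner loop: one SGD step on the support set gives task-adapted weights
    loss = nn.functional.mse_loss(model(xs), ys)
    grads = torch.autograd.grad(loss, model.parameters())
    w, b = [p - inner_lr * g for p, g in zip(model.parameters(), grads)]
    # Outer loop: the query loss under adapted weights updates the initial weights
    query_loss = nn.functional.mse_loss(xq @ w.t() + b, yq)
    meta_opt.zero_grad()
    query_loss.backward()
    meta_opt.step()
```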