Textbooks Are All You Need

Published on Jun 20, 2023
· Featured in Daily Papers on Jun 21, 2023


We introduce phi-1, a new large language model for code, with significantly smaller size than competing models: phi-1 is a Transformer-based model with 1.3B parameters, trained for 4 days on 8 A100s, using a selection of "textbook quality" data from the web (6B tokens) and synthetically generated textbooks and exercises with GPT-3.5 (1B tokens). Despite this small scale, phi-1 attains pass@1 accuracy 50.6% on HumanEval and 55.5% on MBPP. It also displays surprising emergent properties compared to phi-1-base, our model before our finetuning stage on a dataset of coding exercises, and phi-1-small, a smaller model with 350M parameters trained with the same pipeline as phi-1 that still achieves 45% on HumanEval.
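The pass@1 numbers above follow the standard functional-correctness evaluation used for HumanEval. For readers unfamiliar with the metric, below is a minimal sketch of the unbiased pass@k estimator introduced in the Codex evaluation methodology (which HumanEval scoring typically uses); this is illustrative background, not code from the phi-1 paper, and the function name is my own.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples,
    drawn without replacement from n generations of which c are
    correct, passes the unit tests for a problem."""
    if n - c < k:
        # Fewer incorrect samples than k draws: a correct one is guaranteed.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 10 generations per problem, 5 correct, evaluated at k=1
# gives 0.5; a benchmark score averages this quantity over problems.
print(pass_at_k(10, 5, 1))
```

For pass@1 with a single generation per problem (n=1, k=1), the estimator reduces to the plain fraction of problems whose one sample passes all tests.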


Will the source code and dataset be published?

Any chance of the models being published?


I have been saying this from the start. I don't know why we have been training LLMs on garbage conversations from humans.
We could train LLMs to be experts in kung fu and then install them in our brains with a 'neuralink', like in the Matrix.

Just get every book off of Z-Library (including academic papers) and shove that through an NLP model.

Very interesting project. I still wonder whether the source code and dataset will be public.

I see the model being deployed as an Azure OpenAI service, so I don't think it will be public.

This idea has been around for a long time. I think the paper should have cited this:

@sanchann Is this available? Why do they say "releases" when it is not available anywhere?

Are the code or model weights released anywhere? I could not find them on the internet.

Found this dataset that is inspired by the paper, but it is not clear how it was created:

teleprint-me/phi-1 , a small snippet inspired by phi-1.

HF code and dataset?

Where can we get the synthetic textbook datasets that were used to train phi-1?


Models citing this paper 1

Datasets citing this paper 12


Spaces citing this paper 0


Collections including this paper 26