arxiv:2306.11644

Textbooks Are All You Need

Published on Jun 20, 2023
· Featured in Daily Papers on Jun 21, 2023

Abstract

We introduce phi-1, a new large language model for code, with significantly smaller size than competing models: phi-1 is a Transformer-based model with 1.3B parameters, trained for 4 days on 8 A100s, using a selection of "textbook quality" data from the web (6B tokens) and synthetically generated textbooks and exercises with GPT-3.5 (1B tokens). Despite this small scale, phi-1 attains pass@1 accuracy 50.6% on HumanEval and 55.5% on MBPP. It also displays surprising emergent properties compared to phi-1-base, our model before our finetuning stage on a dataset of coding exercises, and phi-1-small, a smaller model with 350M parameters trained with the same pipeline as phi-1 that still achieves 45% on HumanEval.
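
For context on the reported numbers: pass@1 here is the standard pass@k metric from the HumanEval benchmark (Chen et al., 2021), estimated from n sampled completions per problem. A minimal sketch of that estimator in Python (this is the published formula, not phi-1's own evaluation code):

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator from the HumanEval paper (Chen et al., 2021).

    n: total completions sampled for a problem
    c: completions that pass the unit tests
    k: the k in pass@k
    """
    if n - c < k:
        return 1.0
    # 1 - C(n-c, k) / C(n, k), computed as a numerically stable running product
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

# Example: 200 samples for one problem, 101 of them pass the tests
print(pass_at_k(n=200, c=101, k=1))  # ~0.505, i.e. ~50.5% pass@1
```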

Community

Will the source code and dataset be published?

Any chance of the models being published?

I have been saying this from the start. I don't know why we were training LLMs on garbage conversations from humans.
We could train LLMs to be experts in kung fu and then install them in our brains with a 'neuralink', like in The Matrix.

Just get every book off of Z-Library (including academic papers) and shove that through an NLP model.

Very interesting project. Still wondering: will the source code and dataset be made public?

I see the model being deployed as an Azure OpenAI service; I don't think it will be public.

This idea has been around for a long time. I think the paper should have cited this: https://youtu.be/WnTKllDbu5o?t=41

@sanchann Is this available? Why do they say "releases" when it is not available anywhere?

Are the code or model weights released anywhere? I could not find them on the Internet.

Found this dataset that is inspired by the paper, but it is not clear how it was created:
https://huggingface.co/datasets/nampdn-ai/tiny-codes
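
If you want to inspect it yourself, here is a minimal sketch using the Hugging Face `datasets` library; streaming avoids a full download, and the `"train"` split name is an assumption, so check the dataset card for the actual schema:

```python
from datasets import load_dataset

# Stream a few records instead of downloading the whole dataset.
# The "train" split is an assumption; see the dataset card at
# https://huggingface.co/datasets/nampdn-ai/tiny-codes for the real layout.
ds = load_dataset("nampdn-ai/tiny-codes", split="train", streaming=True)

for example in ds.take(3):
    print(example)  # print raw records to discover the available fields
```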

teleprint-me/phi-1, a small community snippet based on phi-1.

HF code and dataset?

Where can we get the synthetic textbook datasets that were used to train phi-1?
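
As far as I can tell, neither the synthetic textbook data nor the GPT-3.5 prompts have been released. Purely for illustration, here is a rough sketch of the kind of generation call the paper describes, using the OpenAI Python client; the prompt wording, model choice, and sampling settings are my own guesses, not the authors' pipeline:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Guessed prompt in the spirit of the paper's "textbooks and exercises";
# the actual phi-1 prompts and topic-seeding strategy are not public.
prompt = (
    "Write a short, self-contained textbook section that teaches one Python "
    "concept (for example, list comprehensions), followed by one exercise "
    "with a reference solution and a brief explanation."
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
    temperature=0.8,
)
print(response.choices[0].message.content)
```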
