Matthew Kenney PRO

matthewkenney

AI & ML interests

All of my models can be found at ArtifactAI

Organizations

Posts 1

Introducing ArtifactAI/arxiv_deep_learning_python_research_code_functions_summaries

The dataset contains 778,152 summaries, one for each deep learning Python function and class extracted from repositories referenced in arXiv papers:

ArtifactAI/arxiv_deep_learning_python_research_code_functions_summaries

The names of 34,099 active GitHub repositories were extracted from [arXiv](https://arxiv.org) papers published from arXiv's inception through July 21st, 2023, totaling 42 GB of compressed GitHub repositories.
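As a rough illustration of the extraction step, here is a minimal sketch that pulls `owner/repo` names out of paper text with a regular expression. The pattern and the `extract_repo_names` helper are assumptions made for this example; the post does not say how repository names were actually matched or how "active" repositories were verified.

```python
import re
from typing import List

# Hypothetical pattern: matches github.com/<owner>/<repo> anywhere in free text.
GITHUB_REPO_RE = re.compile(r"github\.com/([A-Za-z0-9_.-]+)/([A-Za-z0-9_.-]+)")


def extract_repo_names(text: str) -> List[str]:
    """Return unique 'owner/repo' names in order of first appearance."""
    seen = {}
    for owner, repo in GITHUB_REPO_RE.findall(text):
        # Drop a sentence-ending period and a '.git' suffix from clone URLs.
        repo = repo.rstrip(".")
        if repo.endswith(".git"):
            repo = repo[:-4]
        if repo:
            seen.setdefault(f"{owner}/{repo}", None)
    return list(seen)


sample = (
    "Code is at https://github.com/foo/bar. "
    "See also github.com/foo/bar and https://github.com/a-b/c_d.git"
)
repos = extract_repo_names(sample)
print(repos)
```

Checking that each extracted repository still exists would need a separate request to GitHub, which this sketch leaves out.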

These repositories were then filtered for deep learning Python code, and functions and classes were extracted. A summary was generated for each function and class using Google Gemma 7B (google/gemma-7b).
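One way to extract functions and classes from a Python file is the standard library `ast` module. The sketch below is an assumption about how such a step could look, not the pipeline actually used: `extract_definitions` is a hypothetical helper, and it only collects top-level definitions, whereas the post does not say whether methods inside classes were extracted separately.

```python
import ast
from typing import Dict, List


def extract_definitions(source: str) -> List[Dict[str, str]]:
    """Collect top-level functions and classes from Python source code."""
    tree = ast.parse(source)
    records = []
    for node in tree.body:  # top-level statements only (assumption)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # get_source_segment returns the exact source text of the node.
            code = ast.get_source_segment(source, node)
            if code is not None:
                records.append({"function_name": node.name, "function": code})
    return records


example = '''import torch.nn as nn

class Net(nn.Module):
    def forward(self, x):
        return x

def train(model):
    pass
'''

definitions = extract_definitions(example)
for record in definitions:
    print(record["function_name"])
```

The keys `function_name` and `function` mirror the dataset's field names, so each record lines up with one row before a summary is generated.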

Fields:
- prompt (string): prompt used to generate the summary.
- function (string): function or class to summarize.
- function_name (string): name of the function or class.
- file_number (integer): file number.
- tok_prompt (string): formatted prompt used to generate the summary.
- function_summary (string): summary response from the model.
- function_summary_clean: (string): cleaned summary response from the model.
- repo: (string): repo from which the function was extracted.
- file: (string): name of the file.
- full_code: (string): code from the file in which function exists.
- file_length (integer): character length of full_code.
- avg_line_length (integer): average line length of full_code.
- max_line_length (integer): maximum line length of full_code.
- extension_type: (string): file extension (.py).
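The three length fields can be derived from `full_code` alone. The following is a minimal sketch of that computation; `file_stats` is a hypothetical helper, and because the post does not state how the average is rounded, the sketch returns it as an unrounded float.

```python
from typing import Dict, Union


def file_stats(full_code: str) -> Dict[str, Union[int, float]]:
    """Compute character length and per-line length statistics for a file."""
    lengths = [len(line) for line in full_code.splitlines()]
    return {
        "file_length": len(full_code),  # counts newline characters too
        # Unrounded mean; the dataset's rounding rule is not stated (assumption).
        "avg_line_length": sum(lengths) / len(lengths) if lengths else 0.0,
        "max_line_length": max(lengths, default=0),
    }


stats = file_stats("ab\ncdef\n")
print(stats)
```

Statistics like these are commonly used to filter out minified or auto-generated files, which tend to have very long lines.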

models

None public yet

datasets

None public yet