metadata
license: apache-2.0
tags:
- Token Classification
widget:
- text: "The following is a bubble sort implementation taken from TeamTest57/Whack-A-Mole on github. int iro = 0; int score = 0; void bubble_sort() {\n\tint i, j;\n\tfor (i = 0; i < mole_num - 1; i++)\n\t\tfor (j = mole_num - 1; j >= i + 1; j--)\n\t\t\tif (hole_y[j] < hole_y[j - 1]) {\n\t\t\t\tint temp;\n\t\t\t\ttemp = hole_y[j];\n\t\t\t\thole_y[j] = hole_y[j - 1];\n\t\t\t\thole_y[j - 1] = temp;\n\t\t\t\ttemp = hole_x[j];\n\t\t\t\thole_x[j] = hole_x[j - 1];\n\t\t\t\thole_x[j - 1] = temp;\n\t\t\t}\n}"
example_title: example 1
- text: >-
  # Sample animal inherits from custom metaclass

  class Panda(metaclass=CustomMeta):
      """I bet you see this docstring printed as well"""
      fav_food = "Bamboo"
      loves_code = True

      def activity(self):
          print("Zzz...")

  This programming code was taken from cyberpanda/PythonStuff on GitHub and
  is cc0-licensed. It defines a class with member variables and methods.
example_title: example 2
This is a distilbert-base-multilingual-cased model fine-tuned with a NER objective to tag each token according to whether it belongs to a code block or to natural-language text. The dataset consists of 78,210 examples generated by randomly combining code and text blocks from other permissively licensed datasets; some examples contain only code and some contain only regular text.
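The way such examples could be assembled is sketched below. This is a minimal illustration, not the script used to build the dataset: the function name `make_example`, the label names `CODE` and `TEXT`, the equal odds of code-only, text-only, and mixed examples, and the whitespace tokenization are all assumptions made for the sake of the example.

```python
import random


def make_example(code_blocks, text_blocks, rng):
    """Build one synthetic example as parallel lists of tokens and labels."""
    # Hypothetical mix: code only, text only, or one block of each.
    kind = rng.choice(["code", "text", "mixed"])
    segments = []
    if kind in ("code", "mixed"):
        segments.append((rng.choice(code_blocks), "CODE"))
    if kind in ("text", "mixed"):
        segments.append((rng.choice(text_blocks), "TEXT"))
    rng.shuffle(segments)  # random order of code and text within the example

    tokens, labels = [], []
    for block, label in segments:
        words = block.split()  # whitespace split, for illustration only
        tokens.extend(words)
        labels.extend([label] * len(words))
    return tokens, labels


rng = random.Random(0)
code_blocks = ["int score = 0;", "print('Zzz...')"]
text_blocks = ["This snippet defines a class.", "The following is a bubble sort."]
tokens, labels = make_example(code_blocks, text_blocks, rng)
for token, label in zip(tokens, labels):
    print(f"{label}\t{token}")
```

Every token inherits the label of the block it came from, which gives the per-token targets that a token-classification objective needs.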
The model achieves the following metrics on the validation set:
| Metric | Value |
|---|---|
| Loss | 0.0788 |
| F1 Score | 0.8619 |
| Precision | 0.8362 |
| Recall | 0.8893 |
| Accuracy | 0.9792 |
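As a quick sanity check, the reported F1 score can be recomputed as the harmonic mean of the reported precision and recall. The snippet below only verifies that relationship; the card does not state whether the metrics are computed per token or per entity, so no assumption is made about that here.

```python
# Values copied from the validation metrics table.
precision = 0.8362
recall = 0.8893

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.8619
```

The result matches the reported F1 score to four decimal places.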