---
license: wtfpl
datasets:
- cakiki/rosetta-code
language:
- en
metrics:
- accuracy
library_name: transformers
pipeline_tag: text-classification
tags:
- code
- programming-language
- code-classification
base_model: huggingface/CodeBERTa-small-v1
---

This model is a fine-tuned version of *huggingface/CodeBERTa-small-v1* on the *cakiki/rosetta-code* dataset, covering the 26 programming languages listed below (a usage sketch follows the list).

## Training Details

The model was trained for 25 epochs on Azure on roughly 26,000 datapoints covering the 26 languages below, extracted from a dataset that spans 1,006 programming languages in total.

### Programming languages this model can detect
  1. ARM Assembly
  2. AppleScript
  3. C
  4. C#
  5. C++
  6. COBOL
  7. Erlang
  8. Fortran
  9. Go
  10. Java
  11. JavaScript
  12. Kotlin
  13. Lua
  14. Mathematica/Wolfram Language
  15. PHP
  16. Pascal
  17. Perl
  18. PowerShell
  19. Python
  20. R
  21. Ruby
  22. Rust
  23. Scala
  24. Swift
  25. Visual Basic .NET
  26. jq
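
Because the card declares `pipeline_tag: text-classification`, the model can be used through the `transformers` pipeline API. Below is a minimal usage sketch; the repo id is a placeholder (it is not given in this section) and should be replaced with this model's actual Hugging Face Hub id.

```python
from transformers import pipeline

# Placeholder repo id -- substitute the actual Hub id of this model.
classifier = pipeline("text-classification", model="<user>/<this-model>")

snippet = '#include <stdio.h>\nint main(void) { printf("Hi\\n"); return 0; }'
print(classifier(snippet))
# Expected output shape: [{'label': 'C', 'score': 0.99...}]
```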

## Training Results

Below are the training results for 25 epochs.

Training machine configuration: 1x NVIDIA Tesla T4 GPU (16 GB VRAM), 112 GB RAM, 6 CPU cores.

Training time: exactly 7 hours for 25 epochs.

Training hyper-parameters:

![image/png](https://cdn-uploads.huggingface.co/production/uploads/645c859ad90782b1a6a3e957/yRqjKVFKZIT_zXjcA3yFW.png)

![training detail.png](https://cdn-uploads.huggingface.co/production/uploads/645c859ad90782b1a6a3e957/Oi9TuJ8nEjtt6Z_W56myn.png)
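
The exact training script is not published in this card; the sketch below shows how such a fine-tune could be reproduced with the `Trainer` API, assuming the public `cakiki/rosetta-code` schema (`language_name` and `code` columns). Batch size, learning rate, and sequence length are illustrative assumptions; the actual hyper-parameters are shown in the screenshots above.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

LANGUAGES = [
    "ARM Assembly", "AppleScript", "C", "C#", "C++", "COBOL", "Erlang",
    "Fortran", "Go", "Java", "JavaScript", "Kotlin", "Lua",
    "Mathematica/Wolfram Language", "PHP", "Pascal", "Perl", "PowerShell",
    "Python", "R", "Ruby", "Rust", "Scala", "Swift", "Visual Basic .NET", "jq",
]
label2id = {lang: i for i, lang in enumerate(LANGUAGES)}
id2label = {i: lang for lang, i in label2id.items()}

# Keep only the 26 target languages. The merged "Mathematica/Wolfram
# Language" label may need a mapping from the dataset's own language
# names; the exact mapping is assumed here.
ds = load_dataset("cakiki/rosetta-code", split="train")
ds = ds.filter(lambda row: row["language_name"] in label2id)

tokenizer = AutoTokenizer.from_pretrained("huggingface/CodeBERTa-small-v1")

def preprocess(batch):
    enc = tokenizer(batch["code"], truncation=True, max_length=512)
    enc["labels"] = [label2id[lang] for lang in batch["language_name"]]
    return enc

ds = ds.map(preprocess, batched=True, remove_columns=ds.column_names)

model = AutoModelForSequenceClassification.from_pretrained(
    "huggingface/CodeBERTa-small-v1",
    num_labels=len(LANGUAGES),
    label2id=label2id,
    id2label=id2label,
)

args = TrainingArguments(
    output_dir="codeberta-language-id",
    num_train_epochs=25,              # from the card
    per_device_train_batch_size=16,   # assumed
    learning_rate=2e-5,               # assumed
)

Trainer(model=model, args=args, train_dataset=ds, tokenizer=tokenizer).train()
```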