ljcnju/CodeBertForCodeTrans

@@ -13,14 +13,12 @@ should probably proofread and complete it, then remove this comment. -->
 # CodeBertForCodeTrans
 This model is a fine-tuned version of [microsoft/codebert-base](https://huggingface.co/microsoft/codebert-base) on an unknown dataset.
 ## Model description
-This model is fine-tuned on CodeXGLUE codetrans dataset. It can only translate java code to c-sharp code.
-Prompt:
-```python
-"#translate this java code to c-sharp code:\njava:<Your java code>"
-```
 ## Intended uses & limitations
@@ -35,24 +33,45 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 1e-05
 - train_batch_size: 16
 - eval_batch_size: 16
 - seed: 42
-- gradient_accumulation_steps: 2
-- total_train_batch_size: 32
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 12354.0
 - num_epochs: 20
 ### Training results
 ### Framework versions
-- Transformers 4.33.2
-- Pytorch 2.0.1+cu117
-- Datasets 2.16.1
-- Tokenizers 0.13.3
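
The prompt format quoted in the removed model description (the `-` lines of this side of the diff) can be assembled with a small helper. This is only a sketch: `build_prompt` is a hypothetical name, not part of the repository, and the diff does not show how the resulting string is tokenized or passed to the model.

```python
# Hypothetical helper: wraps Java source in the prompt string quoted
# in the removed "Model description" section of the model card.
PROMPT_PREFIX = "#translate this java code to c-sharp code:\njava:"


def build_prompt(java_code: str) -> str:
    """Return the prompt text for a Java snippet (format taken from the card)."""
    return PROMPT_PREFIX + java_code


if __name__ == "__main__":
    print(build_prompt("public int add(int a, int b) { return a + b; }"))
```

The Java snippet goes directly after `java:` with no extra separator, matching the `java:<Your java code>` placeholder in the original prompt.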

 # CodeBertForCodeTrans
 This model is a fine-tuned version of [microsoft/codebert-base](https://huggingface.co/microsoft/codebert-base) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.0006
 ## Model description
+More information needed
 ## Intended uses & limitations
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 5e-05
 - train_batch_size: 16
 - eval_batch_size: 16
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 12354.0
 - num_epochs: 20
+- mixed_precision_training: Native AMP
 ### Training results
+| Training Loss | Epoch | Step  | Validation Loss |
+|:-------------:|:-----:|:-----:|:---------------:|
+| 5.7169        | 1.0   | 644   | 4.5075          |
+| 3.0571        | 2.0   | 1288  | 2.1423          |
+| 0.7391        | 3.0   | 1932  | 0.2866          |
+| 0.1028        | 4.0   | 2576  | 0.0219          |
+| 0.0158        | 5.0   | 3220  | 0.0047          |
+| 0.0065        | 6.0   | 3864  | 0.0024          |
+| 0.0036        | 7.0   | 4508  | 0.0020          |
+| 0.0028        | 8.0   | 5152  | 0.0014          |
+| 0.0018        | 9.0   | 5796  | 0.0010          |
+| 0.0023        | 10.0  | 6440  | 0.0017          |
+| 0.002         | 11.0  | 7084  | 0.0009          |
+| 0.002         | 12.0  | 7728  | 0.0012          |
+| 0.0015        | 13.0  | 8372  | 0.0020          |
+| 0.0028        | 14.0  | 9016  | 0.0010          |
+| 0.0015        | 15.0  | 9660  | 0.0007          |
+| 0.0027        | 16.0  | 10304 | 0.0015          |
+| 0.002         | 17.0  | 10948 | 0.0007          |
+| 0.0011        | 18.0  | 11592 | 0.0009          |
+| 0.0019        | 19.0  | 12236 | 0.0007          |
+| 0.0003        | 20.0  | 12880 | 0.0006          |
 ### Framework versions
+- Transformers 4.37.2
+- Pytorch 2.1.2+cu121
+- Datasets 2.15.0
+- Tokenizers 0.15.0
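
The hyperparameters on this side of the diff (learning rate 5e-05, linear scheduler, 12354 warmup steps) together with the results table (644 steps per epoch, 20 epochs, 12880 steps in total) imply that roughly 96% of training ran in the warmup phase. The sketch below reproduces the usual shape of a linear schedule with warmup in plain Python. It assumes the standard rule of a linear ramp to the peak rate followed by a linear decay to zero; the function name `linear_warmup_lr` is illustrative, and the diff does not confirm that the training run applied exactly this rule.

```python
# Values taken from the model card; the schedule formula is an assumption.
BASE_LR = 5e-05          # learning_rate
WARMUP_STEPS = 12354     # lr_scheduler_warmup_steps
STEPS_PER_EPOCH = 644    # step count at epoch 1.0 in the results table
NUM_EPOCHS = 20          # num_epochs
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 12880, the final step in the table


def linear_warmup_lr(step: int,
                     base_lr: float = BASE_LR,
                     warmup_steps: int = WARMUP_STEPS,
                     total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate at `step`: linear ramp up, then linear decay to zero."""
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: scale linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print(f"warmup fraction: {WARMUP_STEPS / TOTAL_STEPS:.3f}")
    # Learning rate at the end of each epoch listed in the results table.
    for epoch in (1, 5, 10, 15, 19, 20):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d}  step {step:5d}  lr {linear_warmup_lr(step):.3e}")
```

Under these assumptions the peak rate is reached only at step 12354, during epoch 20, leaving 526 steps for the decay phase.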

generation_config.json CHANGED

@@ -3,5 +3,5 @@
   "bos_token_id": 0,
   "eos_token_id": 2,
   "pad_token_id": 1,
-  "transformers_version": "4.33.2"
 }

   "bos_token_id": 0,
   "eos_token_id": 2,
   "pad_token_id": 1,
+  "transformers_version": "4.37.2"
 }