NamCyan committed on
Commit 5d56c2d
1 Parent(s): e9a7954

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -102,17 +102,17 @@ Using model with Jax and Pytorch
  ```python
  from transformers import AutoTokenizer, AutoModelForSequenceClassification, FlaxAutoModelForSequenceClassification
 
- #Load jax model
  model = FlaxAutoModelForSequenceClassification.from_pretrained("Fsoft-AIC/Codebert-docstring-inconsistency")
 
- #Load torch model
  model = AutoModelForSequenceClassification.from_pretrained("Fsoft-AIC/Codebert-docstring-inconsistency")
  ```
 
  ## Limitations
- This model is trained on a subset of 5M data in The Vault in the self-supervised manner. Since the negative samples are generated artificially, the model's ability to identify instances that require a strong semantic understanding between the code and the docstring might be restricted.
 
- It is hard to evaluate the model due to the unavailable labeled datasets. ChatGPT is adopted as a reference to measure the correlation between the model and ChatGPT's scores. However, the result could be influenced by ChatGPT's potential biases and ambiguous conditions. Therefore, we recommend having human labeling dataset and finetune this model to achieve the best result.
 
  ## Additional information
  ### Licensing Information
 
  ```python
  from transformers import AutoTokenizer, AutoModelForSequenceClassification, FlaxAutoModelForSequenceClassification
 
+ #Load model with jax
  model = FlaxAutoModelForSequenceClassification.from_pretrained("Fsoft-AIC/Codebert-docstring-inconsistency")
 
+ #Load model with torch
  model = AutoModelForSequenceClassification.from_pretrained("Fsoft-AIC/Codebert-docstring-inconsistency")
  ```
 
  ## Limitations
+ This model is trained on 5M subset of The Vault in a self-supervised manner. Since the negative samples are generated artificially, the model's ability to identify instances that require a strong semantic understanding between the code and the docstring might be restricted.
 
+ It is hard to evaluate the model due to the unavailable labeled datasets. ChatGPT is adopted as a reference to measure the correlation between the model and ChatGPT's scores. However, the result could be influenced by ChatGPT's potential biases and ambiguous conditions. Therefore, we recommend having human labeling dataset and fine-tune this model to achieve the best result.
 
  ## Additional information
  ### Licensing Information