PawanKrGunjan committed
Commit 4fafe25
1 Parent(s): 21bc1b8

End of training

Files changed (3)
  1. README.md +55 -109
  2. generation_config.json +0 -1
  3. model.safetensors +1 -1
README.md CHANGED
@@ -1,134 +1,80 @@
  ---
  base_model: microsoft/trocr-base-handwritten
  tags:
- - trocr
- - image-to-text
- - license-plate-number
  model-index:
  - name: license_plate_recognizer
- results:
- - task:
- type: image-to-text
- name: License Plate Recognition
- dataset:
- type: custom_dataset
- name: Custom License Plate Dataset
- config: default
- split: test
- revision: main
- metrics:
- - type: cer
- value: 0.0231
- name: Test CER
- config: default
- args:
- max_order: 4
- source:
- name: Hugging Face Model Card
- url: https://huggingface.co/PawanKrGunjan/license_plate_recognizer
- license: mit
- language:
- - en
- metrics:
- - cer
- library_name: transformers
- pipeline_tag: image-to-text
- datasets:
- - charliexu07/license_plates
  ---
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->

- [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/pawankrgunjan/huggingface/runs/v5cu1qdh)
-
-
  # license_plate_recognizer

- This model is a fine-tuned version of [microsoft/trocr-base-handwritten](https://huggingface.co/microsoft/trocr-base-handwritten) specifically tailored for recognizing license plate numbers from images. The fine-tuning process has been optimized to accurately decode alphanumeric characters typically found on license plates.

- ## Model Description

- The base model, `microsoft/trocr-base-handwritten`, is a Transformer-based OCR model designed for recognizing handwritten text. This fine-tuned version is adapted for license plate recognition, enhancing its ability to read and transcribe license plates from various sources, including images captured under different lighting and angles.

- ## Intended Uses & Limitations

- ### Intended Uses
- - **License Plate Recognition:** This model is designed to extract and transcribe alphanumeric characters from images of license plates. It can be used in various applications such as automated toll systems, parking management, and law enforcement.

- ### Limitations
- - **Character Set:** The model is optimized for the specific alphanumeric characters commonly found on license plates. It may not perform well on text outside this domain.
- - **Environmental Factors:** While robust to typical variations in image quality, extreme conditions like very low light, heavy blurring, or unusual angles may reduce accuracy.

- ## Training and Evaluation Data

- The model was fine-tuned on a dataset consisting of license plate images. The dataset includes a diverse set of license plates captured in various environments and lighting conditions, ensuring robustness in real-world applications. However, specific details about the dataset (e.g., size, source) are not provided here.

- ## Training Procedure
-
- ### Training Hyperparameters

  The following hyperparameters were used during training:
- - **learning_rate:** 2e-05
- - **train_batch_size:** 8
- - **eval_batch_size:** 8
- - **seed:** 42
- - **optimizer:** Adam with betas=(0.9, 0.999) and epsilon=1e-08
- - **lr_scheduler_type:** linear
- - **num_epochs:** 7

- ### Training Results

  | Training Loss | Epoch | Step | Validation Loss | Cer |
  |:-------------:|:-----:|:----:|:---------------:|:------:|
- | 0.2605 | 1.0 | 254 | 0.0798 | 0.0253 |
- | 0.138 | 2.0 | 508 | 0.0660 | 0.0177 |
- | 0.0435 | 3.0 | 762 | 0.0645 | 0.0146 |
- | 0.0344 | 4.0 | 1016 | 0.0594 | 0.0173 |
- | 0.011 | 5.0 | 1270 | 0.0626 | 0.0160 |
- | 0.0021 | 6.0 | 1524 | 0.0567 | 0.0120 |
- | 0.0007 | 7.0 | 1778 | 0.0599 | 0.0137 |
-
- ### Final Evaluation Metrics
- - **Loss:** 0.0653
- - **Cer:** 0.0231
-
- Certainly! Here’s the updated "How to Use the Model" section with the correct username:
-
-
- ## How to Use the Model
-
- Here is how you can use this fine-tuned model in PyTorch to recognize license plate numbers:
-
- ```python
- from transformers import TrOCRProcessor, VisionEncoderDecoderModel
- from PIL import Image
- import requests
-
- # Load an image of a license plate
- url = 'https://example.com/path/to/license_plate_image.jpg'
- image = Image.open(requests.get(url, stream=True).raw).convert("RGB")
-
- # Initialize the processor and the fine-tuned model
- processor = TrOCRProcessor.from_pretrained('PawanKrGunjan/license_plate_recognizer')
- model = VisionEncoderDecoderModel.from_pretrained('PawanKrGunjan/license_plate_recognizer')
-
- # Preprocess the image
- pixel_values = processor(images=image, return_tensors="pt").pixel_values
-
- # Generate text (license plate number)
- generated_ids = model.generate(pixel_values)
- generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
-
- print("Recognized License Plate Number:", generated_text)
- ```
-
- In this example:
- 1. Replace the `url` with the actual URL of an image containing a license plate.
- 2. The model and processor are loaded from your fine-tuned model on the Hugging Face Hub (`PawanKrGunjan/license_plate_recognizer`).
-
- ## Framework Versions
-
- - **Transformers:** 4.42.3
- - **Pytorch:** 2.1.2
- - **Datasets:** 2.20.0
- - **Tokenizers:** 0.19.1
 
  ---
  base_model: microsoft/trocr-base-handwritten
  tags:
+ - generated_from_trainer
  model-index:
  - name: license_plate_recognizer
+ results: []
  ---
+
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->

+ [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/pawankrgunjan/huggingface/runs/ajvl0e6b)
  # license_plate_recognizer

+ This model is a fine-tuned version of [microsoft/trocr-base-handwritten](https://huggingface.co/microsoft/trocr-base-handwritten) on an unknown dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.0097
+ - Cer: 0.0036

+ ## Model description

+ More information needed

+ ## Intended uses & limitations

+ More information needed

+ ## Training and evaluation data

+ More information needed

+ ## Training procedure

+ ### Training hyperparameters

  The following hyperparameters were used during training:
+ - learning_rate: 2e-05
+ - train_batch_size: 8
+ - eval_batch_size: 8
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 23

+ ### Training results

  | Training Loss | Epoch | Step | Validation Loss | Cer |
  |:-------------:|:-----:|:----:|:---------------:|:------:|
+ | 0.1485 | 1.0 | 397 | 0.0528 | 0.0182 |
+ | 0.0843 | 2.0 | 794 | 0.0371 | 0.0089 |
+ | 0.0552 | 3.0 | 1191 | 0.0417 | 0.0129 |
+ | 0.0812 | 4.0 | 1588 | 0.0386 | 0.0115 |
+ | 0.0315 | 5.0 | 1985 | 0.0198 | 0.0053 |
+ | 0.0178 | 6.0 | 2382 | 0.0263 | 0.0084 |
+ | 0.0341 | 7.0 | 2779 | 0.0179 | 0.0067 |
+ | 0.0143 | 8.0 | 3176 | 0.0149 | 0.0080 |
+ | 0.0047 | 9.0 | 3573 | 0.0055 | 0.0027 |
+ | 0.0163 | 10.0 | 3970 | 0.0062 | 0.0022 |
+ | 0.0045 | 11.0 | 4367 | 0.0049 | 0.0027 |
+ | 0.0115 | 12.0 | 4764 | 0.0077 | 0.0053 |
+ | 0.0014 | 13.0 | 5161 | 0.0031 | 0.0022 |
+ | 0.0081 | 14.0 | 5558 | 0.0052 | 0.0031 |
+ | 0.0001 | 15.0 | 5955 | 0.0056 | 0.0035 |
+ | 0.0005 | 16.0 | 6352 | 0.0057 | 0.0027 |
+ | 0.0009 | 17.0 | 6749 | 0.0053 | 0.0022 |
+ | 0.0003 | 18.0 | 7146 | 0.0067 | 0.0027 |
+ | 0.0001 | 19.0 | 7543 | 0.0044 | 0.0018 |
+ | 0.0001 | 20.0 | 7940 | 0.0052 | 0.0018 |
+ | 0.0 | 21.0 | 8337 | 0.0050 | 0.0018 |
+ | 0.0 | 22.0 | 8734 | 0.0051 | 0.0018 |
+ | 0.0 | 23.0 | 9131 | 0.0051 | 0.0018 |
+
+
+ ### Framework versions
+
+ - Transformers 4.42.3
+ - Pytorch 2.1.2
+ - Datasets 2.20.0
+ - Tokenizers 0.19.1
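
The `Cer` column in the training results is not defined anywhere in the card. As a rough sketch, assuming it is the usual character error rate (total Levenshtein edit distance divided by total reference characters), it could be computed as below. The function names and the plate strings are hypothetical and are not taken from the author's training code.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions each cost 1."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference character
                            curr[j - 1] + 1,      # insert a hypothesis character
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cer(references: list[str], predictions: list[str]) -> float:
    """Character error rate pooled over all reference/prediction pairs."""
    total_edits = sum(edit_distance(r, p) for r, p in zip(references, predictions))
    total_chars = sum(len(r) for r in references)
    return total_edits / total_chars


# Hypothetical plates: one wrong character out of seven.
print(cer(["ABC1234"], ["ABC1284"]))
```

Under this definition, the final-epoch value of 0.0018 would correspond to roughly one character error per 550 reference characters.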
 
generation_config.json CHANGED
@@ -3,7 +3,6 @@
  "decoder_start_token_id": 0,
  "early_stopping": true,
  "eos_token_id": 2,
- "max_length": 128,
  "num_beams": 3,
  "pad_token_id": 1,
  "transformers_version": "4.42.3",

  "decoder_start_token_id": 0,
  "early_stopping": true,
  "eos_token_id": 2,
  "num_beams": 3,
  "pad_token_id": 1,
  "transformers_version": "4.42.3",
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:b70831db27e76688d7618656a4d9f252cefeef4853b96ca3587219a2a1804f2d
  size 1335747032

  version https://git-lfs.github.com/spec/v1
+ oid sha256:5161a5497ff8a87587c30bf6b64cdb82087efb6b49475f319939794bebd34f00
  size 1335747032
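
The `model.safetensors` entry is a Git LFS pointer: `oid sha256:` is the SHA-256 digest of the real file and `size` is its length in bytes. A minimal sketch of checking a downloaded copy against the pointer follows; the helper name `lfs_pointer_matches` and the local file path are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def lfs_pointer_matches(path: str, expected_oid: str, expected_size: int) -> bool:
    """Return True if the file at `path` has the size and SHA-256 digest from an LFS pointer."""
    file = Path(path)
    # Cheap check first: a size mismatch means the hash cannot match either.
    if file.stat().st_size != expected_size:
        return False
    digest = hashlib.sha256()
    with file.open("rb") as fh:
        # Read in 1 MiB chunks so a ~1.3 GB checkpoint is never fully held in memory.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_oid
```

For the new weights in this commit, the call would be `lfs_pointer_matches("model.safetensors", "5161a5497ff8a87587c30bf6b64cdb82087efb6b49475f319939794bebd34f00", 1335747032)`, assuming the file was downloaded to the current directory.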