Alphabet-Sign-Language-Detection

deekshitha11

prithivMLmods committed on 16 days ago

Commit

caebd94

0 Parent(s):

Duplicate from prithivMLmods/Alphabet-Sign-Language-Detection

Co-authored-by: Prithiv Sakthi <prithivMLmods@users.noreply.huggingface.co>

Files changed (14)

.gitattributes +35 -0
README.md +142 -0
checkpoint-3806/config.json +94 -0
checkpoint-3806/model.safetensors +3 -0
checkpoint-3806/optimizer.pt +3 -0
checkpoint-3806/preprocessor_config.json +24 -0
checkpoint-3806/rng_state.pth +3 -0
checkpoint-3806/scheduler.pt +3 -0
checkpoint-3806/trainer_state.json +93 -0
checkpoint-3806/training_args.bin +3 -0
config.json +94 -0
model.safetensors +3 -0
preprocessor_config.json +24 -0
training_args.bin +3 -0

.gitattributes ADDED

	@@ -0,0 +1,35 @@

+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
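
The patterns above decide which files are stored through Git LFS rather than as regular Git blobs. The following is a rough sketch of that decision for a few files from this commit. It assumes that matching the basename with Python's `fnmatch` is a good enough approximation; real `.gitattributes` matching has additional rules (for example path patterns such as `saved_model/**/*`), and the helper name `tracked_by_lfs` is hypothetical.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# A few of the basename patterns from the .gitattributes hunk above.
LFS_PATTERNS = ["*.bin", "*.pt", "*.pth", "*.safetensors"]

def tracked_by_lfs(path):
    """Approximate check: does the file's basename match any LFS pattern?"""
    name = PurePosixPath(path).name
    return any(fnmatch(name, pattern) for pattern in LFS_PATTERNS)

# File paths taken from the "Files changed" list of this commit.
files = [
    "checkpoint-3806/model.safetensors",
    "checkpoint-3806/optimizer.pt",
    "checkpoint-3806/trainer_state.json",
    "config.json",
    "training_args.bin",
]
lfs_files = [f for f in files if tracked_by_lfs(f)]
print(lfs_files)
```

The files selected this way are the ones that appear in the diff as three-line pointer entries (`+3 -0`), while the JSON files are shown in full.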

README.md ADDED

	@@ -0,0 +1,142 @@

+---
+license: apache-2.0
+language:
+- en
+base_model:
+- google/siglip2-base-patch16-224
+pipeline_tag: image-classification
+library_name: transformers
+tags:
+- sign-language-detection
+- alphabet
+---
+![dzfgdf.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/gFcXjzt_OA-46WpFfz-9L.png)
+# **Alphabet-Sign-Language-Detection**
+> **Alphabet-Sign-Language-Detection** is an image classification model fine-tuned from the vision-language encoder **google/siglip2-base-patch16-224** for single-label classification. It uses the **SiglipForImageClassification** architecture to classify images of hand signs into the 26 **sign language alphabet** categories, A to Z.
+```text
+Classification Report:
+              precision    recall  f1-score   support
+           A     0.9995    1.0000    0.9998      4384
+           B     1.0000    1.0000    1.0000      4441
+           C     1.0000    1.0000    1.0000      3993
+           D     1.0000    0.9998    0.9999      4940
+           E     1.0000    1.0000    1.0000      4658
+           F     1.0000    1.0000    1.0000      5750
+           G     0.9992    0.9996    0.9994      4978
+           H     1.0000    0.9979    0.9990      4807
+           I     0.9992    1.0000    0.9996      4856
+           J     1.0000    0.9996    0.9998      5227
+           K     0.9972    1.0000    0.9986      5426
+           L     1.0000    0.9998    0.9999      5089
+           M     1.0000    0.9964    0.9982      3328
+           N     0.9955    1.0000    0.9977      2635
+           O     0.9998    1.0000    0.9999      4564
+           P     1.0000    0.9993    0.9996      4100
+           Q     1.0000    1.0000    1.0000      4187
+           R     0.9998    0.9984    0.9991      5122
+           S     0.9998    0.9998    0.9998      5147
+           T     1.0000    1.0000    1.0000      4722
+           U     0.9984    0.9998    0.9991      5041
+           V     1.0000    0.9984    0.9992      5116
+           W     0.9998    1.0000    0.9999      4926
+           X     1.0000    0.9995    0.9998      4387
+           Y     1.0000    1.0000    1.0000      5185
+           Z     0.9996    1.0000    0.9998      4760
+    accuracy                         0.9996    121769
+   macro avg     0.9995    0.9996    0.9995    121769
+weighted avg     0.9996    0.9996    0.9996    121769
+```
+![demo.png](https://cdn-uploads.huggingface.co/production/uploads/65bb837dbfb878f46c77de4c/AVpi4xPsVq6PV9NzonHoi.png)
+The model categorizes images into the following 26 classes:
+- **Class 0:** "A"
+- **Class 1:** "B"
+- **Class 2:** "C"
+- **Class 3:** "D"
+- **Class 4:** "E"
+- **Class 5:** "F"
+- **Class 6:** "G"
+- **Class 7:** "H"
+- **Class 8:** "I"
+- **Class 9:** "J"
+- **Class 10:** "K"
+- **Class 11:** "L"
+- **Class 12:** "M"
+- **Class 13:** "N"
+- **Class 14:** "O"
+- **Class 15:** "P"
+- **Class 16:** "Q"
+- **Class 17:** "R"
+- **Class 18:** "S"
+- **Class 19:** "T"
+- **Class 20:** "U"
+- **Class 21:** "V"
+- **Class 22:** "W"
+- **Class 23:** "X"
+- **Class 24:** "Y"
+- **Class 25:** "Z"
+# **Run with Transformers🤗**
+```python
+!pip install -q transformers torch pillow gradio
+```
+```python
+import gradio as gr
+from transformers import AutoImageProcessor
+from transformers import SiglipForImageClassification
+from transformers.image_utils import load_image
+from PIL import Image
+import torch
+# Load model and processor
+model_name = "prithivMLmods/Alphabet-Sign-Language-Detection"
+model = SiglipForImageClassification.from_pretrained(model_name)
+processor = AutoImageProcessor.from_pretrained(model_name)
+def sign_language_classification(image):
+    """Predicts sign language alphabet category for an image."""
+    image = Image.fromarray(image).convert("RGB")
+    inputs = processor(images=image, return_tensors="pt")
+    with torch.no_grad():
+        outputs = model(**inputs)
+        logits = outputs.logits
+        probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()
+    labels = {
+        "0": "A", "1": "B", "2": "C", "3": "D", "4": "E", "5": "F", "6": "G", "7": "H", "8": "I", "9": "J",
+        "10": "K", "11": "L", "12": "M", "13": "N", "14": "O", "15": "P", "16": "Q", "17": "R", "18": "S", "19": "T",
+        "20": "U", "21": "V", "22": "W", "23": "X", "24": "Y", "25": "Z"
+    }
+    predictions = {labels[str(i)]: round(probs[i], 3) for i in range(len(probs))}
+    return predictions
+# Create Gradio interface
+iface = gr.Interface(
+    fn=sign_language_classification,
+    inputs=gr.Image(type="numpy"),
+    outputs=gr.Label(label="Prediction Scores"),
+    title="Alphabet Sign Language Detection",
+    description="Upload an image to classify it into one of the 26 sign language alphabet categories."
+)
+# Launch the app
+if __name__ == "__main__":
+    iface.launch()
+```
+# **Intended Use:**
+The **Alphabet-Sign-Language-Detection** model is designed for sign language image classification. It helps categorize images of hand signs into predefined alphabet categories. Potential use cases include:
+- **Sign Language Education:** Assisting learners in recognizing and practicing sign language alphabets.
+- **Accessibility Enhancement:** Supporting applications that improve communication for deaf and hard-of-hearing users.
+- **AI Research:** Advancing computer vision models in sign language recognition.
+- **Gesture Recognition Systems:** Enabling interactive applications with real-time sign language detection.
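
The Gradio example in the README turns the model's logits into per-letter scores with a softmax and a label dictionary. Below is a minimal, dependency-free sketch of that post-processing step only. It assumes the class order A to Z from `id2label` in `config.json`; the helper names `softmax` and `top_k_letters` and the example logits are hypothetical and do not come from the repository.

```python
import math
import string

# Class index -> letter, mirroring id2label in config.json (0 -> "A", ..., 25 -> "Z").
ID2LABEL = dict(enumerate(string.ascii_uppercase))

def softmax(logits):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_letters(logits, k=3):
    """Return the k most probable (letter, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return [(ID2LABEL[i], p) for i, p in ranked[:k]]

# Made-up logits for illustration: class 2 ("C") is largest, class 14 ("O") second.
example_logits = [0.0] * 26
example_logits[2] = 5.0
example_logits[14] = 2.0
best = top_k_letters(example_logits, k=2)
print(best[0][0], best[1][0])
```

Returning only the top few letters, instead of all 26 scores as the README demo does, is a design choice that may suit real-time use; either form is derived from the same softmax output.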

checkpoint-3806/config.json ADDED

	@@ -0,0 +1,94 @@

+{
+  "architectures": [
+    "SiglipForImageClassification"
+  ],
+  "id2label": {
+    "0": "A",
+    "1": "B",
+    "2": "C",
+    "3": "D",
+    "4": "E",
+    "5": "F",
+    "6": "G",
+    "7": "H",
+    "8": "I",
+    "9": "J",
+    "10": "K",
+    "11": "L",
+    "12": "M",
+    "13": "N",
+    "14": "O",
+    "15": "P",
+    "16": "Q",
+    "17": "R",
+    "18": "S",
+    "19": "T",
+    "20": "U",
+    "21": "V",
+    "22": "W",
+    "23": "X",
+    "24": "Y",
+    "25": "Z"
+  },
+  "initializer_factor": 1.0,
+  "label2id": {
+    "A": 0,
+    "B": 1,
+    "C": 2,
+    "D": 3,
+    "E": 4,
+    "F": 5,
+    "G": 6,
+    "H": 7,
+    "I": 8,
+    "J": 9,
+    "K": 10,
+    "L": 11,
+    "M": 12,
+    "N": 13,
+    "O": 14,
+    "P": 15,
+    "Q": 16,
+    "R": 17,
+    "S": 18,
+    "T": 19,
+    "U": 20,
+    "V": 21,
+    "W": 22,
+    "X": 23,
+    "Y": 24,
+    "Z": 25
+  },
+  "model_type": "siglip",
+  "problem_type": "single_label_classification",
+  "text_config": {
+    "attention_dropout": 0.0,
+    "hidden_act": "gelu_pytorch_tanh",
+    "hidden_size": 768,
+    "intermediate_size": 3072,
+    "layer_norm_eps": 1e-06,
+    "max_position_embeddings": 64,
+    "model_type": "siglip_text_model",
+    "num_attention_heads": 12,
+    "num_hidden_layers": 12,
+    "projection_size": 768,
+    "torch_dtype": "float32",
+    "vocab_size": 256000
+  },
+  "torch_dtype": "float32",
+  "transformers_version": "4.51.0.dev0",
+  "vision_config": {
+    "attention_dropout": 0.0,
+    "hidden_act": "gelu_pytorch_tanh",
+    "hidden_size": 768,
+    "image_size": 224,
+    "intermediate_size": 3072,
+    "layer_norm_eps": 1e-06,
+    "model_type": "siglip_vision_model",
+    "num_attention_heads": 12,
+    "num_channels": 3,
+    "num_hidden_layers": 12,
+    "patch_size": 16,
+    "torch_dtype": "float32"
+  }
+}
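
The `id2label` keys in this config are JSON strings (`"0"`, `"1"`, ...), while `label2id` values are integers. The sketch below shows one way to read such a file with the standard library and check that the two maps agree. It uses an abbreviated three-class copy of the mappings as a stand-in for the full file, and the variable names are illustrative.

```python
import json

# Abbreviated copy of the id2label / label2id entries (first three classes only).
config_text = """
{
  "id2label": {"0": "A", "1": "B", "2": "C"},
  "label2id": {"A": 0, "B": 1, "C": 2}
}
"""
config = json.loads(config_text)

# JSON object keys are always strings, so convert them to ints
# before indexing by a predicted class id.
id2label = {int(k): v for k, v in config["id2label"].items()}
label2id = config["label2id"]

# The two maps should be exact inverses of each other.
is_inverse = all(label2id[label] == idx for idx, label in id2label.items())
print(id2label[2], is_inverse)
```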

checkpoint-3806/model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f4587facaf0b143a91070961f2c25fa4c99109cff99f178212e4408afe6c98be
+size 371641824
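
The three lines above are a Git LFS pointer: the repository stores only the spec version, a SHA-256 object id, and the byte size, while the weights themselves live in LFS storage. A small sketch of reading such a pointer follows, with a back-of-the-envelope parameter estimate that assumes all weights are `float32` (as `torch_dtype` in `config.json` suggests). The function name `parse_lfs_pointer` is hypothetical.

```python
def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }

pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:f4587facaf0b143a91070961f2c25fa4c99109cff99f178212e4408afe6c98be
size 371641824
"""
info = parse_lfs_pointer(pointer)

# Rough upper bound on the parameter count: float32 weights take 4 bytes each.
# The safetensors header also occupies some bytes, so this slightly overestimates.
approx_params_millions = info["size"] / 4 / 1e6
print(info["hash_algo"], info["size"], round(approx_params_millions, 1))
```

The top-level `model.safetensors` entry later in this commit carries the same object id and size, so both paths point at the same stored weights.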

checkpoint-3806/optimizer.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:8c9fb67aa9ac490a53667f6057ab76e9ef0d2e0767932828ec725edb241b87bf
+size 686703354

checkpoint-3806/preprocessor_config.json ADDED

	@@ -0,0 +1,24 @@

+{
+  "do_convert_rgb": null,
+  "do_normalize": true,
+  "do_rescale": true,
+  "do_resize": true,
+  "image_mean": [
+    0.5,
+    0.5,
+    0.5
+  ],
+  "image_processor_type": "SiglipImageProcessor",
+  "image_std": [
+    0.5,
+    0.5,
+    0.5
+  ],
+  "processor_class": "SiglipProcessor",
+  "resample": 2,
+  "rescale_factor": 0.00392156862745098,
+  "size": {
+    "height": 224,
+    "width": 224
+  }
+}
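
With `do_rescale` and `do_normalize` both enabled, each 8-bit channel value is first multiplied by `rescale_factor` (1/255) and then shifted and scaled by `image_mean` and `image_std`. The arithmetic can be sketched on single values as follows; resizing to 224×224 is left out, and the function name `preprocess_pixel` is hypothetical.

```python
RESCALE_FACTOR = 0.00392156862745098  # "rescale_factor", i.e. 1/255
IMAGE_MEAN = 0.5                      # "image_mean" (same for all three channels)
IMAGE_STD = 0.5                       # "image_std" (same for all three channels)

def preprocess_pixel(value):
    """Map one 8-bit channel value (0..255) to the normalized input range."""
    rescaled = value * RESCALE_FACTOR           # do_rescale: 0..255 -> 0..1
    return (rescaled - IMAGE_MEAN) / IMAGE_STD  # do_normalize: 0..1 -> -1..1

low, mid, high = (preprocess_pixel(v) for v in (0, 127.5, 255))
print(low, mid, high)
```

So with these settings the model sees inputs in roughly the range -1 to 1, with mid-gray near 0.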

checkpoint-3806/rng_state.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ab2fc19926ae2fb56788f4cb06febb63c3388b74df520678d8fd466f8a18ab49
+size 14244

checkpoint-3806/scheduler.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9b560ef76704b06b8ae457fb1452fa10b2f86ddf16745b55b61b88edb8899516
+size 1064

checkpoint-3806/trainer_state.json ADDED

	@@ -0,0 +1,93 @@

+{
+  "best_global_step": 3806,
+  "best_metric": 0.0015409403713420033,
+  "best_model_checkpoint": "Alphabetical-Sign-Language-Detection/checkpoint-3806",
+  "epoch": 1.0,
+  "eval_steps": 500,
+  "global_step": 3806,
+  "is_hyper_param_search": false,
+  "is_local_process_zero": true,
+  "is_world_process_zero": true,
+  "log_history": [
+    {
+      "epoch": 0.13137151865475566,
+      "grad_norm": 47.51639175415039,
+      "learning_rate": 1.7603833865814697e-06,
+      "loss": 1.4084,
+      "step": 500
+    },
+    {
+      "epoch": 0.2627430373095113,
+      "grad_norm": 36.14832305908203,
+      "learning_rate": 1.494142705005325e-06,
+      "loss": 0.0758,
+      "step": 1000
+    },
+    {
+      "epoch": 0.3941145559642669,
+      "grad_norm": 0.47897106409072876,
+      "learning_rate": 1.22790202342918e-06,
+      "loss": 0.0253,
+      "step": 1500
+    },
+    {
+      "epoch": 0.5254860746190226,
+      "grad_norm": 0.23500379920005798,
+      "learning_rate": 9.616613418530351e-07,
+      "loss": 0.0174,
+      "step": 2000
+    },
+    {
+      "epoch": 0.6568575932737782,
+      "grad_norm": 14.302626609802246,
+      "learning_rate": 6.954206602768902e-07,
+      "loss": 0.0099,
+      "step": 2500
+    },
+    {
+      "epoch": 0.7882291119285338,
+      "grad_norm": 0.023673132061958313,
+      "learning_rate": 4.2917997870074544e-07,
+      "loss": 0.0053,
+      "step": 3000
+    },
+    {
+      "epoch": 0.9196006305832896,
+      "grad_norm": 0.13073758780956268,
+      "learning_rate": 1.6293929712460063e-07,
+      "loss": 0.0038,
+      "step": 3500
+    },
+    {
+      "epoch": 1.0,
+      "eval_accuracy": 0.9995811741904754,
+      "eval_loss": 0.0015409403713420033,
+      "eval_model_preparation_time": 0.0024,
+      "eval_runtime": 1508.1295,
+      "eval_samples_per_second": 80.742,
+      "eval_steps_per_second": 10.093,
+      "step": 3806
+    }
+  ],
+  "logging_steps": 500,
+  "max_steps": 3806,
+  "num_input_tokens_seen": 0,
+  "num_train_epochs": 1,
+  "save_steps": 500,
+  "stateful_callbacks": {
+    "TrainerControl": {
+      "args": {
+        "should_epoch_stop": false,
+        "should_evaluate": false,
+        "should_log": false,
+        "should_save": true,
+        "should_training_stop": true
+      },
+      "attributes": {}
+    }
+  },
+  "total_flos": 1.0200852722126868e+19,
+  "train_batch_size": 32,
+  "trial_name": null,
+  "trial_params": null
+}
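
The `learning_rate` values in `log_history` fall by the same amount every 500 steps, which points to a linear decay. As a worked check, the sketch below compares the logged values against a linear schedule with a peak of 2e-6 and 50 warmup steps. Both numbers are assumptions inferred from the logged values, not settings recorded in this file (the actual arguments are in the binary `training_args.bin`), and `linear_schedule_lr` is a hypothetical helper.

```python
MAX_STEPS = 3806        # "max_steps" in trainer_state.json
ASSUMED_PEAK_LR = 2e-6  # assumption: not recorded in trainer_state.json
ASSUMED_WARMUP = 50     # assumption: not recorded in trainer_state.json

def linear_schedule_lr(step):
    """Linear warmup to the peak, then linear decay to zero at MAX_STEPS."""
    if step < ASSUMED_WARMUP:
        return ASSUMED_PEAK_LR * step / ASSUMED_WARMUP
    return ASSUMED_PEAK_LR * (MAX_STEPS - step) / (MAX_STEPS - ASSUMED_WARMUP)

# (step, learning_rate) pairs copied from log_history.
logged = [
    (500, 1.7603833865814697e-06),
    (1000, 1.494142705005325e-06),
    (3500, 1.6293929712460063e-07),
]
max_rel_error = max(abs(linear_schedule_lr(s) - lr) / lr for s, lr in logged)
print(f"largest relative error: {max_rel_error:.2e}")
```

A close match only shows that the log is consistent with such a schedule; other combinations of peak rate and warmup length that give the same slope and endpoint would fit equally well.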

checkpoint-3806/training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:20111e84a551ea8aa40b7c13479cf0a63c273323121e66cd5b3dd0143f930ed7
+size 5304

config.json ADDED

	@@ -0,0 +1,94 @@

+{
+  "architectures": [
+    "SiglipForImageClassification"
+  ],
+  "id2label": {
+    "0": "A",
+    "1": "B",
+    "2": "C",
+    "3": "D",
+    "4": "E",
+    "5": "F",
+    "6": "G",
+    "7": "H",
+    "8": "I",
+    "9": "J",
+    "10": "K",
+    "11": "L",
+    "12": "M",
+    "13": "N",
+    "14": "O",
+    "15": "P",
+    "16": "Q",
+    "17": "R",
+    "18": "S",
+    "19": "T",
+    "20": "U",
+    "21": "V",
+    "22": "W",
+    "23": "X",
+    "24": "Y",
+    "25": "Z"
+  },
+  "initializer_factor": 1.0,
+  "label2id": {
+    "A": 0,
+    "B": 1,
+    "C": 2,
+    "D": 3,
+    "E": 4,
+    "F": 5,
+    "G": 6,
+    "H": 7,
+    "I": 8,
+    "J": 9,
+    "K": 10,
+    "L": 11,
+    "M": 12,
+    "N": 13,
+    "O": 14,
+    "P": 15,
+    "Q": 16,
+    "R": 17,
+    "S": 18,
+    "T": 19,
+    "U": 20,
+    "V": 21,
+    "W": 22,
+    "X": 23,
+    "Y": 24,
+    "Z": 25
+  },
+  "model_type": "siglip",
+  "problem_type": "single_label_classification",
+  "text_config": {
+    "attention_dropout": 0.0,
+    "hidden_act": "gelu_pytorch_tanh",
+    "hidden_size": 768,
+    "intermediate_size": 3072,
+    "layer_norm_eps": 1e-06,
+    "max_position_embeddings": 64,
+    "model_type": "siglip_text_model",
+    "num_attention_heads": 12,
+    "num_hidden_layers": 12,
+    "projection_size": 768,
+    "torch_dtype": "float32",
+    "vocab_size": 256000
+  },
+  "torch_dtype": "float32",
+  "transformers_version": "4.51.0.dev0",
+  "vision_config": {
+    "attention_dropout": 0.0,
+    "hidden_act": "gelu_pytorch_tanh",
+    "hidden_size": 768,
+    "image_size": 224,
+    "intermediate_size": 3072,
+    "layer_norm_eps": 1e-06,
+    "model_type": "siglip_vision_model",
+    "num_attention_heads": 12,
+    "num_channels": 3,
+    "num_hidden_layers": 12,
+    "patch_size": 16,
+    "torch_dtype": "float32"
+  }
+}

model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f4587facaf0b143a91070961f2c25fa4c99109cff99f178212e4408afe6c98be
+size 371641824

preprocessor_config.json ADDED

	@@ -0,0 +1,24 @@

+{
+  "do_convert_rgb": null,
+  "do_normalize": true,
+  "do_rescale": true,
+  "do_resize": true,
+  "image_mean": [
+    0.5,
+    0.5,
+    0.5
+  ],
+  "image_processor_type": "SiglipImageProcessor",
+  "image_std": [
+    0.5,
+    0.5,
+    0.5
+  ],
+  "processor_class": "SiglipProcessor",
+  "resample": 2,
+  "rescale_factor": 0.00392156862745098,
+  "size": {
+    "height": 224,
+    "width": 224
+  }
+}

training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:20111e84a551ea8aa40b7c13479cf0a63c273323121e66cd5b3dd0143f930ed7
+size 5304