Florence-2-large-FormClassification-ft

This model is a fine-tuned version of microsoft/Florence-2-large-ft on the Musa07/Florence_ft dataset. It achieves the following results on the evaluation set:

Loss: 0.2107

Inference Code

  # Code
    from transformers import AutoProcessor, AutoModelForCausalLM
    from PIL import Image  # required by Image.open() further down
    import matplotlib.pyplot as plt
    import matplotlib.patches as patches
  
    model = AutoModelForCausalLM.from_pretrained("Musa07/Florence-2-large-FormClassification-ft", trust_remote_code=True, device_map='cuda') # Load the model onto the GPU (a CUDA device is required)
    processor = AutoProcessor.from_pretrained("Musa07/Florence-2-large-FormClassification-ft", trust_remote_code=True)
  
    def run_example(task_prompt, image, max_new_tokens=128):
        # Tokenize the task prompt and preprocess the image
        inputs = processor(text=task_prompt, images=image, return_tensors="pt")
        # Deterministic beam search with 3 beams
        generated_ids = model.generate(
            input_ids=inputs["input_ids"].cuda(),
            pixel_values=inputs["pixel_values"].cuda(),
            max_new_tokens=max_new_tokens,
            early_stopping=False,
            do_sample=False,
            num_beams=3,
        )
        # Decode without stripping special tokens, then parse into the task's output format
        generated_text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
        parsed_answer = processor.post_process_generation(
            generated_text,
            task=task_prompt,
            image_size=(image.width, image.height)
        )
        return parsed_answer
      
    def plot_bbox(image, data):
        fig, ax = plt.subplots()

        # Display the image
        ax.imshow(image)

        # Plot each bounding box
        for bbox, label in zip(data['bboxes'], data['labels']):
            # Unpack the bounding box coordinates
            x1, y1, x2, y2 = bbox
            # Create a Rectangle patch
            rect = patches.Rectangle((x1, y1), x2-x1, y2-y1, linewidth=1, edgecolor='r', facecolor='none')
            # Add the rectangle to the Axes
            ax.add_patch(rect)
            # Annotate the label
            ax.text(x1, y1, label, color='white', fontsize=8, bbox=dict(facecolor='red', alpha=0.5))

        # Remove the axis ticks and labels
        ax.axis('off')

        # Show the plot
        plt.show()
      
    image = Image.open('1.jpeg')
    parsed_answer = run_example("<OD>", image=image)
    print(parsed_answer)
    plot_bbox(image, parsed_answer["<OD>"])
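
If the detections are needed as data rather than as a plot, the same `bboxes`/`labels` structure that `plot_bbox` consumes can be flattened into one record per box. The following is a minimal sketch, not part of the original inference code: the helper name `detections_to_records` and the sample values are hypothetical, and it assumes each box is given as `[x1, y1, x2, y2]`, as in `plot_bbox` above.

```python
def detections_to_records(od_result):
    """Flatten an '<OD>'-style result into one dict per detected box.

    Assumes od_result has parallel 'bboxes' ([x1, y1, x2, y2]) and 'labels' lists,
    which is the shape plot_bbox() iterates over.
    """
    records = []
    for (x1, y1, x2, y2), label in zip(od_result["bboxes"], od_result["labels"]):
        records.append({
            "label": label,
            "x": x1,
            "y": y1,
            "width": x2 - x1,
            "height": y2 - y1,
        })
    return records


# Hypothetical input shaped like parsed_answer["<OD>"]; the values are made up.
sample = {"bboxes": [[10.0, 20.0, 110.0, 70.0]], "labels": ["example_label"]}
for record in detections_to_records(sample):
    print(record)
```

With real output, the call would be `detections_to_records(parsed_answer["<OD>"])`.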

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-06
train_batch_size: 24
eval_batch_size: 24
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 10
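
To make the `linear` scheduler setting concrete, the sketch below computes the learning rate such a schedule would produce at a given optimizer step. It is an illustration under stated assumptions, not the training script: the function name `linear_lr` is hypothetical, the total of 230 steps is taken from the last row of the training results table, and zero warmup steps is an assumption because the card does not report a warmup setting.

```python
def linear_lr(step, base_lr=1e-6, total_steps=230, warmup_steps=0):
    """Learning rate at `step` for a linear warmup followed by linear decay to zero.

    base_lr comes from the hyperparameter list; total_steps=230 matches the final
    step in the results table; warmup_steps=0 is an assumption.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr at the end of warmup down to 0 at total_steps
    remaining = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return base_lr * remaining


# Learning rate at the end of each epoch (23 steps per epoch in the results table)
for epoch in range(1, 11):
    step = epoch * 23
    print(f"epoch {epoch:2d}  step {step:3d}  lr {linear_lr(step):.2e}")
```

Under these assumptions the rate falls from 1e-06 at step 0 to half that value at step 115 and reaches zero at step 230.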

Training results

Training Loss	Epoch	Step	Validation Loss
0.0188	1.0	23	0.2151
0.0127	2.0	46	0.2113
0.0078	3.0	69	0.2061
0.0047	4.0	92	0.2102
0.0042	5.0	115	0.2078
0.003	6.0	138	0.2108
0.0022	7.0	161	0.2110
0.0029	8.0	184	0.2117
0.0019	9.0	207	0.2114
0.0023	10.0	230	0.2107
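
The table can also be scanned programmatically to find the epoch with the lowest validation loss. The sketch below copies the validation-loss column from the table; the variable names are illustrative only.

```python
# (epoch, validation loss) pairs copied from the training results table
validation_loss = [
    (1, 0.2151), (2, 0.2113), (3, 0.2061), (4, 0.2102), (5, 0.2078),
    (6, 0.2108), (7, 0.2110), (8, 0.2117), (9, 0.2114), (10, 0.2107),
]

# Pick the row with the smallest validation loss
best_epoch, best_loss = min(validation_loss, key=lambda row: row[1])
final_epoch, final_loss = validation_loss[-1]

print(f"best:  epoch {best_epoch}, validation loss {best_loss:.4f}")
print(f"final: epoch {final_epoch}, validation loss {final_loss:.4f}")
print(f"gap:   {final_loss - best_loss:.4f}")
```

The lowest validation loss is 0.2061 at epoch 3. The evaluation loss of 0.2107 reported at the top of this card corresponds to the final epoch.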

Framework versions

Transformers 4.44.0.dev0
Pytorch 2.3.1+cu121
Datasets 2.20.0
Tokenizers 0.19.1
