fffiloni's picture
Upload 244 files
b3f324b verified

A newer version of the Gradio SDK is available: 4.44.1

Upgrade

Refiner for Video Caption

Transform the short caption annotations from video datasets into the long and detailed caption annotations.

  • Add detailed description for background scene.
  • Add detailed description for object attributes, including color, material, pose.
  • Add detailed description for object-level spatial relationship.

πŸ› οΈ Extra Requirements and Installation

  • openai == 0.28.0
  • jsonlines == 4.0.0
  • nltk == 3.8.1
  • Install the LLaMA-Accessory:

you also need to download the weight of SPHINX to ./ckpt/ folder

πŸ—οΈ Refining

The refining instruction is in demo_for_refiner.py.

python demo_for_refiner.py --root_path $path_to_repo$ --api_key $openai_api_key$

Refining Demos

[original caption]: A red mustang parked in a showroom with american flags hanging from the ceiling.
[refine caption]: This scene depicts a red Mustang parked in a showroom with American flags hanging from the ceiling. The showroom likely serves as a space for showcasing and purchasing cars, and the Mustang is displayed prominently near the flags and ceiling. The scene also features a large window and other objects. Overall, it seems to take place in a car show or dealership.
  • Add GPT-3.5-Turbo for caption summarization. βŒ› [WIP]
  • Add LLAVA-1.6. βŒ› [WIP]
  • More descriptions. βŒ› [WIP]