---
license: llama2
---

# Model Card for FFMPerative-7B

## Model Details

This is a Llama 2 7B large language model (LLM), fine-tuned to automate video production workflows. It is designed to work with FFMPerative, a tool that combines machine learning with the FFmpeg software suite to perform a variety of video editing tasks from natural language input.

### Model Description

- **Developed by:** remyx.ai
- **Model type:** Llama 2 7B
- **License:** Meta Llama 2 license (`llama2`)
- **Finetuned from model:** Llama 2

## Uses

The main use case for this model is to assist with video editing. Users can give FFMPerative natural language commands for tasks such as cropping, resizing, and rotating videos, making GIFs, adjusting audio levels, and many more. The model can be particularly useful for people without technical skills, since it lets them carry out complex video editing tasks in a simplified, user-friendly manner.

This checkpoint was fine-tuned on a subset of `HuggingFaceH4/CodeAlpaca_20K` augmented with 500 instances of FFMPerative tool composition for practical video editing workflows. The training instances are based on various video editing tasks and their corresponding FFMPerative commands, with example questions and answers demonstrating the interaction between a user and the video editing tool. Please refer to the GitHub repository README for more examples of the training data.

## Bias, Risks, and Limitations

This model is designed for English-language input and may not perform well with input in other languages. Although it can interpret and execute a wide range of commands, it may struggle with ambiguous instructions, complex sequences of commands, or tasks that are not covered by its training data.
Please double-check the model's output for critical tasks, and keep in mind that it does not replace professional video editors for more advanced editing workflows.

## How to Get Started with the Model

Use the code below to get started. You can instantiate a local agent and pass it additional tools:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, LocalAgent, load_tool

# Load the fine-tuned checkpoint in 8-bit, with dynamic RoPE scaling.
model = AutoModelForCausalLM.from_pretrained(
    "remyxai/ffmperative",
    device_map="auto",
    torch_dtype=torch.bfloat16,
    rope_scaling={"type": "dynamic", "factor": 2.0},
    load_in_8bit=True,
)
tokenizer = AutoTokenizer.from_pretrained("remyxai/ffmperative")

# More tools in our spaces: https://huggingface.co/remyxai
tools = [
    load_tool("remyxai/video-compression-tool"),
    load_tool("remyxai/video-frame-sample-tool"),
]

# Build a local agent around the model and run a natural language request.
agent = LocalAgent(model, tokenizer, additional_tools=tools)
agent.run("Compress my video '/path/to/vid.mp4' and save it to '/path/to/compressed_vid.mp4'")
```

## Training Details

### Training Data

The training data combines [HuggingFaceH4/CodeAlpaca_20K](https://huggingface.co/datasets/HuggingFaceH4/CodeAlpaca_20K) with our custom-generated data reflecting the tools available in FFMPerative: [remyxai/ffmperative](https://huggingface.co/datasets/remyxai/ffmperative).

### Training Procedure

We fine-tuned Llama 2 with parameter-efficient fine-tuning (PEFT), following this [guide](https://huggingface.co/blog/llama2#fine-tuning-with-peft) and using this [script](https://github.com/lvwerra/trl/blob/main/examples/scripts/sft_trainer.py).

## Evaluation

We evaluated the model by measuring its ability to accurately interpret and execute video editing commands. Due to the proprietary nature of the evaluation process, specific metrics are not available. The model generally performs well, but please report any inconsistencies or errors you encounter when using it. We appreciate your feedback and will use it to improve the model further.
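Because the evaluation metrics are not published, the following is only an illustrative sketch of how a reader might score generated tool calls against reference calls on their own prompts. It is not the evaluation procedure used for this model: the helper names `parse_call` and `exact_match_rate`, the assumed call format, and the tool names in the sample data are all hypothetical.

```python
import ast


def parse_call(source):
    """Parse text such as "tool('a', key=1)" into (name, args, kwargs).

    Returns None if the text is not a single call with literal arguments.
    """
    try:
        tree = ast.parse(source.strip(), mode="eval")
    except SyntaxError:
        return None
    call = tree.body
    if not isinstance(call, ast.Call) or not isinstance(call.func, ast.Name):
        return None
    try:
        args = tuple(ast.literal_eval(arg) for arg in call.args)
        kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
    except (ValueError, TypeError):
        return None
    return call.func.id, args, kwargs


def exact_match_rate(predictions, references):
    """Fraction of predictions whose parsed call equals the reference call."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0
    hits = 0
    for pred, ref in zip(predictions, references):
        parsed_ref = parse_call(ref)
        if parsed_ref is not None and parse_call(pred) == parsed_ref:
            hits += 1
    return hits / len(references)


# Hypothetical tool names and arguments, for illustration only.
references = [
    "video_compression_tool('in.mp4', 'out.mp4')",
    "video_resize_tool('in.mp4', width=640)",
]
predictions = [
    "video_compression_tool( 'in.mp4','out.mp4' )",  # same call, different spacing
    "video_resize_tool('in.mp4', width=320)",  # wrong argument value
]
print(exact_match_rate(predictions, references))
```

Comparing parsed calls rather than raw strings keeps the score insensitive to whitespace and quoting differences, while still counting a wrong tool name or argument as a miss.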
### Model Architecture and Objective

Meta's Llama 2 7B

## Citation

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

## Glossary

[More Information Needed]

## More Information

[More Information Needed]

## Model Card Authors

[More Information Needed]

## Model Card Contact

[More Information Needed]