# Model Card for SVLM

This model is a sequence-to-sequence language model (SVLM) fine-tuned to answer questions drawn from a dataset of ACL research papers. It generates responses to academic research questions, making it useful for research and academic inquiry.

## Model Details

### Model Description

- **Developed by:** @binarybardakshat
- **Model type:** Sequence-to-sequence language model (BART-based)
- **Language(s) (NLP):** English
- **License:** [More Information Needed]
- **Finetuned from model:** facebook/bart-base

### Model Sources

- **Repository:** [More Information Needed]

## Uses

### Direct Use

The model can be used directly to answer questions grounded in research data from ACL papers. It is suited to academic and research purposes.

### Out-of-Scope Use

The model is not expected to work well for general conversation or queries unrelated to research.

## Bias, Risks, and Limitations

The model may carry biases present in its training data, which consists of ACL research papers, and it may not generalize well outside this domain.

### Recommendations

Users should be aware of these biases and verify that outputs meet their academic requirements.

## How to Get Started with the Model

Use the code below to get started with the model (replace the placeholder paths with the location of the fine-tuned checkpoint):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("path_to_your_tokenizer")
model = AutoModelForSeq2SeqLM.from_pretrained("path_to_your_model")

# Ask a research question and decode the generated answer.
inputs = tokenizer("What is the attention mechanism?", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

## Training Details

### Training Data

The model was trained on the ACL dataset, a collection of research papers in computational linguistics.

### Training Procedure

#### Training Hyperparameters

- **Training regime:** fp32
- **Learning rate:** 2e-5
- **Epochs:** 3
- **Batch size:** 8

## Evaluation

### Testing Data

The model was evaluated on a held-out subset of the ACL dataset, focusing on research-related questions.

### Metrics

- **Accuracy**
- **Loss**

### Results

The model performs best on research-related question-answering tasks.
Further evaluation metrics will be added as the model sees wider use.

## Environmental Impact

- **Hardware Type:** GPU (NVIDIA V100)
- **Hours used:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]

## Technical Specifications

### Model Architecture and Objective

The model is based on the BART architecture, an encoder-decoder Transformer designed for sequence-to-sequence tasks such as text summarization and translation.

### Compute Infrastructure

#### Hardware

- **NVIDIA V100 GPU**

#### Software

- **TensorFlow**
- **Transformers**
- **Safetensors**