--- library_name: transformers inference: false license: cc-by-sa-4.0 --- **Swiftstrike Aero Model (Falcon Pruned Model)** This model is a fine-tuned version of the Swiftstrike Aero Model, specifically tailored for context-aware keyword searches related to culture. It's designed to process 1-block contexts, equivalent to approximately 384 tokens or a single paragraph of wikipedia standard(common length paragraph). **Training Data (Part 1 Culture Context Wikipedia)** The model was trained on a multi-stage dataset derived from Wikipedia's culture-related content: 1. **Base Dataset:** - 13,000 rows of capitalized and lowercase words extracted from Wikipedia's culture sentences. 2. **Sentence-Level Dataset:** - 2,300 rows of full sentences from Wikipedia's culture data. 3. **1-Block Context Dataset:** - 500 rows of 1-block contexts (approximately 1 paragraph) from Wikipedia's culture data. **Dataset Organization** The dataset is structured hierarchically, with each level representing an increasing level of complexity: 1. **Part:** Individual components or elements. 2. **Merge Part:** Combination of two or more parts. 3. **Fragment:** Combination of two or more merge parts. 4. **Sub-Unit:** Combination of two or more fragments. 5. **Unit:** Combination of two or more sub-units. 6. **Super-Unit:** Combination of two or more units. 7. **Mega-Unit:** Combination of two or more super-units. **How to Use** ```python import torch from transformers import AutoModelForCausalLM, AutoTokenizer from IPython.display import display, HTML # prompt: load model and generate example model_name = "nqzfaizal77ai/sa-145m-en-wikipedia-culture-part1-1bc" model = AutoModelForCausalLM.from_pretrained(model_name,trust_remote_code=True) tokenizer = AutoTokenizer.from_pretrained(model_name,trust_remote_code=True) torch.manual_seed(3077) # Example usage stochastic decode input_text = "The cultural impact of the internet is" inputs = tokenizer(input_text, return_tensors="pt") output = model.generate(**inputs, do_sample=True, top_k=50, top_p=0.95, repetition_penalty=1.2, max_length=100) def print_with_border(text): """Prints the given text with a border around it.""" display(HTML(f"