Model Card for Llama2_Finetuned_SEO_Instruction_Set

Attempts to extract page metadata: keywords, description, and header counts.

Model Details

Model Description

  • Developed by: Israel N.
  • Model type: Llama-2-7B
  • Language(s) (NLP): English
  • License: Apache-2.0
  • Finetuned from model: TinyPixel/Llama-2-7B-bf16-sharded

Uses

Direct Use

Expediting offline SEO analysis

Bias, Risks, and Limitations

The model currently does not respond to site content or metadata as intended; a more refined dataset may be needed for it to work.

How to Get Started with the Model

!pip install -q -U trl transformers accelerate git+https://github.com/huggingface/peft.git
!pip install -q datasets bitsandbytes einops

Import AutoModelForCausalLM and use AutoModelForCausalLM.from_pretrained to load the model from "israelNwokedi/Llama2_Finetuned_SEO_Instruction_Set", as sketched below.
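
A minimal loading sketch, assuming the repository can be loaded directly with transformers (if it only hosts a PEFT/LoRA adapter, load the base model first and attach the adapter with peft). The prompt shown is hypothetical; the exact instruction format depends on the training set.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "israelNwokedi/Llama2_Finetuned_SEO_Instruction_Set"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Hypothetical prompt; adjust to the instruction format used during fine-tuning.
prompt = "Extract the keywords, description and header counts from the following page:\n<page text>"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))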

Training Details

Training Data

  • Prompts: entire sites and backlinks scraped from the web.
  • Outputs: keywords, description, and header counts (h1–h6).

These are the main components of the dataset. Additional samples use ChatGPT-generated metadata as prompts together with the corresponding outputs.
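
The exact record schema is not published in this card; the following is a hypothetical sketch of a single prompt/output pair in the format described above (field names and values are assumptions).

# Hypothetical illustration of one training sample; not the published schema.
sample = {
    "prompt": "https://example.com\n<scraped page text and backlinks>",
    "output": (
        "Keywords: example, demo, placeholder\n"
        "Description: A short meta description of the page.\n"
        "Header counts: h1=1, h2=4, h3=7, h4=0, h5=0, h6=0"
    ),
}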

Training Procedure

Fine-tuning of the pre-trained "TinyPixel/Llama-2-7B-bf16-sharded" Hugging Face model using LoRA and QLoRA.
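
A minimal sketch of the LoRA setup with peft; the hyperparameters below (r, lora_alpha, dropout, target modules) are assumptions, not the values actually used for this model.

from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained(
    "TinyPixel/Llama-2-7B-bf16-sharded", device_map="auto"
)

lora_config = LoraConfig(
    r=16,                                 # assumed rank
    lora_alpha=32,                        # assumed scaling factor
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "v_proj"],  # assumed attention projections
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapters are trainable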

Preprocessing

Used Transformers' BitsAndBytesConfig for lightweight (4-bit) model training and the "TinyPixel/Llama-2-7B-bf16-sharded" tokenizer for encoding/decoding.
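
A sketch of the 4-bit quantized loading used for lightweight (QLoRA-style) training; the specific quantization settings below are assumptions.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # assumed quantization type
    bnb_4bit_compute_dtype=torch.float16,  # assumed compute dtype
)

base_id = "TinyPixel/Llama-2-7B-bf16-sharded"
tokenizer = AutoTokenizer.from_pretrained(base_id)  # used for encoding/decoding
model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)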

Training Hyperparameters

  • Training regime: 4-bit precision

Testing Data, Factors & Metrics

Testing Data

Sampled from training data.

Metrics

Not yet computed.

Results

An initial test attempted to reconstruct additional artificial metadata as part of its text generation; however, this was not the intended use case.

Environmental Impact

  • Hardware Type: Tesla T4
  • Hours used: 0.5
  • Cloud Provider: Google Colaboratory
  • Compute Region: Europe
  • Carbon Emitted: 0.08