  • In this experiment I explored reward modeling.
  • I sliced half of the layers out of the gpt2 model using mergekit (now "n_layer": 6).
  • The idea is that I want my teeny tiny model to assign a higher score to the chosen samples than to the rejected ones.
  • Experimenting with log-sigmoid has improved how the model ranks outputs.
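
The card does not include the training code, so the following is only a minimal sketch of how a log-sigmoid objective is commonly used for chosen/rejected pairs: the loss is `-log(sigmoid(r_chosen - r_rejected))`, which shrinks as the chosen sample's score pulls ahead of the rejected one. The function name `pairwise_logsigmoid_loss` and the plain float inputs are assumptions made for illustration; they are not taken from this model's actual training script.

```python
import math


def pairwise_logsigmoid_loss(chosen_score: float, rejected_score: float) -> float:
    """Return -log(sigmoid(chosen_score - rejected_score)).

    Hypothetical helper for illustration only; not this model's training code.
    """
    margin = chosen_score - rejected_score
    # -log(sigmoid(x)) equals softplus(-x); branch on the sign for numerical stability.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# The loss is small when the chosen sample scores higher, large when it scores lower.
print(pairwise_logsigmoid_loss(2.0, -1.0))
print(pairwise_logsigmoid_loss(-1.0, 2.0))
```

When both scores are equal the loss is `log(2)`, and it falls toward zero as the margin grows, which is the ranking behaviour described in the list above.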

Example 1:

prompt = "What is the national bird of Belize?"
answer1 = "The national bird of Belize is the Keel Billed Toucan. The Toucan is recognized for its vividly colored bill, which is shaped like a canoe and features hues of yellow, orange, red, green, and black."
answer2 = "The national bird of Belize is the Keel-billed Toucan (Ramphastos sulfuratus)."

logits = calc_reward(model, tokenizer, prompt, answer1, answer2)
print(logits)

Output:

The model prefers 'The national bird of Belize is the Keel Billed Toucan. The Toucan is recognized for its vividly colored bill, which is shaped like a canoe and features hues of yellow, orange, red, green, and black.' with a probability of 0.6774
tensor([[-0.8621, -1.6038]], device='cuda:0')

Example 2:

prompt = "Who directed the movie Pulp Fiction and what is it about?"
answer1 = "Pulp Fiction is a critically acclaimed film directed by Quentin Tarantino in 1994. Known for its eclectic dialogue, ironic mix of humor and violence, nonlinear storyline, and a host of cinematic and pop culture references, the movie significantly boosted the director's reputation. The plot interweaves several stories involving Los Angeles mobsters, fringe characters, petty criminals, and a mysterious briefcase. Its iconic characters, such as hitmen Vincent Vega and Jules Winnfield, have left a lasting impact on popular culture."
answer2 = "The movie Pulp Fiction was directed by Quentin Tarantino. Released in 1994, it is a neo-noir black comedy crime film. The movie follows several interconnected storylines involving two hitmen, a boxer, a gangster's wife, and a pair of armed robbers. The narrative structure is non-linear, with events presented out of chronological order. Pulp Fiction is known for its witty dialogue, eclectic soundtrack, and its exploration of themes such as violence, redemption, and pop culture references."
logits = calc_reward(model, tokenizer, prompt, answer1, answer2)
print(logits)

Output:

The model prefers 'Pulp Fiction is a critically acclaimed film directed by Quentin Tarantino in 1994. Known for its eclectic dialogue, ironic mix of humor and violence, nonlinear storyline, and a host of cinematic and pop culture references, the movie significantly boosted the director's reputation. The plot interweaves several stories involving Los Angeles mobsters, fringe characters, petty criminals, and a mysterious briefcase. Its iconic characters, such as hitmen Vincent Vega and Jules Winnfield, have left a lasting impact on popular culture.' with a probability of 0.6886
tensor([[-0.2421, -1.0356]], device='cuda:0')
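
The reported probabilities are consistent with a softmax over the two printed scores (equivalently, a sigmoid of their difference). `calc_reward` itself is not shown in this card, so the sketch below only checks that relationship; `preference_probability` is a hypothetical helper, not the function used above.

```python
import math


def preference_probability(score_a: float, score_b: float) -> tuple[int, float]:
    """Softmax over two scores; returns (index of the preferred answer, its probability).

    Hypothetical helper for illustration; the card's calc_reward is not shown.
    """
    top = max(score_a, score_b)  # subtract the max before exponentiating, for stability
    exp_a = math.exp(score_a - top)
    exp_b = math.exp(score_b - top)
    total = exp_a + exp_b
    probs = (exp_a / total, exp_b / total)
    best = 0 if probs[0] >= probs[1] else 1
    return best, probs[best]


# Scores copied from the two example outputs above.
for logits in ([-0.8621, -1.6038], [-0.2421, -1.0356]):
    index, prob = preference_probability(*logits)
    print(f"answer{index + 1} preferred with probability {prob:.4f}")
```

For both examples this picks answer1 and agrees with the printed probabilities (0.6774 and 0.6886) to within the rounding of the displayed scores.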

Model size: 81.9M params (Safetensors, tensor type F32)

Dataset used to train aloobun/gpt2-small-no-robots-rlaif