FoodDesert committed on
Commit
5232b3e
1 Parent(s): b754714

Upload app.py

Fixed an issue where artist suggestions were not reflecting tags containing escape characters.
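To make the fix concrete, here is a minimal sketch of the two tag normalizations that the diff defines in `build_tag_offsets_dicts`. The helper names and the sample tag are hypothetical, chosen only for illustration; the replace chains mirror `modified_tag` and `artist_matrix_tag`. One tentative observation: in Python source, `'\('` is not a recognized escape, so it denotes the same two characters as `'\\('`, and newer Python versions warn about that spelling. The second helper therefore omits those two replace calls, since they would leave the string unchanged.

```python
def normalize_for_display(tag_text: str) -> str:
    """Mirror of `modified_tag`: underscores become spaces, escaped parentheses are unescaped."""
    return tag_text.replace("_", " ").replace("\\(", "(").replace("\\)", ")").strip()


def normalize_for_artist_matrix(tag_text: str) -> str:
    """Mirror of `artist_matrix_tag`: underscores become spaces, backslash escapes are kept."""
    # The commit also replaces '\\(' with '\(' and '\\)' with '\)'. Those pairs are
    # the same two-character strings, so the calls are no-ops and are left out here.
    return tag_text.replace("_", " ").strip()


# Hypothetical tag as it might be typed in a prompt, with escaped parentheses.
raw_tag = " artist_name_\\(artist\\) "

print(normalize_for_display(raw_tag))        # artist name (artist)
print(normalize_for_artist_matrix(raw_tag))  # artist name \(artist\)
```

The only difference between the two results is whether the backslashes before the parentheses survive, which is the form the artist lookup receives after this commit.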

Files changed (1)
  1. app.py +9 -5
app.py CHANGED
@@ -34,7 +34,7 @@ Some models react best when prompted with verbose scene descriptions akin to DAL
 This tool serves as a linguistic bridge to the e621 image board tag lexicon, on which many popular models such as Fluffyrock, Fluffusion, and Pony Diffusion v6 were trained.
 
 When you enter a txt2img prompt and press the "submit" button, the Tagset Completer parses your prompt and checks that all your tags are valid e621 tags.
-If it finds any that are not, it recommends some valid e621 tags you can use to replace them in the "Unseen Tags" table.
+If it finds any that are not, it recommends some valid e621 tags you can use to replace them in the "Unknown Tags" section.
 Additionally, in the "Top Artists" text box, it lists the artists who would most likely draw an image having the set of tags you provided.
 This is useful to align your prompt with the expected input to an e621-trained model.
 
@@ -52,7 +52,7 @@ Yes, but only '(' and ')' and numerical weights, and all of these things are ign
 An example that illustrates acceptable parentheses and weight formatting is:
 ((sunset over the mountains)), (clear sky:1.5), ((eagle flying high:2.0)), river, (fish swimming in the river:1.2), (campfire, (marshmallows:2.1):1.3), stars in the sky, ((full moon:1.8)), (wolf howling:1.7)
 
-## Why are some valid tags marked as "unseen", and why don't some artists ever get returned?
+## Why are some valid tags marked as "unknown", and why don't some artists ever get returned?
 
 Some data is excluded from consideration if it did not occur frequently enough in the sample from which the application makes its calculations.
 If an artist or tag is too infrequent, we might not think we have enough data to make predictions about it.
@@ -479,6 +479,7 @@ def build_tag_offsets_dicts(new_image_tags_with_positions):
     for tag_text, start_pos in new_image_tags_with_positions:
         # Modify the tag
         modified_tag = tag_text.replace('_', ' ').replace('\\(', '(').replace('\\)', ')').strip()
+        artist_matrix_tag = tag_text.replace('_', ' ').replace('\\(', '\(').replace('\\)', '\)').strip()
         # Calculate the end position based on the original tag length
         end_pos = start_pos + len(tag_text)
         # Append the structured data for each tag
@@ -486,7 +487,8 @@ def build_tag_offsets_dicts(new_image_tags_with_positions):
             "original_tag": tag_text,
             "start_pos": start_pos,
             "end_pos": end_pos,
-            "modified_tag": modified_tag
+            "modified_tag": modified_tag,
+            "artist_matrix_tag": artist_matrix_tag
         })
     return tag_data
 
@@ -508,8 +510,10 @@ def find_similar_artists(original_tags_string, top_n, similarity_weight, allow_n
     bad_tags_illustrated_string = {"text":new_tags_string, "entities":bad_entities}
     #bad_tags_illustrated_string = {"text":original_tags_string, "entities":bad_entities}
 
-    modified_tags = [tag_info['modified_tag'] for tag_info in tag_data]
-    X_new_image = vectorizer.transform([','.join(modified_tags + removed_tags)])
+    #modified_tags = [tag_info['modified_tag'] for tag_info in tag_data]
+    #X_new_image = vectorizer.transform([','.join(modified_tags + removed_tags)])
+    artist_matrix_tags = [tag_info['artist_matrix_tag'] for tag_info in tag_data]
+    X_new_image = vectorizer.transform([','.join(artist_matrix_tags + removed_tags)])
     similarities = cosine_similarity(X_new_image, X_artist)[0]
 
     top_artist_indices = np.argsort(similarities)[-(top_n + 1):][::-1]
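
The switch from `modified_tag` to `artist_matrix_tag` in `find_similar_artists` can affect the ranking in the way this rough, pure-Python sketch suggests. It is a stand-in, not the app's code: the real function uses `vectorizer.transform` and `cosine_similarity`, whose configuration is not shown in this diff. The vocabulary, the artist rows, and every helper name are hypothetical, and the sketch assumes the artist matrix vocabulary stores parentheses in escaped form.

```python
import math

# Hypothetical vocabulary, ASSUMED to store parentheses in escaped form.
VOCABULARY = ["red fur", "forest", "artist name \\(artist\\)"]

# Hypothetical per-artist tag counts over that vocabulary (a toy stand-in for X_artist).
ARTIST_ROWS = {
    "artist_a": [1.0, 0.0, 3.0],
    "artist_b": [2.0, 2.0, 0.0],
}


def vectorize(tags):
    """Count exact vocabulary matches; tags outside the vocabulary are silently dropped."""
    return [float(sum(1 for tag in tags if tag == term)) for term in VOCABULARY]


def cosine(u, v):
    """Cosine similarity, defined here as 0.0 when either vector is all zeros."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)


def rank_artists(tags):
    """Return artist names ordered from most to least similar to the tag set."""
    query = vectorize(tags)
    scores = {name: cosine(query, row) for name, row in ARTIST_ROWS.items()}
    return sorted(scores, key=scores.get, reverse=True)


unescaped = ["red fur", "artist name (artist)"]    # shaped like modified_tag
escaped = ["red fur", "artist name \\(artist\\)"]  # shaped like artist_matrix_tag

print(vectorize(unescaped))     # the parenthesized tag matches nothing
print(vectorize(escaped))       # the parenthesized tag is counted
print(rank_artists(unescaped))  # ranking without the parenthesized tag
print(rank_artists(escaped))    # ranking with it
```

In this toy setup the unescaped tag is dropped from the query vector, so the ranking is driven by the remaining tag alone. With the escaped form, the parenthesized tag contributes and the order of the two artists flips.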