volrath50/fantasy-card-diffusion

@@ -14,7 +14,7 @@ tags:
 ### For a guide on using the model, scroll down below
-![Thumbnail](https://huggingface.co/volrath50/fantasy-card-diffusion/collage.jpg)
 ## Features
 - Incorporate the styles of artists you know and love from Magic: The Gathering
@@ -31,21 +31,49 @@ The model was trained on MtG card information, not art descriptions. This has th
 Each card was trained with card information pulled from Scryfall in the following format:
-MTG card art, [Card Name], by [Artist], [year], [colors (words)], [colors (letters)], [card type], [rarity], [set name], [set code], [plane], [set type*], [watermark*], [mana cost], [security stamp*], [power/toughness*], [keywords*], [promo type*], [story spotlight*]
 A few examples of actual card data in this format:
 MTG card art, Ayula, Queen Among Bears, by Jesper Ejsing, 2019, Green, G, Legendary Creature - Bear, rare, Modern Horizons, mh1,  draft_innovation,  1G,  None, 2/2, Fight,
 MTG card art, Force of Will, by Terese Nielsen, 1996, Blue, U, Instant, uncommon, Alliances, all, Dominaria, Terisiare, Ice Age, expansion,  3UU,
 ## Training and dataset
-Training was done on a dataset consisting of cropped, 512x512 versions of the art for every MTG card, each of which was tagged using a custom python script, from data pulled from Scryfall. Training was done with the Dreambooth extension for Automatic1111's wonderful UI, to 130,000 steps.
 The result is a comprehensive model that has a good understanding of MTG artists, sets, planes, card types, creature types, years, colors, and more. If you had ever wondered what a Merfolk, drawn by Ron Spencer, would have looked like on Tarkir, as part of the Mardu clan, with dash, haste, and trample - this model can deliver what you want.
 Because the training data is literally the art from every MTG card, combined with the information for the associated card (about 35,000 unique pieces of art and text), I won't be releasing the training data, out of concern that doing so would violate WotC's IP. I have, however, included the Python script that I used to generate the training data set, which should get you uncropped images and identical (or near-identical) text files when used with the "unique artwork" JSON from https://scryfall.com/docs/api/bulk-data
-My script could probably be written much better. I have never studied Python and, until I wanted to make this model, had not done any programming at all since about 2000-2001, when I was a teenager and liked Perl. I managed to hack it together in a weekend using 20+ year old memories of Perl, liberal use of GitHub Copilot, and a lot of googling.
 Cropping was done with ImageMagick (see below, under issues).
@@ -53,7 +81,7 @@ Cropping was done with ImageMagick (see below, under issues).
 This was intended to be a second test run on the full data set (the first did not go well), so some corners were cut for the purpose of starting my "testing." The model turned out far better than I had expected, so I've decided to release it as is, and hope other people enjoy it as much as I have. But there are some issues that I am aware of and intend to work on fixing for future releases:
 - Cropping
--- MTG art is rectangular. I initially tried to use a trainer that could handle different aspect ratios, but after a couple failed tries, I just did a quick mass cropping job with ImageMagick, resizing and cropping everything to 512x512, so I could get training running. I forget what exactly I did, but it appears it focused on the left side of the card, universally cutting off the right side. You'll see this in generations, that tend to have everything on the right, as that's the image the training robot saw. I will fix this in future releases.
 - Planes
 -- Plane information was only added around step 70,000, so it may be less well trained than other information. I wanted a way to group sets together by plane, because how well the model knew the look of a set depended on whether WotC had incorporated the name of the plane into the set name itself; e.g., using "Theros" would only get you "Theros" and "Theros: Beyond Death", and not "Born of the Gods" or "Journey into Nyx".
 - Unique Characters

 ### For a guide on using the model, scroll down below
+![Thumbnail](https://huggingface.co/volrath50/fantasy-card-diffusion/images/collage.jpg)
 ## Features
 - Incorporate the styles of artists you know and love from Magic: The Gathering
 Each card was trained with card information pulled from Scryfall in the following format:
+MTG card art, [Card Name], by [Artist], [year], [colors (words)], [colors (letters)], [card type], [rarity], [set name], [set code], [plane], [set type*], [watermark], [mana cost], [security stamp], [power/toughness], [keywords], [promo type], [story spotlight]
 A few examples of actual card data in this format:
 MTG card art, Ayula, Queen Among Bears, by Jesper Ejsing, 2019, Green, G, Legendary Creature - Bear, rare, Modern Horizons, mh1,  draft_innovation,  1G,  None, 2/2, Fight,
 MTG card art, Force of Will, by Terese Nielsen, 1996, Blue, U, Instant, uncommon, Alliances, all, Dominaria, Terisiare, Ice Age, expansion,  3UU,
+To briefly explain the entries:
+Every piece of card art is tagged at the start with "MTG card art". Usually you want to use this tag. It does generalize the image a bit, however, so experiment with and without it. If you are having trouble making something look distinctly like a particular plane (say, Tarkir), removing this tag can help de-generalize the art.
+Set type: this is usually "expansion". Other possibilities include "core", "funny", and a few others. You can check the Scryfall API documentation for more information.
+Security stamp: I translated some of these for ease of use. The main two of note are "acorn" and "universes beyond". There are a few other rare stamps, like one for the My Little Pony cards.
+Story Spotlight: cards that are a story spotlight are tagged as such. This wasn't really worth including, and I'll probably take it out of a future version of the model.
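As a rough illustration of the caption format described above, here is a minimal Python sketch that joins card fields into one comma-separated line. It is not the author's script. The function name `build_caption` and the `plane` argument are hypothetical, the dictionary keys are assumed to follow Scryfall's card object, and, unlike the real training captions, this sketch simply skips empty fields and leaves out promo type and story spotlight.

```python
def build_caption(card: dict, plane: str = "") -> str:
    """Join Scryfall-style card fields into one comma-separated caption."""
    color_names = {"W": "White", "U": "Blue", "B": "Black", "R": "Red", "G": "Green"}
    colors = card.get("colors", [])
    artist = card.get("artist", "")
    # Scryfall writes mana costs as "{1}{G}"; the examples above use "1G".
    mana_cost = card.get("mana_cost", "").replace("{", "").replace("}", "")
    power, toughness = card.get("power"), card.get("toughness")
    pt = f"{power}/{toughness}" if power is not None and toughness is not None else ""
    fields = [
        "MTG card art",
        card.get("name", ""),
        f"by {artist}" if artist else "",
        card.get("released_at", "")[:4],              # year only
        " ".join(color_names[c] for c in colors),     # colors (words)
        "".join(colors),                              # colors (letters)
        card.get("type_line", "").replace("\u2014", "-"),
        card.get("rarity", ""),
        card.get("set_name", ""),
        card.get("set", ""),
        plane,                                        # assumed to be supplied by the caller
        card.get("set_type", ""),
        card.get("watermark", ""),
        mana_cost,
        card.get("security_stamp", ""),
        pt,
        ", ".join(card.get("keywords", [])),
    ]
    # Simplification: empty fields are dropped instead of left blank.
    return ", ".join(f for f in fields if f)
```

Calling it with the fields for Ayula, Queen Among Bears gives a line close to the first example above, minus the blank and "None" entries.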
+Pretty much every tag from normal Stable Diffusion still works as expected (e.g., extremely detailed, intricate details). I've found that adding "beautiful composition" tends to make things look nice, but I'm sure everyone has their own set of personal tags they like to use, and they should work with this model.
+I like to write my prompts like an art description, as you can see in the examples below.
+## Example Images and Prompts
+Full generation parameters should be embedded in the images. All of these examples were made with Automatic1111's UI, fantasy-card-diffusion-140000.ckpt, and the DPM++ 2S a sampler. CFG varies; I find around 11 works as a good baseline. Most of these were done with around 40-50 steps, which is probably overkill.
+### Ascended Eldrazi
+![Thumbnail](https://huggingface.co/volrath50/fantasy-card-diffusion/images/collage.jpg)
+### Emrakul, Compleated Doom
+![Thumbnail](https://huggingface.co/volrath50/fantasy-card-diffusion/30393-3410912099-DPM++ 2S a Karras-s35-c14-512x512-m03f434be-mtg card art emrakul 1 2 compleated 1 1 doom by seb mckinnon 1 1 legendary creature phyrexian 1 1.png)
+mtg card art, (emrakul:1.2), (compleated:1.1) doom, (by seb mckinnon:1.1), legendary creature - (phyrexian:1.1) (eldrazi:1.2) (horror:1.1), black, (strixhaven, arcivos:1.2), annihilator, (infect:1.2), 15/15, a (phyrexianized:1.1), compleated Emrakul, attacking (strixhaven school, university campus:1.2), stx, beautiful composition, detailed painting, (sense of scale:1.2), horror, dark, terrifying, eldritch horror, new phyrexia, nph, rise of the eldrazi, roe, extremely detailed, intricate details, masterpiece, best quality, emrakul, the aeons torn, emrakul, the promised end
+Negative prompt: zendikar, water, ocean, funny, happy, optimistic, bright, tentacles, lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry, octopus, spikes, urchin, tentacles, arms, hands, legs
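The prompt above uses the `(term:weight)` emphasis syntax of Automatic1111's UI. The following Python sketch shows one way such a prompt string could be assembled from a list of terms and weights. The helper names `weighted` and `build_prompt` are made up for this illustration and are not part of any tool.

```python
def weighted(term: str, weight: float = 1.0) -> str:
    """Wrap a term as (term:weight); a weight of 1.0 leaves the term bare."""
    return term if weight == 1.0 else f"({term}:{weight:g})"


def build_prompt(parts) -> str:
    """Join (term, weight) pairs into a single comma-separated prompt."""
    return ", ".join(weighted(term, weight) for term, weight in parts)


prompt = build_prompt([
    ("mtg card art", 1.0),
    ("emrakul", 1.2),
    ("by seb mckinnon", 1.1),
    ("beautiful composition", 1.0),
])
print(prompt)
```

Keeping the terms and weights in a list like this makes it easy to nudge one weight at a time while experimenting.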
 ## Training and dataset
+Training was done on a dataset consisting of cropped, 512x512 versions of the art for every MTG card, each of which was tagged using a custom Python script, from data pulled from Scryfall. Training was done with the Dreambooth extension for Automatic1111's wonderful UI, to 140,000 steps, over the course of a couple of days, on my 4090. I changed settings several times as I went, generally increasing batch size and lowering learning rate. At the moment, I am at batch size 10, gradient accumulation 5, and learning rate 4e-7, and that seems to be working well.
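To put the current settings in perspective, here is a small arithmetic sketch in Python. It assumes the usual meaning of gradient accumulation (several batches are summed before each optimizer update) and uses the approximate dataset size of 35,000 given elsewhere in this card. How the Dreambooth extension counts a "step" is not stated here, so the numbers are only indicative.

```python
batch_size = 10
gradient_accumulation = 5
dataset_size = 35_000  # approximate figure from this card

# Images contributing to each optimizer update.
effective_batch = batch_size * gradient_accumulation

# Optimizer updates needed to see every image once (ceiling division).
updates_per_epoch = -(-dataset_size // effective_batch)

print(effective_batch, updates_per_epoch)
```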
 The result is a comprehensive model that has a good understanding of MTG artists, sets, planes, card types, creature types, years, colors, and more. If you had ever wondered what a Merfolk, drawn by Ron Spencer, would have looked like on Tarkir, as part of the Mardu clan, with dash, haste, and trample - this model can deliver what you want.
 Because the training data is literally the art from every MTG card, combined with the information for the associated card (about 35,000 unique pieces of art and text), I won't be releasing the training data, out of concern that doing so would violate WotC's IP. I have, however, included the Python script that I used to generate the training data set, which should get you uncropped images and identical (or near-identical) text files when used with the "unique artwork" JSON from https://scryfall.com/docs/api/bulk-data
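For readers who want a feel for what working with that bulk file involves, here is a minimal Python sketch. It is not the included script. It assumes the file is a JSON array of Scryfall card objects, that single-faced cards carry an `image_uris` mapping with an `art_crop` entry, and that multi-faced cards keep theirs under `card_faces`. The function name `iter_art` is hypothetical.

```python
import json
from pathlib import Path


def iter_art(bulk_path):
    """Yield (identifier, art_crop_url) for every card face that has art."""
    # Loads the whole file into memory, which is simple but not frugal.
    cards = json.loads(Path(bulk_path).read_text(encoding="utf-8"))
    for card in cards:
        # Multi-faced cards store images per face rather than on the card.
        faces = [card] if "image_uris" in card else card.get("card_faces", [])
        for face in faces:
            url = face.get("image_uris", {}).get("art_crop")
            if url:
                # Fall back to the card id when a face has no illustration id.
                yield face.get("illustration_id") or card.get("id"), url
```

Each yielded pair could then be used to download the image and name the matching text file.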
+The script is simple, and could probably be improved. I had never used Python before, and had not done any coding for about 20 years, since I was a teenager using Perl around 2000-2001. I hacked it together with vague memories of Perl, liberal use of GitHub Copilot, and lots of googling.
 Cropping was done with ImageMagick (see below, under issues).
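The exact ImageMagick command is not recorded here. As a point of comparison, this Python sketch computes the geometry for a resize followed by a center crop, which keeps the middle of each rectangular image rather than one side. It is an illustration under that assumption, not what was actually run, and `center_crop_box` is a made-up name.

```python
def center_crop_box(width: int, height: int, size: int = 512):
    """Return (scaled_w, scaled_h, left, top) for a resize-then-center-crop.

    The image is scaled so its shorter side equals `size`, then a
    size x size window is taken from the middle of the longer side.
    """
    scale = size / min(width, height)
    scaled_w = max(size, round(width * scale))
    scaled_h = max(size, round(height * scale))
    left = (scaled_w - size) // 2
    top = (scaled_h - size) // 2
    return scaled_w, scaled_h, left, top


print(center_crop_box(1024, 512))
```

In ImageMagick terms, the analogous idea is setting `-gravity center` before cropping, so that equal amounts are trimmed from the left and right.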
 This was intended to be a second test run on the full data set (the first did not go well), so some corners were cut for the purpose of starting my "testing." The model turned out far better than I had expected, so I've decided to release it as is, and hope other people enjoy it as much as I have. But there are some issues that I am aware of and intend to work on fixing for future releases:
 - Cropping
+-- MTG art is rectangular. I initially tried to use a trainer that could handle different aspect ratios, but after a couple of failed tries, I just did a quick mass cropping job with ImageMagick, resizing and cropping everything to 512x512 so I could get training running. I forget exactly what I did, but it appears to have focused on the left side of the card, universally cutting off the right side. You'll see this in lots of generated images, which tend to have everything on the right as a result.
 - Planes
 -- Plane information was only added around step 70,000, so it may be less well trained than other information. I wanted a way to group sets together by plane, because how well the model knew the look of a set depended on whether WotC had incorporated the name of the plane into the set name itself; e.g., using "Theros" would only get you "Theros" and "Theros: Beyond Death", and not "Born of the Gods" or "Journey into Nyx".
 - Unique Characters