Datapool questions

#33
by Lopovos - opened
  1. Where does the AI take its data from?
  2. Is the datapool fixed?
  3. Does the AI purposely avoid some keywords? ("pixelated", many proper nouns, and NSFW terms all don't seem to give relevant results, from what others and I have seen.)
  1. The data comes from Common Crawl (see Laion2B).
    It's not updated live.

  2. Is the datapool fixed? If you mean that training sees the same images over and over again every epoch, then yes.

  3. The model uses a heavily filtered version of the Laion2B dataset to avoid bad outputs.
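
To illustrate point 3: the thread doesn't say how the filtering was actually done, so the following is only a toy sketch of one possible approach, dropping image/caption pairs whose caption contains a blocked term. The blocklist, the function names, and the sample data are all hypothetical. If a term is filtered out like this before training, the model never sees captions containing it, which would be consistent with prompts using that term giving irrelevant results.

```python
# Toy caption filter. Everything here is an assumption for illustration;
# it is not the filtering actually applied to the dataset.

# Hypothetical blocklist of terms to exclude from the training data.
BLOCKED_TERMS = frozenset({"nsfw", "pixelated"})


def keep_caption(caption: str, blocked: frozenset = BLOCKED_TERMS) -> bool:
    """Return True if no word in the caption matches a blocked term."""
    words = caption.lower().split()
    # Strip common trailing/leading punctuation before comparing.
    return not any(word.strip(".,!?") in blocked for word in words)


def filter_pairs(pairs, blocked: frozenset = BLOCKED_TERMS):
    """Keep only (image, caption) pairs whose caption passes the filter."""
    return [(image, caption) for image, caption in pairs
            if keep_caption(caption, blocked)]


# Made-up sample data.
samples = [
    ("img1.jpg", "A pixelated sprite of a knight"),
    ("img2.jpg", "A watercolor painting of a lake"),
]
kept = filter_pairs(samples)
```

Here `kept` ends up holding only the second pair, since the first caption contains a blocked term.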
