	<!DOCTYPE html>
	<html lang="en">
	<head>
	<meta charset="UTF-8">
	<meta name="viewport" content="width=device-width, initial-scale=1.0">
	<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.3/css/all.min.css">
	<link href="https://fonts.googleapis.com/css2?family=Montserrat:wght@500&display=swap" rel="stylesheet">
	<link rel="stylesheet" href="style.css">
	<title>M.o.f.u.</title>
	</head>
	<body>
	<h1 class="header-title">M.o.f.u.</h1>
	<p class="header-subtitle"><span class="highlight-orange">Mo</span>del-independent, <span class="highlight-violate">F</span>ast T<span class="highlight-orange">u</span>ning of Stable Diffusion concepts</p>
	<section id="abstract">
	<h2><i class="icon fas fa-file-alt"></i> Abstract</h2>
	<p>I present MoFu, a model-independent, fast tuning
	approach for adding styles and concepts to Stable
	Diffusion. Unlike more traditional methods, such as
	Low-Rank Adaptation (LoRA) or full fine-tuning, MoFu
	does not modify the weights of the main model at all.
	Instead, it works on the output of Stable Diffusion's
	text encoder, enabling rapid style or concept addition
	without modifying or fine-tuning the encoder's
	weights either.</p>
	</section>

	<section id="methodology">
	<h2><i class="icon fas fa-flask"></i> Methodology</h2>
	<p>MoFu follows a simple process. I begin by comparing
	the natural-language prompts written for a set of
	images. This comparison extracts the essential
	concepts or styles from the text prompts. The
	extracted concepts are then stored in a mixin, a
	compact representation of the desired style
	information. The mixin is designed to be compatible
	with Stable Diffusion's architecture and is added to
	the text encoder’s output. The mixin, or MoFu model,
	can also be multiplied by a weight before it is added,
	which makes its effect stronger or weaker. Adding the
	mixin to the text encoder’s output injects the
	extracted concepts into the image generation process,
	so Stable Diffusion generates images with the desired
	style while the weights of the main model stay
	unchanged. As a result, MoFu supports style transfer
	and concept addition in Stable Diffusion without
	extensive model modifications or resource-intensive
	fine-tuning.</p>
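	<p>The additive step described above can be illustrated with a minimal sketch. It is not taken from the MoFu code base: the function name <code>apply_mixin</code>, the toy values, and the use of plain Python lists in place of real tensors are all assumptions made for illustration. It only shows the arithmetic of adding a weighted mixin to the text encoder's output.</p>

```python
# Hypothetical sketch: plain Python lists stand in for the text encoder's
# output tensor of shape (tokens, dim). Nothing here is MoFu's actual API.

def apply_mixin(encoder_output, mixin, weight=1.0):
    """Return encoder_output + weight * mixin, element by element."""
    if len(encoder_output) != len(mixin):
        raise ValueError("mixin and encoder output need the same token count")
    blended = []
    for out_row, mix_row in zip(encoder_output, mixin):
        if len(out_row) != len(mix_row):
            raise ValueError("mixin and encoder output need the same width")
        # The weight scales the mixin only; the encoder output is left as-is.
        blended.append([o + weight * m for o, m in zip(out_row, mix_row)])
    return blended


# Toy stand-ins: 2 tokens, 3 dimensions (a real encoder output is far larger).
encoder_output = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
mixin = [[0.5, 0.0, -0.5], [1.0, 1.0, 1.0]]

styled = apply_mixin(encoder_output, mixin, weight=2.0)
unchanged = apply_mixin(encoder_output, mixin, weight=0.0)
```

	<p>In this sketch a weight of 0 leaves the encoder output untouched, and larger weights push the conditioning further toward the stored concept. Neither the text encoder nor the main model is modified at any point.</p>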
	</section>

	<section id="results">
	<h2><i class="icon fas fa-chart-bar"></i> Results</h2>
	<p>To evaluate the effectiveness of MoFu, I conducted a
	series of experiments and compared its performance
	to LoRA and fine-tuning. The results show that MoFu
	achieves performance comparable to LoRA while
	requiring significantly less training time: training
	takes only around 10-20 seconds on average and is
	primarily CPU-bound. This is in stark contrast to
	LoRAs, which typically take several hours to train.
	However, I also observed that MoFu falls short of
	fine-tuning, which can achieve better precision and
	quality, but at the cost of a much longer training
	time.
	</p>
	</section>

	<section id="conclusion">
	<h2><i class="icon fas fa-clipboard-check"></i> Conclusion</h2>
	<p>In conclusion, MoFu offers an efficient and
	model-independent way to add new styles or
	concepts to Stable Diffusion without modifying the
	main model's weights. It achieves results comparable
	to LoRA while significantly reducing training time,
	making it a practical choice for rapid adaptation.
	Fine-tuning still outperforms MoFu in quality, but
	MoFu's much shorter training time makes it a useful
	option when speed matters more than peak quality.
	Future work may focus on improving the
	implementation and output quality of MoFu.</p>
	</section>

	<footer>
	<div class="buttons">
	<a href="mailto:parsee.mizuhashi.th11@gmail.com" class="button">Yoinked</a>
	<a href="https://huggingface.co/organizations/touhou-ai-experimental" class="button">Touhou AI Experimental Group</a>
	<a href="https://huggingface.co/mofu-team" class="button">MoFu</a>
	<a href="https://github.com/yoinked-h/MoFu" class="button" target="_blank">GitHub</a>
	</div>
	</footer>
	</body>
	</html>