Expanding and enhancing Stable Diffusion with specialized models

Now that you have Stable Diffusion 1.5 installed on your local system and have learned how to craft cool generative prompts, it might be time to take the next step: trying different latent models.

There is more than one model out there for Stable Diffusion, and they can generate vastly different images.

Check out this article to learn how to install and use several popular models with Stable Diffusion (a loading sketch follows the list):

  • F222 – people have found it useful for generating beautiful female portraits with correct body-part relations. It’s quite good at generating aesthetically pleasing clothing.
  • Anything V3 – a special-purpose model trained to produce high-quality anime-style images. You can use danbooru tags (like 1girl, white hair) in the text prompt.
  • Open Journey – a model fine-tuned on images generated by Midjourney v4.
  • DreamShaper – a model fine-tuned for a portrait illustration style that sits between photorealistic and computer graphics.
  • Waifu-diffusion – Japanese anime style.
  • Arcane Diffusion – the style of the TV show Arcane.
  • Robo Diffusion – an interesting robot-style model that will turn your subject (and nearly everything else) into a robot.
  • Mo-di-diffusion – generates images in a Pixar-like style.
  • Inkpunk Diffusion – generates images in a unique illustration style.
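
For the code-inclined, switching models is mostly a matter of pointing at a different checkpoint. Here is a minimal sketch using the Hugging Face diffusers library; the repo id and trigger phrase for Open Journey below are assumptions, so check each model’s card for its actual id and trigger words.

```python
# Minimal sketch: load a specialty checkpoint with Hugging Face diffusers.
# The repo id "prompthero/openjourney" and the "mdjrny-v4 style" trigger
# phrase are assumptions -- every model has its own id and trigger words,
# so check the model card before copying.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "prompthero/openjourney", torch_dtype=torch.float16
).to("cuda")  # assumes an NVIDIA GPU

image = pipe("mdjrny-v4 style, a castle on a cliff at sunset").images[0]
image.save("openjourney_castle.png")
```
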
Better stable diffusion and AI generated art prompts

Now that you have Stable Diffusion on your system, how do you start taking advantage of it?

You can start with some good public examples.

This article from Metaverse gives you a list of excellent getting-started guides to take you from beginner to proficient at generating your own awesome art.

The key to it all is learning the syntax, parameters, and art of crafting AI prompts. It’s as much art as it is science, and complex enough that there is everything from beginner examples, free guides, and helper tools all the way to paid prompt marketplaces.
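
To make “syntax and parameters” concrete, here is a minimal sketch using the Hugging Face diffusers library; the prompt text and settings are purely illustrative.

```python
# Minimal sketch of common prompt parameters with Hugging Face diffusers.
# Prompt wording and numbers here are illustrative, not a recipe.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")  # assumes an NVIDIA GPU

image = pipe(
    prompt="portrait of an astronaut, dramatic lighting, highly detailed",
    negative_prompt="blurry, low quality, extra fingers",  # things to avoid
    num_inference_steps=30,  # more steps = more refinement, but slower
    guidance_scale=7.5,      # how strongly to follow the prompt
).images[0]
image.save("astronaut_portrait.png")
```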

Learning resources have gotten a lot better in the six months since people started figuring out AI generated prompts last year.

Installing Stable Diffusion 1.5

To install Stable Diffusion 1.5 (released Oct 20, 2022) locally, I found this video really excellent, except for a few points:

  1. You MUST use Python 3.10.6 (the video recommends 3.9.7, which I tried first). The latest version (as of Feb 2023) is Python 3.11.1 – which Stable Diffusion does NOT seem to like and won’t run under.

You might also want to read through this older Stable Diffusion 1.4 install guide, but note that he uses model checkpoints which haven’t been updated since version 1.4.

Gotchas and Fixes:

  • If you have an incompatible version of Python installed when you run webui-user.bat for the first time, Stable Diffusion will set itself up to point at that bad Python directory. Even if you uninstall it and install the correct Python version, Stable Diffusion will still look at the wrong one. You can fiddle with the various setup files, but it’s faster to just blow away the pulled git source at the top level and re-pull it so you don’t have cruft laying around. A quick sanity check follows below.
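
Before (re)running the installer, it helps to confirm which interpreter you are actually on. A tiny sketch, nothing AUTOMATIC1111-specific:

```python
# Quick sanity check: run this with the Python you expect the web UI to use.
# As of early 2023 the web UI wants Python 3.10.x (3.10.6 recommended).
import sys

print("Python executable:", sys.executable)         # the interpreter in use
print("Python version:  ", sys.version.split()[0])  # e.g. 3.10.6

if sys.version_info[:2] != (3, 10):
    print("Warning: not Python 3.10 -- the web UI may refuse to run.")
```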


Stable diffusion 2.0 was…well…

Stable Diffusion 2.0 seems to have been a step backwards in capability and quality, and many people went back to v1.5 for their work.

The difficulties with 2.0 were caused in part by:

  1. A new language model (text encoder) trained from scratch
  2. A training dataset heavily censored by an NSFW filter

The second point would have been fine on its own, but the filter was overly broad and removed a substantial amount of good-quality data. Version 2.1 promised to bring the good data back.

Installing Stable Diffusion 2.1

If you’re interested in trying Stable Diffusion 2.1, use this tutorial to install and use 2.1 models in the AUTOMATIC1111 GUI, so you can judge it for yourself.

You might also try this tutorial by TingTing.
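
If you’d rather poke at 2.1 from code than from a GUI, a minimal sketch with the Hugging Face diffusers library looks like this; “stabilityai/stable-diffusion-2-1” is the assumed repo id, so double-check the current model card.

```python
# Minimal sketch: run Stable Diffusion 2.1 through Hugging Face diffusers.
# "stabilityai/stable-diffusion-2-1" is the assumed Hugging Face repo id.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
).to("cuda")  # assumes an NVIDIA GPU

image = pipe("a photo of an astronaut riding a horse on mars").images[0]
image.save("sd21_astronaut.png")
```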


AI generated comic books

There’s a creative war going on surrounding AI generated art. While some are fighting AI generated art, others are fully embracing it.

AI Comic Books is a whole website/company dedicated to publishing comic books that rely on AI generated art. Check out the offerings on their website to see where the state of graphic novels is going.

This definitely spawns some discussion of where AI art is going to find its place in society. I think the cat is out of the bag, and now we’ll have to deal with the economic and moral questions it is generating – but that’s a discussion for another article…

Stable diffusion in other languages

Stable Diffusion was developed by CompVis, Stability AI, and LAION. It mainly uses the English subset LAION2B-en of the LAION-5B dataset for its training data and, as a result, requires English text prompts to produce images.

This means that the tagging and correlating of images and text are based on English-tagged data sets – which naturally tend to come from English-speaking sources and regions. Users who speak other languages must first translate from their native language to English, which often loses nuance or even the core meaning. On top of that, it also means the latent model images Stable Diffusion can draw on are usually limited to English-speaking-region sources.

For example, one of the more common Japanese terms re-interpreted from the English word businessman is “salary man”, which we most often imagine as a man wearing a suit. Prompting for it gives decidedly Western-looking results, which might not be very useful if you’re trying to generate images for a Japanese audience.

rinna Co., Ltd. has developed a Japanese-specific text-to-image model named “Japanese Stable Diffusion”. It accepts native Japanese text prompts and generates images that reflect the naming and tagged pictures of the Japanese-speaking world – things that may be difficult to express through translation and whose images may simply not be present in the Western world. The new text-to-image model was trained on source material that comes directly from Japanese culture, identity, and unique expressions, including slang.

They did this using a two-step approach that is instructive about how Stable Diffusion works.

First, they left the latent diffusion model alone and replaced the English text encoder with a Japanese-specific text encoder. This allowed the model to understand Japanese natively, but it would still generate Western-style imagery because the latent model remained intact. Even so, this was better than just translating the prompt.

At this point Stable Diffusion could understand the concept of a “businessman”, but it still generated images of decidedly Western-looking businessmen because the underlying latent diffusion model had not been changed.

The second step was to retrain the latent diffusion model on Japanese-tagged data sources using the new text encoder. This stage was essential to making the model more language-specific. After this, the model could finally generate businessmen with the Japanese faces users would expect.
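
To see which parts of the pipeline each step touches, here is a minimal sketch that loads the standard 1.5 checkpoint with the Hugging Face diffusers library and prints its components; the step-to-component mapping in the comments follows the description above.

```python
# Which pieces rinna's two-step approach touches, using the standard
# Stable Diffusion 1.5 checkpoint as a reference.
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

print(type(pipe.tokenizer).__name__)     # step 1: swapped for a Japanese tokenizer
print(type(pipe.text_encoder).__name__)  # step 1: replaced and trained on Japanese text
print(type(pipe.unet).__name__)          # step 2: fine-tuned on Japanese-tagged images
print(type(pipe.vae).__name__)           # untouched image autoencoder
```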

Read more about it on the links below.

Links:

A.I. coming to a bed near you

The Bryte Balance bed purports to use A.I. to sense pressure imbalances of those lying on the bed, then automatically adjusts a number of ‘rebalancers’ to give anyone lying on it a better night’s sleep.

Combine that with some ultra-lux phone app controls and you’ve got the makings of a luxury bed used at some of the top luxury hotels and resorts in the world – like the 5-star Park Hyatt New York and Carillon Miami.

You can own one yourself if you like. It’ll only set you back $6,299.

Wata Games being sued

I have written about multi-million-dollar retro game sales by Wata before. At the time, Karl Jobst, Wired magazine, and other sites like Kotaku had written about the business models that have led to extremely questionable multi-million-dollar sales of vintage video games by companies like Wata.

Well, it turns out someone has noticed, and now there is a lawsuit. One can only hope Wata and these sorts of schemes get busted for any criminal wrongdoing they seem highly likely to have committed.

If you care at all about collectors markets like retro video games, I highly suggest you read up on grading, auction, and fractional sales operations like Wata and their partners.

Nintendo VirtualBoy emulator – on Oculus

Nintendo’s VirtualBoy wasn’t exactly a smashing success. It was, however, one of the first forays into mass consumer 3D gaming. Sadly, the hardware was really limited and most people had lots of issues using the devices.

Fast forward, and someone decided to make a VirtualBoy emulator for the Oculus. Meet VirtualBoy Go. You can play your original VirtualBoy ROMs using a modern Oculus instead of the fiddly and hard-to-find original hardware.

You can check out this stream where a speed runner runs the wonderfully awful Waterworld game: https://www.twitch.tv/videos/1594810061

RAII: Resource Acquisition is Initialization

This is a great little video from the Back to Basics series offered by CppCon. They even have their slides and code on GitHub.

CppCon has a bunch of other great ‘Back to Basics’ videos covering a whole host of topics: safe exception handling, move semantics, type erasure, lambdas, and a bunch of other critical but oft-misunderstood elements of C++.

In this video, you get a refresher on RAII.

“Resource Acquisition is Initialization is one of the cornerstones of C++. What is it, why is it important, and how do we use it in our own code?”
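
RAII is a C++ idiom through and through, but the core idea – tying a resource’s cleanup to a scope – translates elsewhere. As a rough cross-language sketch, keeping with the Python used in the examples above, the closest analog is a context manager:

```python
# Rough Python analog of RAII: a context manager ties cleanup to a scope,
# much as a C++ destructor ties it to an object's lifetime. This is a
# cross-language sketch of the idea, not C++ RAII itself.
from contextlib import contextmanager

@contextmanager
def acquired(name):
    print(f"acquire {name}")       # runs on entry ("initialization")
    try:
        yield name                 # hand the resource to the with-block
    finally:
        print(f"release {name}")   # guaranteed cleanup, even on exceptions

with acquired("file handle") as r:
    print(f"using {r}")
# prints: acquire file handle / using file handle / release file handle
```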