We used the encoder from the XL model and the SSDD-B variant for the decoder.
The model was trained on pixiv images.
- Base model