Hi,

I’m struggling a bit to get good results with SDXL and I’m wondering if I’m doing something wrong … I tried A1111 and ComfyUI and have been underwhelmed in both cases. I can get bland, boring images out of it that seem fine from a technical point of view (correctly generated, no weird artifacts or anything like that). However, whenever I try something more elaborate, my prompting leads nowhere. I can get “a cat” and it will generate a picture of a cat. But if I try “a cat wearing a wizard hat floating in a mesmerizing galaxy of candy pops”, these kinds of prompts seem to quickly break the final image. I’m not talking about tailored models and LoRAs here, but I seem to be able to do much more interesting stuff with the Deliberate 2.0 model than with SDXL.

So, what’s your experience so far? Does the community need to catch up first and do work on custom models, LoRAs, and so on to really get things cooking? Or do I need to learn how to work with XL better? I was actually looking forward to having a “bland” and hopefully rather unbiased model to work with, where not every prompt desperately tries to become a hot anime girl, but I’m struggling to get interesting images for now.

For reference, I updated my A1111 installation with “git pull” (which seems to have worked, as I now have an SDXL tab in my settings) and downloaded the 1.0 base model, refiner and VAE from huggingface. I can generate txt2img in A1111 with the base model; however, I can’t seem to get img2img with the refiner model to work … In ComfyUI I found a premade workflow that runs the base model first and then feeds its latent into the refiner, which seems to work just fine technically, but it also seems to require a different approach to prompting than I’m used to.
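As far as I understand it, that ComfyUI workflow is basically the two-stage setup from the huggingface diffusers docs: the base model does most of the denoising steps and hands its latents straight to the refiner, which finishes the last stretch. Just to make sure I’ve got the concept right, here’s roughly what that would look like in diffusers (I haven’t run this exact script myself; the 0.8 split and 40 steps are just the example values I’ve seen in the docs, not something I’ve tuned):

```python
import torch
from diffusers import StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline

# SDXL base model: handles the first chunk of the denoising
base = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    variant="fp16",
    use_safetensors=True,
).to("cuda")

# Refiner: reuses the base's second text encoder and VAE to save VRAM
refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0",
    text_encoder_2=base.text_encoder_2,
    vae=base.vae,
    torch_dtype=torch.float16,
    variant="fp16",
    use_safetensors=True,
).to("cuda")

prompt = "a cat wearing a wizard hat floating in a mesmerizing galaxy of candy pops"

# Base runs the first ~80% of the steps and outputs latents instead of an image ...
latents = base(
    prompt=prompt,
    num_inference_steps=40,
    denoising_end=0.8,
    output_type="latent",
).images

# ... and the refiner picks up those latents for the remaining ~20%
image = refiner(
    prompt=prompt,
    num_inference_steps=40,
    denoising_start=0.8,
    image=latents,
).images[0]

image.save("sdxl_cat_wizard.png")
```

If that’s the same handoff the ComfyUI workflow is doing, then at least I know my mental model is right and the issue is really my prompting.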