• Z4rK@lemmy.world
    English
    arrow-up
    3
    ·
    10 months ago

    None of these examples use Stable Diffusion alone, though. They use an LLM to write an image prompt for DALL-E / SD, which is then executed. In none of them are we shown the actual prompt.

    If you instead instruct the LLM to first show the text prompt, review it to make sure it does not mention elephants, revise it if necessary, and only then generate the image, you’ll get much better results. ChatGPT is horrible at following instructions like these unless you set up the prompt very specifically, but it will still follow more of the instructions internally.
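The show / review / revise / generate flow described above can be sketched roughly like this. It is only an illustration, not how ChatGPT works internally: `review_prompt`, `mentions_banned`, and `toy_revise` are made-up names, and `toy_revise` is a crude regex stand-in for the step where you would ask the LLM to rewrite its own prompt.

```python
import re

def mentions_banned(prompt: str, banned: list[str]) -> list[str]:
    """Return the banned words found in the prompt (case-insensitive, crude plural match)."""
    return [
        word for word in banned
        if re.search(rf"\b{re.escape(word)}s?\b", prompt, flags=re.IGNORECASE)
    ]

def review_prompt(draft: str, banned: list[str], revise, max_rounds: int = 3) -> str:
    """Check the draft prompt and revise it until no banned word is left.

    `revise` is a hypothetical callable (prompt, hits) -> new prompt; in a real
    setup it would be another LLM call.
    """
    prompt = draft
    for _ in range(max_rounds):
        hits = mentions_banned(prompt, banned)
        if not hits:
            return prompt
        prompt = revise(prompt, hits)
    if mentions_banned(prompt, banned):
        raise ValueError("prompt still mentions banned words after revision")
    return prompt

def toy_revise(prompt: str, hits: list[str]) -> str:
    """Stand-in reviser: drop simple negation clauses such as ', no elephants'."""
    for word in hits:
        pattern = (
            rf"\s*,?\s*(?:with no|without|no)\s+(?:any\s+)?"
            rf"{re.escape(word)}s?\b(?:\s+in it)?"
        )
        prompt = re.sub(pattern, "", prompt, flags=re.IGNORECASE)
    return prompt.strip()

final_prompt = review_prompt("an empty living room, no elephants", ["elephant"], toy_revise)
print(final_prompt)
```

Only the reviewed prompt would then be handed to the image model, so the word "elephants" never reaches it.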

    Anyway, the issue in all the examples above does not stem from Stable Diffusion, but from the LLM writing an ineffective prompt for it: the LLM tries to include a simple negation such as “no elephants”, which does not work well.

    • Turun@feddit.de
      arrow-up
      2
      ·
      10 months ago

      If you prompt Stable Diffusion for “a room without elephants in it” you’ll get elephants. You need to add elephants to the negative prompt to get a room without them. I don’t think LLMs have been given the ability to add negative prompts.
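To make the negative-prompt point concrete, here is a minimal sketch of moving the negated object out of the main prompt. `split_negations` and its regex are made up for illustration and only catch very simple phrasings; the dict keys are chosen to match the `prompt` and `negative_prompt` arguments that Stable Diffusion front ends such as the diffusers pipelines accept, but nothing here calls a real image model.

```python
import re

# Matches simple negations like "without elephants in it" or ", no elephants".
NEGATION = re.compile(
    r"\s*,?\s*\b(?:with no|without|no)\s+(?:any\s+)?(\w+)(?:\s+in it)?",
    re.IGNORECASE,
)

def split_negations(request: str) -> dict[str, str]:
    """Split a request into a positive prompt and a comma-separated negative prompt."""
    negatives = [match.group(1).lower() for match in NEGATION.finditer(request)]
    positive = NEGATION.sub("", request).strip()
    return {"prompt": positive, "negative_prompt": ", ".join(negatives)}

payload = split_negations("a room without elephants in it")
print(payload)
```

The positive prompt no longer contains the word at all, and the thing to avoid only appears where the sampler treats it as something to steer away from.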