I’ve been working on prompt refinement, and it’s been tricky to get 3.5-turbo to adhere to my requirements; the images that get produced are pretty mid.
There are plenty of datasets. Just take the ones meant for Stable Diffusion training, rip out the prompt text, profit.
Here are some high-quality captions used for DALL-E 3, etc.:
https://huggingface.co/datasets/laion/dalle-3-dataset
https://huggingface.co/datasets/laion/gpt4v-dataset
https://huggingface.co/datasets/laion/wuerstchen-dataset
https://huggingface.co/datasets/laion/220k-GPT4Vision-captions-from-LIVIS
https://huggingface.co/datasets/laion/gpt4v-emotion-dataset
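For the "rip out the prompt text" step, here is a rough sketch of what that could look like once the rows are loaded as plain dicts. The column names (`caption`, `prompt`, `text`), the `extract_prompts` helper, and the `min_words` cutoff are all assumptions for illustration; each dataset above has its own schema, so check the dataset card before relying on any of them.

```python
from typing import Iterable, Iterator, Mapping

# Assumed column names; the real ones differ per dataset, so verify first.
CANDIDATE_KEYS = ("caption", "prompt", "text")


def extract_prompts(
    records: Iterable[Mapping[str, object]],
    keys: tuple[str, ...] = CANDIDATE_KEYS,
    min_words: int = 3,
) -> Iterator[str]:
    """Yield de-duplicated caption strings from an iterable of dict-like rows."""
    seen: set[str] = set()
    for row in records:
        # Take the first candidate column that holds a non-empty string.
        text = next(
            (row[k] for k in keys if isinstance(row.get(k), str) and row[k].strip()),
            None,
        )
        if text is None:
            continue  # row has no usable text column (e.g. image only)
        cleaned = " ".join(text.split())  # collapse newlines and repeated spaces
        if len(cleaned.split()) < min_words:
            continue  # too short to be a useful prompt
        fingerprint = cleaned.lower()
        if fingerprint in seen:
            continue  # case-insensitive duplicate
        seen.add(fingerprint)
        yield cleaned


# Made-up rows standing in for dataset records.
rows = [
    {"caption": "A red fox  sitting in\nfresh snow", "url": "x"},
    {"prompt": "a red fox sitting in fresh snow"},
    {"text": "cat"},
    {"image": b"..."},
    {"caption": "", "text": "Oil painting of a harbor at dawn"},
]
prompts = list(extract_prompts(rows))
print(prompts)
```

If you use the Hugging Face `datasets` library, the rows it yields are dict-like, so something along these lines should be able to consume them directly, but I haven't checked that against each of the datasets linked above.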