Keywords: Control, text-to-image, video generation, diffusion transformer, image generation ...