# Get Pretrained Text/Image Encoder from 🤗 Transformers
This MindSpore patch for 🤗 Transformers enables researchers and developers working on text-to-image (t2i) and text-to-video (t2v) generation to use pretrained text and image models from 🤗 Transformers on MindSpore. The pretrained models can be employed either as frozen encoders or fine-tuned jointly with denoising networks for generative tasks, mirroring common practice among PyTorch users[1][2]. Now MindSpore users can benefit from the same functionality!
## Philosophy
- Only the MindSpore model definition is implemented here, and it stays identical to the PyTorch model.
- Configuration, tokenizer, etc. reuse the original 🤗 Transformers implementations.
- Models here are limited to the scope of generative tasks.
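The philosophy above can be sketched as follows: the tokenizer comes from the original 🤗 Transformers package, while the MindSpore model definition is loaded through the patch and then frozen for use as a conditioning encoder. This is an illustrative sketch, not the library's canonical example; it assumes the patch exposes `CLIPTextModel` under `mindone.transformers` with a `from_pretrained` API mirroring 🤗 Transformers, and the `repo_id` used is just a familiar public checkpoint.

```python
# Sketch: a pretrained CLIP text encoder as a frozen conditioner on MindSpore.
# Assumption: the patch exposes CLIPTextModel under `mindone.transformers`
# with a 🤗-Transformers-style `from_pretrained`; the import guard lets this
# sketch degrade gracefully where MindSpore is not installed.
try:
    from transformers import CLIPTokenizer          # tokenizer/config: original 🤗 Transformers
    from mindone.transformers import CLIPTextModel  # model definition: MindSpore patch
    HAVE_MINDONE = True
except ImportError:
    HAVE_MINDONE = False


def load_frozen_text_encoder(repo_id="openai/clip-vit-base-patch32"):
    """Load the tokenizer and MindSpore text encoder, then freeze the encoder."""
    tokenizer = CLIPTokenizer.from_pretrained(repo_id)
    text_encoder = CLIPTextModel.from_pretrained(repo_id)
    text_encoder.set_train(False)            # put the Cell in inference mode
    for p in text_encoder.get_parameters():  # stop gradients through the encoder
        p.requires_grad = False
    return tokenizer, text_encoder


if HAVE_MINDONE:
    tokenizer, text_encoder = load_frozen_text_encoder()
    # Tokenize a prompt; the resulting ids would be fed to the frozen encoder
    # to produce conditioning embeddings for a denoising network.
    inputs = tokenizer(["a photo of a cat"], padding=True, return_tensors="np")
```

Because the model weights are frozen, only the denoising network receives gradient updates during training; swapping in a fine-tuned (unfrozen) encoder is a matter of skipping the freezing loop.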