Module candle_transformers::models::clip::vision_model
Contrastive Language-Image Pre-Training
Contrastive Language-Image Pre-Training (CLIP) is an architecture trained on pairs of images and the texts that describe them.
https://github.com/openai/CLIP
https://github.com/huggingface/transformers/tree/f6fa0f0bf0796ac66f201f23bdb8585de1609add/src/transformers/models/clip