Conversational speech generation
Create 3D reconstructions from videos or images
Compare SigLIP1 and SigLIP2 on zero-shot classification