	Update app.py
Vision-Langauge-Models --> Vision-Language-Models
app.py CHANGED

@@ -95,7 +95,7 @@ with gr.Blocks() as demo:
     [website](https://pivot-prompt.github.io/)
     [view on huggingface](https://huggingface.co/spaces/pivot-prompt/pivot-prompt-demo/)

-    The demo below showcases a version of the PIVOT algorithm, which uses iterative visual prompts to optimize and guide the reasoning of Vision-Langauge-Models (VLMs).
+    The demo below showcases a version of the PIVOT algorithm, which uses iterative visual prompts to optimize and guide the reasoning of Vision-Language-Models (VLMs).
     Given an image and a description of an object or region,
     PIVOT iteratively searches for the point in the image that best corresponds to the description.
     This is done through visual prompting, where instead of reasoning with text, the VLM reasons over images annotated with sampled points,
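
For readers curious what the "iterative visual prompting" described above looks like in code, here is a minimal, hypothetical sketch of a PIVOT-style loop: sample candidate points, draw them as numbered markers on the image, ask a VLM to pick the markers that best match the description, and refit the sampling distribution around the picks. The helper names and the `query_vlm` callable are assumptions for illustration only, not the actual code in this Space's app.py.

```python
import numpy as np
from PIL import Image, ImageDraw


def draw_numbered_markers(image: Image.Image, points: np.ndarray) -> Image.Image:
    """Overlay numbered circle markers on a copy of the image (the visual prompt)."""
    annotated = image.copy()
    draw = ImageDraw.Draw(annotated)
    for i, (x, y) in enumerate(points):
        draw.ellipse([x - 8, y - 8, x + 8, y + 8], outline="red", width=3)
        draw.text((x + 10, y - 10), str(i), fill="red")
    return annotated


def pivot_search(image, description, query_vlm, num_iters=3, num_samples=10):
    """Sketch of a PIVOT-style search for the point best matching `description`.

    `query_vlm(annotated_image, description)` is a hypothetical callable that
    returns the indices of the markers the VLM judges closest to the description.
    """
    w, h = image.size
    # Start with a broad 2D Gaussian over the whole image.
    mean = np.array([w / 2.0, h / 2.0])
    cov = np.diag([(w / 4.0) ** 2, (h / 4.0) ** 2])

    for _ in range(num_iters):
        # Sample candidate points from the current distribution, kept in-frame.
        points = np.random.multivariate_normal(mean, cov, size=num_samples)
        points = np.clip(points, [0, 0], [w - 1, h - 1])

        # Visual prompting: the VLM reasons over the annotated image
        # rather than over textual coordinates.
        annotated = draw_numbered_markers(image, points)
        chosen = points[query_vlm(annotated, description)]

        # Refit the distribution to the chosen points; shrink it when only one was chosen.
        mean = chosen.mean(axis=0)
        cov = (np.cov(chosen, rowvar=False) if len(chosen) > 1 else cov / 4.0) + 1e-6 * np.eye(2)

    return mean  # best estimate of the queried point


# Hypothetical usage: `my_vlm_picker` would wrap a real VLM call.
# point = pivot_search(Image.open("scene.jpg"), "the red mug", my_vlm_picker)
```

Refitting and shrinking the sampling distribution after each VLM query is what makes the search "iterative": each round of visual prompts concentrates the candidate points closer to the described object or region.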