	---
	license: apache-2.0
	inference: false
	datasets:
	- autoflow
	---

	# Perceiver IO optical flow model

	This model is a Perceiver IO optical flow model pretrained on [AutoFlow](https://autoflow-google.github.io/).
	It is weight-equivalent to the [deepmind/optical-flow-perceiver](https://huggingface.co/deepmind/optical-flow-perceiver)
	model but based on implementation classes of the [perceiver-io](https://github.com/krasserm/perceiver-io) library. It
	can be created from the `deepmind/optical-flow-perceiver` model with a library-specific [conversion utility](#model-conversion).
	Both models generate equal output for the same input.

	The content of the `deepmind/optical-flow-perceiver` [model card](https://huggingface.co/deepmind/optical-flow-perceiver)
	also applies to this model, except for the [usage examples](#usage-examples). Refer to the linked card for further
	model and training details.

	## Model description

	The model is specified in Appendix H (Table 16) of the [Perceiver IO paper](https://arxiv.org/abs/2107.14795).

	## Intended use and limitations

	The model can be used to predict the optical flow between a pair of images.

	## Usage examples

	To use this model you first need to [install](https://github.com/krasserm/perceiver-io/blob/main/README.md#installation)
	the `perceiver-io` library with extension `vision`.

	```shell
	pip install "perceiver-io[vision]"
	```

	Then the model can be used with PyTorch.

	### Image pair

	The following example uses this image pair as input

	<img src="https://martin-krasser.com/perceiver/flow/frame_0047.png" alt="image-1" width="500"/>
	<img src="https://martin-krasser.com/perceiver/flow/frame_0048.png" alt="image-2" width="500"/>

	and renders their optical flow as an HSV representation (`render=True`):

	```python
	import requests
	from PIL import Image
	from transformers import pipeline
	from perceiver.model.vision import optical_flow # register optical flow pipeline

	frame_1 = Image.open(requests.get("https://martin-krasser.com/perceiver/flow/frame_0047.png", stream=True).raw)
	frame_2 = Image.open(requests.get("https://martin-krasser.com/perceiver/flow/frame_0048.png", stream=True).raw)

	optical_flow_pipeline = pipeline("optical-flow", model="krasserm/perceiver-io-optical-flow", device="cuda:0")
	rendered_optical_flow = optical_flow_pipeline((frame_1, frame_2), render=True)

	Image.fromarray(rendered_optical_flow).save("optical_flow.png")
	```

	The [rendered optical flow](https://martin-krasser.com/perceiver/flow/optical_flow.png) is:

	<img src="https://martin-krasser.com/perceiver/flow/optical_flow.png" alt="optical-flow" width="500"/>
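
For readers unfamiliar with HSV flow rendering, the following is a minimal sketch of a common convention, assuming that hue encodes the flow direction and brightness (value) encodes the flow magnitude, normalized by the largest magnitude in the field. It is not the renderer used by the `optical-flow` pipeline, which may differ in its details. The names `flow_vector_to_rgb` and `render_flow` are hypothetical and not part of `perceiver-io`.

```python
import colorsys
import math


def flow_vector_to_rgb(dx, dy, max_magnitude):
    """Map one flow vector (dx, dy) to an RGB triple in [0, 255] via HSV.

    Assumed convention (illustrative only): hue = direction, value = magnitude.
    """
    magnitude = math.hypot(dx, dy)
    angle = math.atan2(dy, dx)  # direction in radians, range (-pi, pi]
    hue = (angle % (2 * math.pi)) / (2 * math.pi)  # direction mapped to [0, 1)
    # Guard against an all-zero flow field, where there is nothing to normalize by.
    value = min(magnitude / max_magnitude, 1.0) if max_magnitude > 0 else 0.0
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, value)
    return tuple(round(c * 255) for c in (r, g, b))


def render_flow(flow):
    """Render a flow field given as rows of (dx, dy) tuples into rows of RGB triples."""
    max_magnitude = max(math.hypot(dx, dy) for row in flow for dx, dy in row)
    return [[flow_vector_to_rgb(dx, dy, max_magnitude) for dx, dy in row] for row in flow]
```

Under this convention, static pixels render black, and pixels moving in opposite directions receive complementary hues.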

	### Video

	To compute the optical flow of an entire video, the `optical-flow` pipeline can be used in combination with functions
	from `video_utils`. The following code samples all frames from a [video snippet](https://martin-krasser.com/perceiver/flow/sintel_clip_cave_dragon_fight.mp4)
	taken from the [Sintel animated short movie](https://durian.blender.org/), computes the optical flow per consecutive
	frame pair and writes the rendered results back to an output video file.

	```python
	from transformers import pipeline
	from perceiver.data.vision import video_utils
	from perceiver.model.vision import optical_flow # register optical flow pipeline

	optical_flow_pipeline = pipeline("optical-flow", model="krasserm/perceiver-io-optical-flow", device="cuda:0")

	# sample consecutive video frame pairs
	frame_pairs = video_utils.read_video_frame_pairs("sintel_clip_cave_dragon_fight.mp4")

	# create and render optical flow for all frame pairs
	optical_flows = optical_flow_pipeline(frame_pairs, render=True, device="cuda:0")

	# create video with rendered optical flows
	video_utils.write_video("sintel_clip_cave_dragon_fight_output.mp4", optical_flows, fps=24)
	```

	A side-by-side comparison of the input and output videos:

	![optical-flow-sbs](https://martin-krasser.com/perceiver/flow/sintel_clip_cave_dragon_fight_side_by_side_horizontal.gif)
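
The notion of "consecutive frame pairs" used in the video example can be illustrated with a small sketch. It assumes frames are available as any Python iterable. The helper `consecutive_pairs` is hypothetical and is not how `video_utils.read_video_frame_pairs` is necessarily implemented.

```python
from itertools import tee


def consecutive_pairs(frames):
    """Return an iterator over (frame_t, frame_t_plus_1) for an iterable of frames."""
    first, second = tee(frames)
    next(second, None)  # advance the second iterator by one frame
    return zip(first, second)
```

A clip with N frames yields N - 1 pairs under this scheme, and therefore N - 1 optical flow fields.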

	## Model conversion

	The `krasserm/perceiver-io-optical-flow` model has been created from the source `deepmind/optical-flow-perceiver` model
	with:

	```python
	from perceiver.model.vision.optical_flow import convert_model

	convert_model(
	    save_dir="krasserm/perceiver-io-optical-flow",
	    source_repo_id="deepmind/optical-flow-perceiver",
	    push_to_hub=True,
	)
	```

	## Citation

	```bibtex
	@article{jaegle2021perceiver,
	  title={Perceiver IO: A General Architecture for Structured Inputs \& Outputs},
	  author={Jaegle, Andrew and Borgeaud, Sebastian and Alayrac, Jean-Baptiste and Doersch, Carl and Ionescu, Catalin and Ding, David and Koppula, Skanda and Zoran, Daniel and Brock, Andrew and Shelhamer, Evan and others},
	  journal={arXiv preprint arXiv:2107.14795},
	  year={2021}
	}
	```