# Image Compression with Neural Networks

This is a [TensorFlow](http://www.tensorflow.org/) model for compressing and
decompressing images using a pre-trained Residual GRU model as described
in [Full Resolution Image Compression with Recurrent Neural Networks](https://arxiv.org/abs/1608.05148).
Please consult the paper for more details on the architecture and
compression results.

This code allows you to perform lossy compression with a model already
trained for compression. It does not currently contain the entropy coding
portions of our paper.
## Prerequisites

The only software requirement for running the encoder and decoder is a
working TensorFlow installation. You will also need to [download](http://download.tensorflow.org/models/compression_residual_gru-2016-08-23.tar.gz)
and extract the model file, residual_gru.pb.

If you want to measure perceptual similarity with MS-SSIM, you will also
need to [install SciPy](https://www.scipy.org/install.html).
## Encoding

The Residual GRU network is fully convolutional, but requires the image's
height and width in pixels to be multiples of 32. If you need an image for
testing, this folder contains example.png, which is 768x1024. We also rely
on TensorFlow's built-in decoding ops, which supported only PNG and JPEG at
the time of release.
To encode an image, simply run the following command:

```shell
python encoder.py --input_image=/your/image/here.png \
    --output_codes=output_codes.npz --iteration=15 \
    --model=/path/to/model/residual_gru.pb
```
The iteration parameter specifies the lossy quality target for the
compression. The quality can be [0-15], where 0 corresponds to a target of
1/8 bits per pixel (bpp) and every increment adds an additional 1/8 bpp.
| Iteration | BPP | Compression Ratio |
|---:|---:|---:|
| 0 | 0.125 | 192:1 |
| 1 | 0.250 | 96:1 |
| 2 | 0.375 | 64:1 |
| 3 | 0.500 | 48:1 |
| 4 | 0.625 | 38.4:1 |
| 5 | 0.750 | 32:1 |
| 6 | 0.875 | 27.4:1 |
| 7 | 1.000 | 24:1 |
| 8 | 1.125 | 21.3:1 |
| 9 | 1.250 | 19.2:1 |
| 10 | 1.375 | 17.4:1 |
| 11 | 1.500 | 16:1 |
| 12 | 1.625 | 14.7:1 |
| 13 | 1.750 | 13.7:1 |
| 14 | 1.875 | 12.8:1 |
| 15 | 2.000 | 12:1 |
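The table above follows directly from the 1/8 bpp per iteration rule, with
the compression ratio measured against 24-bit RGB. A small sketch of that
arithmetic (`rate_for_iteration` is a name introduced here, not part of the
released code):

```python
def rate_for_iteration(i):
    """Return (bpp, compression ratio) for iteration i in [0, 15].

    Each iteration adds 1/8 bit per pixel; the ratio is relative to
    uncompressed 24-bit RGB.
    """
    bpp = (i + 1) / 8.0
    ratio = 24.0 / bpp
    return bpp, ratio

for i in (0, 7, 15):
    bpp, ratio = rate_for_iteration(i)
    print(f"iteration {i}: {bpp:.3f} bpp, {ratio:.1f}:1")
# iteration 0: 0.125 bpp, 192.0:1
# iteration 7: 1.000 bpp, 24.0:1
# iteration 15: 2.000 bpp, 12.0:1
```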
The output_codes file contains the numpy shape and a flattened, bit-packed
array of the codes. These can be inspected in Python using numpy.load().
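A shape-plus-bit-packed layout like the one described above can be sketched
as follows. The key names ("shape", "codes") are illustrative assumptions,
not necessarily what encoder.py writes; list `numpy.load(...).files` on a
real output_codes.npz to see the actual arrays.

```python
import io
import numpy as np

# Illustrative binary codes: 15 iterations of a small code tensor.
codes = np.random.randint(0, 2, size=(15, 2, 24, 32), dtype=np.uint8)

# Store the original shape alongside the flattened, bit-packed codes.
# NOTE: key names here are assumptions for this sketch.
buf = io.BytesIO()
np.savez(buf, shape=np.asarray(codes.shape), codes=np.packbits(codes.ravel()))
buf.seek(0)

# Reconstruct the code tensor from the packed representation.
with np.load(buf) as f:
    shape = tuple(f["shape"])
    n = int(np.prod(shape))
    restored = np.unpackbits(f["codes"])[:n].reshape(shape)

assert np.array_equal(restored, codes)
```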
## Decoding

After generating codes for an image, the lossy reconstructions for that
image can be produced as follows:

```shell
python decoder.py --input_codes=codes.npz --output_directory=/tmp/decoded/ \
    --model=residual_gru.pb
```

The output_directory will contain images decoded at each quality level.
## Comparing Similarity

One of our primary metrics for comparing how similar two images are is
MS-SSIM. To generate this metric for your images, you can run:

```shell
python msssim.py --original_image=/path/to/your/image.png \
    --compared_image=/tmp/decoded/image_15.png
```
## Results

CSV results containing the post-entropy bitrates and MS-SSIM over the Kodak
dataset are available for reference. Each row of the CSV corresponds to a
Kodak image, in dataset order (1-24). Each column corresponds to one
iteration of the model (16 columns, one per iteration).

[Post Entropy Bitrates](https://storage.googleapis.com/compression-ml/residual_gru_results/bitrate.csv)

[MS-SSIM](https://storage.googleapis.com/compression-ml/residual_gru_results/msssim.csv)
## FAQ

#### How do I train my own compression network?

We currently don't provide the code to build and train a compression graph
from scratch.

#### I get an InvalidArgumentError: Incompatible shapes.

This usually happens because the network only supports images whose height
and width are both divisible by 32 pixels. Try padding your images to
32-pixel boundaries.
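A minimal padding sketch in NumPy, under the assumption that you load the
image into an (H, W, C) array first; edge replication is one reasonable
choice of fill, but encoder.py itself does not pad for you:

```python
import numpy as np

def pad_to_multiple_of_32(img):
    """Pad the bottom/right edges of an (H, W[, C]) image array so both
    height and width become multiples of 32, as the network requires.
    Edge replication avoids introducing a hard black border."""
    h, w = img.shape[:2]
    pad = [(0, -h % 32), (0, -w % 32)] + [(0, 0)] * (img.ndim - 2)
    return np.pad(img, pad, mode="edge")

img = np.zeros((768, 1000, 3), dtype=np.uint8)
padded = pad_to_multiple_of_32(img)
print(padded.shape)  # (768, 1024, 3)
```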
## Contact Info

Model repository maintained by Nick Johnston ([nmjohn](https://github.com/nmjohn)).