arxiv:2110.11403

SCENIC: A JAX Library for Computer Vision Research and Beyond

Published on Oct 18, 2021

Abstract

Scenic is an open-source JAX library with a focus on Transformer-based models for computer vision research and beyond. The goal of this toolkit is to facilitate rapid experimentation, prototyping, and research of new vision architectures and models. Scenic supports a diverse range of vision tasks (e.g., classification, segmentation, detection) and facilitates working on multi-modal problems, along with GPU/TPU support for multi-host, multi-device large-scale training. Scenic also offers optimized implementations of state-of-the-art research models spanning a wide range of modalities. Scenic has been successfully used for numerous projects and published papers and continues to serve as the library of choice for quick prototyping and publication of new research ideas.

Community

Introduces Scenic: a JAX library of Transformer-centric computer vision models for research and prototyping. Ships ready-made implementations of ViT, DETR, MLP-Mixer, ResNet, and U-Net. Relies on JAX and Flax (model implementation), TFDS and DMVR (TensorFlow Datasets and DeepMind Video Reader for data pipelines), OTT (Optimal Transport Tools, a Wasserstein/bipartite-matching toolbox), and training utilities from CLU (Common Loop Utilities). Portable across GPU and TPU and scales to multi-device, multi-host training (see the sketch below). From Google (and DeepMind).
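
To give a rough sense of the JAX/Flax/Optax stack Scenic builds on, here is a minimal, generic data-parallel training sketch. This is not Scenic's own API: the model, function names, and shapes (TinyClassifier, train_step, 32x32 inputs) are illustrative assumptions.

```python
# A minimal sketch of the JAX/Flax/Optax pattern Scenic-style training builds on.
# Everything here (TinyClassifier, train_step, shapes) is illustrative, not Scenic's API.
import jax
import jax.numpy as jnp
import flax.linen as nn
import optax


class TinyClassifier(nn.Module):
    """Toy stand-in for a Scenic model (e.g. ViT): images in, logits out."""
    num_classes: int = 10

    @nn.compact
    def __call__(self, x):
        x = x.reshape((x.shape[0], -1))        # flatten images
        x = nn.relu(nn.Dense(128)(x))
        return nn.Dense(self.num_classes)(x)   # logits


model = TinyClassifier()
params = model.init(jax.random.PRNGKey(0), jnp.ones((1, 32, 32, 3)))
tx = optax.adam(1e-3)
opt_state = tx.init(params)


def train_step(params, opt_state, images, labels):
    def loss_fn(p):
        logits = model.apply(p, images)
        onehot = jax.nn.one_hot(labels, model.num_classes)
        return optax.softmax_cross_entropy(logits, onehot).mean()

    loss, grads = jax.value_and_grad(loss_fn)(params)
    # Average gradients across devices; 'batch' is the pmap axis name below.
    grads = jax.lax.pmean(grads, axis_name='batch')
    updates, opt_state = tx.update(grads, opt_state, params)
    return optax.apply_updates(params, updates), opt_state, loss


# Replicate state and run one step per local device (data parallelism).
devices = jax.local_devices()
p_step = jax.pmap(train_step, axis_name='batch')
params = jax.device_put_replicated(params, devices)
opt_state = jax.device_put_replicated(opt_state, devices)
images = jnp.ones((len(devices), 8, 32, 32, 3))
labels = jnp.zeros((len(devices), 8), dtype=jnp.int32)
params, opt_state, loss = p_step(params, opt_state, images, labels)
```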

Links: GitHub (JAX, Flax; CLU, DMVR, OTT); see also TFDS (TensorFlow Datasets), Optax (gradient processing and optimization in JAX), Paxml (experimentation and parallelism), and Chex (reliable JAX code).
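
As a hedged illustration of what "gradient processing and optimization" means for Optax in practice, gradient transformations compose into a single optimizer; the specific transforms and values below are arbitrary choices, not anything prescribed by Scenic.

```python
# Optax composes gradient transformations into one optimizer:
# global-norm clipping followed by AdamW on a cosine-decayed learning rate.
import optax

schedule = optax.cosine_decay_schedule(init_value=1e-3, decay_steps=10_000)
tx = optax.chain(
    optax.clip_by_global_norm(1.0),        # gradient clipping
    optax.adamw(learning_rate=schedule),   # AdamW with the decaying schedule
)
# tx.init(params) / tx.update(grads, state, params) then plug into a train loop.
```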
