This is a collection of tools for building domain-specific datasets by combining human domain expertise with synthetic data generation.