Prodigy addresses the big remaining problem: annotation and training. The typical approach to annotation forces projects into an uncomfortable waterfall process. The experiments can’t begin until the first batch of annotations is complete, but the annotation team can’t start until they receive the annotation manuals. To produce the annotation manuals, you need to know what statistical models will be required for the features you’re trying to build. Machine learning is an inherently uncertain technology, but the waterfall annotation process relies on accurate upfront planning. The net result is a lot of wasted effort.
Prodigy solves this problem by letting data scientists do their own annotation for rapid prototyping. Ideas can be tested faster than the first planning meeting could even be scheduled. We also expect Prodigy to reduce costs for larger projects, but it’s the increased agility we’re most excited about. Data science projects are said to have uneven returns, like start-ups: a minority of projects are very successful, recouping the costs of a larger number of failures. If so, the most important problem is to find more winners. Prodigy helps you do that, because you get to try things much faster.