Integrating new technologies like the Segment Anything Model (SAM) into the CrowdAI platform, keeping CrowdAI at the forefront of computer vision
Taylor Maggos and JB Boin
Meta AI recently released the Segment Anything Model (SAM), changing the way segmentation models can be produced, and CrowdAI is proud to incorporate this advancement into our platform to aid in building domain-specific customized models.
Segmentation is a core computer vision task: training a model on the exact pixels that make up an object of interest. It typically requires a time-consuming annotation process, because a large number of carefully labeled media are needed to train a model, and the labeling work often needs to be done by subject matter experts. The caveat is that most subject matter experts don't have the time to label meticulously, nor do they have hundreds of examples of each object, defect, or anomaly they are aiming to detect. This creates a huge roadblock for producing quick, off-the-shelf or domain-specific customized segmentation models.
SAM aims to break down this roadblock with its zero-shot generalization capability. With three prompting modes (hover & click, box, and segment everything), the tool can label entire images in just a few clicks. Typical segmentation labeling involves drawing a polygon around the object of interest, which, depending on the object, can take hundreds of mouse clicks.
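As an illustration of how click-based prompting works, here is a minimal sketch using Meta AI's open-source `segment-anything` package. The helper names, checkpoint path, and click coordinates are our own illustrative assumptions, not part of the CrowdAI platform; running the full pipeline requires the package and a downloaded SAM checkpoint.

```python
import numpy as np

def make_point_prompts(fg_points, bg_points=()):
    """Build SAM point-prompt arrays: (x, y) coordinates plus a label
    per point, 1 for foreground clicks and 0 for background clicks."""
    pts = list(fg_points) + list(bg_points)
    labels = [1] * len(fg_points) + [0] * len(bg_points)
    return np.array(pts, dtype=np.float32), np.array(labels, dtype=np.int64)

def segment_with_clicks(image, fg_points, checkpoint="sam_vit_h.pth"):
    """Hover-and-click style prompting: a few foreground clicks yield a
    full mask. Requires the segment-anything package and a checkpoint
    file (the path here is a placeholder)."""
    from segment_anything import sam_model_registry, SamPredictor
    sam = sam_model_registry["vit_h"](checkpoint=checkpoint)
    predictor = SamPredictor(sam)
    predictor.set_image(image)  # RGB uint8 array of shape (H, W, 3)
    coords, labels = make_point_prompts(fg_points)
    masks, scores, _ = predictor.predict(
        point_coords=coords, point_labels=labels, multimask_output=True)
    return masks[int(np.argmax(scores))]  # keep the best-scoring mask
```

A labeler's "4 clicks" then amounts to something like `segment_with_clicks(image, [(120, 85), (140, 90), (160, 95), (150, 110)])`, replacing a hand-drawn polygon of hundreds of vertices.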
For example, our work with the California National Guard to detect and track wildfires used a segmentation model on full-motion video that required dozens of hours of labeling effort. Each instance of smoke or fire in our training dataset required a hand-drawn polygon label. Using SAM, the same imagery can be segmented to detect smoke and fire in only a few seconds. See Figure 1 below for a direct comparison of annotation via polygon and annotation via SAM.
Similarly, CrowdAI built a segmentation model to detect rust in various settings. In the example below (Figure 2), the same image that previously took data labelers 125 clicks to annotate with a polygon took only 4 clicks to achieve the same result using our integration with SAM.
CrowdAI also applied SAM to a project we completed for US Southern Command, where our goal was to detect aircraft in satellite imagery and identify each aircraft down to its specific variant. You can find more information about this work in our blog or in the Intelligence Community News post. Using our annotation training data (Figure 3), SAM was able to label aircraft in a matter of seconds compared to our human annotation efforts. This advancement will speed up the annotation process for domain-specific segmentation models going forward.
Lessening labeling time not only cuts annotation expenses and allows for more diversity in training data by moving through a larger dataset faster; it also cuts the time to a trained model. When mission-critical needs are changing on a dime, this is extremely impactful. CrowdAI will therefore continue to incorporate SAM into our platform to advance our segmentation capabilities, aiming for semi- and fully automatic annotation. CrowdAI specializes in domain-specific customized models, where human annotation is key, and SAM will be an impactful tool in our toolkit for quick and efficient human-in-the-loop labeling.
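The fully automatic direction corresponds to SAM's "segment everything" mode. A minimal sketch, again using the open-source `segment-anything` package (the checkpoint path, `min_area` threshold, and helper name are illustrative assumptions, not CrowdAI platform code):

```python
def filter_masks_by_area(masks, min_area):
    """Keep only candidate masks covering at least min_area pixels;
    SamAutomaticMaskGenerator returns dicts that include an 'area' key."""
    return [m for m in masks if m["area"] >= min_area]

def auto_annotate(image, checkpoint="sam_vit_h.pth", min_area=500):
    """'Segment everything' mode: propose masks for a whole image with
    no clicks at all, then drop tiny fragments. Requires the
    segment-anything package and a checkpoint (path is a placeholder)."""
    from segment_anything import sam_model_registry, SamAutomaticMaskGenerator
    sam = sam_model_registry["vit_h"](checkpoint=checkpoint)
    generator = SamAutomaticMaskGenerator(sam)
    masks = generator.generate(image)  # RGB uint8 array of shape (H, W, 3)
    return filter_masks_by_area(masks, min_area)
```

In a human-in-the-loop workflow, a labeler would then review the proposed masks and assign class labels, rather than drawing each region from scratch.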
Entire datasets can now be annotated in a fraction of the time.
CrowdAI is eager to bring new advancements in AI to the forefront of our platform. We believe achieving something state-of-the-art can be a collaborative effort, so we welcome new technologies such as SAM, ChatGPT, and others still in the making to evolve our platform.