Microsoft Bing recently announced an Object Detection feature in visual search that eliminates the need to draw a box over the object.
In June, the Bing team wrote about how visual search works, and a video on YouTube explains how Bing's earlier object detection let users find similar products: after clicking the visual search icon, a user dragged a sizing box over the object, and Bing served up matching products, and the stores that carry them, at the bottom of the page.
Now, clicking a hotspot over an object of interest makes Bing automatically position the bounding box in the right place for that object and trigger a search, showing the results in the Related Products and Related Images sections of the page.
The underlying technology, an object detection model, finds and identifies objects in an image: it determines the category of each detected object, along with its precise location and the area it occupies within the frame of the complete image.
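To make the model's output concrete, here is a minimal sketch of what a single detection might look like: a category label, a confidence score, and a pixel bounding box from which the occupied area within the frame can be computed. The class and field names are illustrative assumptions, not Bing's actual API.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    """One detected object: a category plus its bounding box in pixels."""
    label: str      # object category, e.g. "handbag" (hypothetical example)
    score: float    # model confidence in [0, 1]
    x: int          # top-left corner of the bounding box
    y: int
    width: int
    height: int

    def area_fraction(self, image_width: int, image_height: int) -> float:
        """Fraction of the full image frame this object occupies."""
        return (self.width * self.height) / (image_width * image_height)

# Hypothetical detection on a 640x480 photo.
det = Detection(label="handbag", score=0.92, x=100, y=120, width=160, height=120)
print(det.area_fraction(640, 480))  # 0.0625
```

In the new feature, a bounding box like this is what Bing positions automatically when the user clicks a hotspot, replacing the manually dragged box.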
To do this, engineers used Faster R-CNN as the DNN-based object detection framework, according to Microsoft. Integrating the technology into an online service such as Bing, however, demands low latency, and running Faster R-CNN object detection on standard hardware took 1.5 seconds per image -- not fast enough.
So Bing tested the latest Azure cloud service with NVIDIA GPUs, which ran the detection network roughly three times faster. Caching the results can further decrease latency and save 75% of the GPU (graphics processing unit) cost.
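The caching idea can be sketched simply: key each image by a content hash, and only invoke the expensive GPU detection pass on a cache miss. This is an illustrative assumption about the approach, not Bing's implementation; the detector stub stands in for the Faster R-CNN pass.

```python
import hashlib

class DetectionCache:
    """Memoize detection results by image content hash, so repeated
    visual searches over the same image skip the GPU pass entirely."""

    def __init__(self, detector):
        self.detector = detector  # the expensive GPU-backed model call
        self.cache = {}
        self.hits = 0
        self.misses = 0

    def detect(self, image_bytes: bytes):
        key = hashlib.sha256(image_bytes).hexdigest()
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        result = self.detector(image_bytes)  # only runs on a miss
        self.cache[key] = result
        return result

# Hypothetical detector stub standing in for the Faster R-CNN model.
cache = DetectionCache(lambda img: [("handbag", 0.92)])
cache.detect(b"photo-1")  # miss: runs the detector
cache.detect(b"photo-1")  # hit: served from the cache
print(cache.hits, cache.misses)  # 1 1
```

Every cache hit is a GPU invocation avoided, which is how caching can translate directly into the cost savings the team describes.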
Next on the agenda is the ability to recognize faces -- celebrity faces. However, Celebrity Recognition is based on a Face Detection Model that is markedly different from the Object Detection Model, according to the company.