Today I received a neural network inference hardware accelerator: the Movidius Neural Compute Stick (a USB stick). Movidius is now part of Intel (acquired in 2016).
Movidius Neural Compute Stick
Performance testing
SDK installation is smooth under Ubuntu 16.04. The SDK also contains a script that fetches Caffe models from the internet. The following models are available by default: Age, AlexNet, Gender, GoogLeNet, SqueezeNet.
I compiled the examples written in C (located in the ncapi/c_examples/ folder) and ran some checks. All interaction with the hardware happens at user level through libusb (it looks like libmvnc.so is built on top of libusb), so no kernel-level drivers are required.
The C examples allow us to do “image classification”. Now, let’s run some tests. Here is a gender detection run:
time ./c_examples/ncs-fullcheck -c100 ./networks/Gender/ ~/mona-lisa.jpg
OpenDevice 2 succeeded
Graph allocated
Female (99.51%)
Male (0.48%)
Inference time: 237.392059 ms, total time 241.811815 ms
...
Inference time: 234.574295 ms, total time 238.665898 ms
Deallocate graph, rc=0
Device closed, rc=0

real 0m26.086s
user 0m1.051s
sys 0m0.071s
As we can see, Mona Lisa’s gender is accurately detected (99.51% Female).
Running the detection 100 times took about 26 seconds, which means we achieved about 4 fps (frames per second) using hardware acceleration.
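The throughput figure can be double-checked with a few lines of arithmetic. The sketch below is only an illustration: the numbers are copied from the log above, and the function names are made up for this post, not part of the SDK. Note that the wall-clock time also includes opening the device and allocating the graph, so it gives a slightly lower figure than the per-inference time alone.

```python
def fps_from_wall_time(runs: int, wall_seconds: float) -> float:
    """Frames per second over the whole run, startup and teardown included."""
    return runs / wall_seconds


def fps_from_latency(latency_ms: float) -> float:
    """Frames per second if only the reported per-inference time counted."""
    return 1000.0 / latency_ms


if __name__ == "__main__":
    # 100 runs (-c100) and "real 0m26.086s" from the time output.
    print(f"wall-clock:    {fps_from_wall_time(100, 26.086):.2f} fps")
    # First "Inference time" line from the ncs-fullcheck output.
    print(f"per-inference: {fps_from_latency(237.392059):.2f} fps")
```

Both estimates land close to the “about 4 fps” quoted above (roughly 3.8 and 4.2 fps respectively).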
“Gender” caffemodel and mona-lisa.jpg can be downloaded here.
Power consumption and energy efficiency
I was able to measure power consumption during this test; the average current was 0.18 A.
On 5 V USB this gives 5 × 0.18 = 0.9 W, so we achieve 4 / 0.9 ≈ 4.4 fps per watt. To put this in perspective, a battery with the capacity of an iPhone 7 Plus (about 11 Wh) could power this device for about 12 hours. These are good results for mobile and autonomous use cases such as the Joker Walker.
By the way, on the Joker main module (with Intel’s x5-Z8500), the same task (image classification) ran at 2 fps with a power consumption of about 5 W, which is equivalent to 0.4 fps per watt. That is 11 times worse than the Movidius Neural Compute Stick. Another great feature is that the neural network tasks are off-loaded to a “co-processor”, leaving the main CPU free for other important tasks.
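The efficiency numbers in this section follow from three simple formulas. Here is a minimal sketch that reproduces them, assuming the measured values quoted above (0.18 A at 5 V and 4 fps for the stick, 2 fps at about 5 W for the CPU, an 11 Wh battery). The helper names are hypothetical and exist only for this example.

```python
def power_watts(volts: float, amps: float) -> float:
    """Electrical power P = U * I."""
    return volts * amps


def fps_per_watt(fps: float, watts: float) -> float:
    """Energy efficiency: frames processed per second for each watt drawn."""
    return fps / watts


def battery_hours(battery_wh: float, watts: float) -> float:
    """Idealized runtime on a battery, ignoring converter and battery losses."""
    return battery_wh / watts


if __name__ == "__main__":
    stick_power = power_watts(5.0, 0.18)        # about 0.9 W
    stick_eff = fps_per_watt(4.0, stick_power)  # about 4.4 fps/W
    cpu_eff = fps_per_watt(2.0, 5.0)            # 0.4 fps/W on the x5-Z8500
    print(f"stick: {stick_power:.2f} W, {stick_eff:.1f} fps/W")
    print(f"cpu:   {cpu_eff:.1f} fps/W")
    print(f"ratio: {stick_eff / cpu_eff:.0f}x")
    print(f"runtime on 11 Wh: {battery_hours(11.0, stick_power):.0f} h")
```

The runtime is an idealized figure: a real battery pack loses some energy in the step-up conversion to 5 V.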
Power consumption in suspend state
Autosuspend for USB devices is disabled in Linux by default. I enabled it for the port where the Movidius Neural Compute Stick is connected with the following command:
echo "auto" > /sys/bus/usb/devices/1-2/power/control
After a few seconds, the kernel log shows that the device has switched to the suspend state:
7,101290,1988008779700,-;usb 1-2: usb auto-suspend, wakeup 0
 SUBSYSTEM=usb
 DEVICE=c189:83
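Instead of reading the kernel log, the suspend state can also be read from sysfs. The following is a rough sketch, not something I used in the test above: it writes the same power/control attribute as the echo command and then reads the standard power/runtime_status attribute next to it. The function names and the `root` parameter are my own additions (the latter only so the code can be tried against a fake directory tree), and the port name 1-2 is specific to my machine.

```python
from pathlib import Path

# Default location of USB devices in sysfs.
SYSFS_USB = Path("/sys/bus/usb/devices")


def enable_autosuspend(port: str, root: Path = SYSFS_USB) -> None:
    """Write "auto" to power/control, like the echo command (needs root)."""
    (root / port / "power" / "control").write_text("auto\n")


def runtime_status(port: str, root: Path = SYSFS_USB) -> str:
    """Return the runtime PM state reported by the kernel, e.g. "suspended"."""
    return (root / port / "power" / "runtime_status").read_text().strip()
```

A typical use would be to call `enable_autosuspend("1-2")` as root, wait a few seconds for the autosuspend delay to expire, and then check whether `runtime_status("1-2")` returns `suspended`.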
Current draw in the suspend state is about 0.07 A (70 mA), which is about 0.35 W at 5 V. This is much higher than the USB spec allows for a suspended device (it should be less than 2.5 mA) and may drain the battery faster in mobile applications.
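To get a feeling for what 70 mA in suspend means, here is a small back-of-the-envelope sketch. It assumes the 2.5 mA limit quoted above and reuses the 11 Wh battery from the previous section. The names are invented for this example and the result ignores conversion losses.

```python
USB_SUSPEND_LIMIT_A = 0.0025  # 2.5 mA suspend limit quoted above


def times_over_limit(measured_amps: float,
                     limit_amps: float = USB_SUSPEND_LIMIT_A) -> float:
    """How many times the measured suspend current exceeds the limit."""
    return measured_amps / limit_amps


def standby_hours(battery_wh: float, volts: float, amps: float) -> float:
    """Idealized time until a battery is drained by the suspended device alone."""
    return battery_wh / (volts * amps)


if __name__ == "__main__":
    print(f"over limit: {times_over_limit(0.07):.0f}x")
    print(f"standby on 11 Wh: {standby_hours(11.0, 5.0, 0.07):.0f} h")
```

In other words, the suspended stick draws roughly 28 times the allowed current and would empty the same 11 Wh battery in a bit over a day without doing any work.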
Teardown
As a bonus, a photo of the disassembled device can be seen below.
On the PCB we can find the “Movidius MA2450” chipset (“Myriad 2 VPU”), which is described in this product brief and in this video https://www.youtube.com/watch?v=hD3RYGJgH4A
Conclusion
This device (chipset) looks very promising for Joker Eco-System use cases. I will do more experiments and trials.
Stay tuned for more!
Thanks for the teardown picture – good to see before I get the Dremel on mine. It baffles me why Intel chose to make such an enormous enclosure without any obvious heatsinking benefits (the sides aren’t connected to the centre via fins, for example).