Machine vision systems have many applications, including self-driving cars, intelligent manufacturing, robotic surgery and biomedical imaging. Most of these systems use lens-based cameras: an image or video is first captured, typically with a few megapixels per frame, and a digital processor then performs machine-learning tasks such as object classification and scene segmentation. This traditional machine vision architecture suffers from several drawbacks. First, the large amount of digital information makes it hard to achieve high-speed image/video analysis, especially on mobile and battery-powered devices. In addition, the captured images usually contain redundant information, which imposes a high computational burden on the digital processor and creates inefficiencies in power and memory requirements. Moreover, beyond the visible wavelengths of light, fabricating high-pixel-count image sensors, such as what we have in our m...