
 

This page is under construction

The game played in artificial vision is to sense one or more visual quantities and to infer others we are interested in. I like to think of all the visual quantities we want to infer as being connected in a complex network in which they are all interdependent. The graph you see below is a very small and crude example of this network: each node is a "visual quantity" and each multi-edge is a "model" (from optics, electronics, mechanics...) that relates nodes together.

[Figure: a small example of this network, with visual quantities as nodes and models as multi-edges.]
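To make the idea a little more concrete, here is a minimal sketch in Python of such a network. It is purely illustrative and not code from any of my projects: the quantity and model names are assumptions chosen for the example.

```python
# Toy sketch of the network of visual quantities.
# Nodes are quantities; each "model" is a multi-edge relating several of them.
# All quantity and model names below are illustrative assumptions.

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Model:
    name: str          # e.g. a model from optics, electronics or mechanics
    quantities: tuple  # the visual quantities this model relates


@dataclass
class VisionGraph:
    quantities: set = field(default_factory=set)
    models: list = field(default_factory=list)

    def relate(self, name, *quantities):
        """Add a model (a multi-edge) connecting the given quantities."""
        self.quantities.update(quantities)
        self.models.append(Model(name, tuple(quantities)))


g = VisionGraph()
g.relate("brightness constancy", "illuminance", "optic flow")
g.relate("projective geometry", "scene depth", "camera motion", "optic flow")
g.relate("temporal derivative", "illuminance", "d(illuminance)/dt")
```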

In this network, knowledge of any one of these quantities (for instance because we sense it with one of our devices, or because we have inferred it somehow) should inform the inference of the others it connects to, directly, but also indirectly, via a possibly complex chain of dependencies.
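As a toy illustration of that chain of dependencies (continuing the sketch above; this is a simple graph traversal, not an inference algorithm), one can walk the multi-edges from an observed quantity and list everything it can inform, directly or indirectly:

```python
def informed_by(graph, observed):
    """Quantities that an observed quantity can inform, directly or
    indirectly, by following the models (multi-edges) of the graph."""
    known, frontier = {observed}, [observed]
    while frontier:
        q = frontier.pop()
        for model in graph.models:
            if q in model.quantities:
                for other in model.quantities:
                    if other not in known:
                        known.add(other)
                        frontier.append(other)
    return known - {observed}
```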

 

With this perspective in mind, I have been trying to build models that infer quantities jointly, moving away from the more traditional view that artificial vision problems are solved by pipelines in which processing blocks/algorithms are chained to transform inputs into outputs. Besides arguing for how much joint inference should help, this perspective also legitimizes the design and use of novel vision sensors (if that needed legitimizing). While a lot of computer vision has used light intensity/correlates of illuminance as the "starting point of inference" (i.e. as the node that is observed), this need not be the case! Using new sensors that give us new "starting points" for inference in this graph (e.g. a DVS, which gives a correlate of the temporal derivative of illuminance, an optic flow sensor, or a polarized imager) might, depending on the application, be much more appropriate!
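In the toy code above, changing the "starting point of inference" is simply a matter of changing which node is observed; which sensor makes the better entry point then becomes an application-dependent question:

```python
# Different sensors observe different nodes of the same toy graph.
print(informed_by(g, "illuminance"))        # conventional intensity imager
print(informed_by(g, "d(illuminance)/dt"))  # e.g. a DVS
print(informed_by(g, "optic flow"))         # e.g. an optic flow sensor
```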

Soon, I'll try to show here how the projects I am interested in fit into this view: you'll be able to click on a node or an edge and get some information about the projects I have carried out in that direction!