Google Uses Machine Learning to Take Better Pictures in Portrait Mode


Fully [H]
Apr 10, 2003
The Google AI Blog has an article explaining how the Pixel 3 uses machine learning to predict depth in Portrait Mode. Portrait Mode creates a depth-dependent blur of the background to draw attention to the subject. Using TensorFlow, Google's software engineers trained a convolutional neural network to take PDAF (phase-detection autofocus) pixels as input and learn to predict depth. This improved ML-based depth estimation is what powers Portrait Mode on the Pixel 3.

In order to train the network, we need lots of PDAF images and corresponding high-quality depth maps. And since we want our predicted depth to be useful for Portrait Mode, we also need the training data to be similar to pictures that users take with their smartphones. To accomplish this, we built our own custom "Frankenphone" rig that contains five Pixel 3 phones, along with a Wi-Fi-based solution that allowed us to simultaneously capture pictures from all of the phones (within a tolerance of ~2 milliseconds). With this rig, we computed high-quality depth from photos by using structure from motion and multi-view stereo.
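The multi-view geometry behind that ground truth boils down to triangulation: a point's apparent shift (disparity) between two cameras is inversely proportional to its depth. A minimal sketch of that relation for the simplified rectified two-camera case (the real Frankenphone pipeline uses full structure from motion and multi-view stereo across five phones; the parameter names here are assumptions):

```python
import numpy as np

def depth_from_disparity(disparity_px, focal_px, baseline_m):
    # Classic two-view stereo relation: depth = focal * baseline / disparity.
    # Larger disparity (more shift between views) means a closer point;
    # zero disparity maps to infinite depth.
    disparity_px = np.asarray(disparity_px, dtype=float)
    with np.errstate(divide="ignore"):
        return np.where(disparity_px > 0,
                        focal_px * baseline_m / np.maximum(disparity_px, 1e-12),
                        np.inf)

# A point shifted 10 px between two cameras 0.1 m apart (focal length 1000 px)
# sits 10 m away: 1000 * 0.1 / 10.
print(depth_from_disparity(10, 1000, 0.1))
```

The same principle at a much smaller baseline is why the PDAF pixels carry any depth signal at all: the two half-aperture views are effectively a tiny stereo pair.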
What was once accomplished with good glass and skill is now performed by specially formulated sand sifted in the cloud. Progress!