Let’s take a look at how the TrueDepth camera and sensors work. TrueDepth starts with a traditional 7MP front-facing “selfie” camera. It adds an infrared emitter that projects over 30,000 dots in a known pattern onto the user’s face. Those dots are then photographed by a dedicated infrared camera for analysis.
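Apps never see the raw dot pattern, but the depth maps the TrueDepth system produces are exposed to developers through AVFoundation. Here is a minimal sketch of pulling those per-frame depth maps, assuming the iOS 11-era capture APIs; the class and queue names are illustrative, not anything Apple prescribes.

```swift
import AVFoundation

// Minimal sketch: stream the TrueDepth camera's per-frame depth map.
// Assumes the iOS 11-era capture APIs; class and queue names are illustrative.
final class TrueDepthStreamSketch: NSObject, AVCaptureDepthDataOutputDelegate {
    private let session = AVCaptureSession()
    private let depthOutput = AVCaptureDepthDataOutput()

    func start() throws {
        // The front TrueDepth module shows up as its own capture device type.
        guard let device = AVCaptureDevice.default(.builtInTrueDepthCamera,
                                                   for: .video,
                                                   position: .front) else { return }
        let input = try AVCaptureDeviceInput(device: device)

        session.beginConfiguration()
        if session.canAddInput(input) { session.addInput(input) }
        if session.canAddOutput(depthOutput) { session.addOutput(depthOutput) }
        depthOutput.setDelegate(self, callbackQueue: DispatchQueue(label: "depth.queue"))
        session.commitConfiguration()
        session.startRunning()
    }

    // Called once per depth frame; the raw IR dot pattern itself is never exposed.
    func depthDataOutput(_ output: AVCaptureDepthDataOutput,
                         didOutput depthData: AVDepthData,
                         timestamp: CMTime,
                         connection: AVCaptureConnection) {
        let map = depthData.depthDataMap   // CVPixelBuffer of depth (or disparity) values
        print("Depth frame: \(CVPixelBufferGetWidth(map)) x \(CVPixelBufferGetHeight(map))")
    }
}
```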
There is a proximity sensor, presumably so that the system knows when a user is close enough to activate. An ambient light sensor helps the system set output light levels. Apple also calls out a Flood Illuminator, which bathes the face in invisible infrared light so the system can work even in the dark.

While depth estimation using two or more standard cameras gets better every year, and is already good enough for some great special effects in dual-camera phones (including the Plus models of recent iPhones), it is far from perfect.
In particular, when those systems are used to perform facial recognition, they have been criticized for being too easy to fool. Since Apple is relying on Face ID for unlocking the iPhone X and activating Apple Pay, it needs to do a lot better. It has created a more sophisticated system that uses structured light. Its depth estimation works by having an IR emitter send out 30,000 dots arranged in a regular pattern. They’re invisible to people, but not to the IR camera, which reads how the pattern deforms as it reflects off surfaces at various depths.
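Apple hasn’t published how its pipeline turns those dot shifts into depth, but structured-light systems of this kind generally reduce to simple triangulation: each dot’s displacement from where it would land on a flat reference plane encodes how far the surface is from that plane. A toy sketch with made-up calibration numbers (the geometry, names, and values are illustrative, not Apple’s):

```swift
import Foundation

// Toy structured-light depth model with made-up calibration values.
// A dot projected onto a flat reference plane at depth zRef lands at a known
// pixel; on a nearer or farther surface it shifts along the baseline axis.
// For an emitter offset by baseline b from the camera, the shift d relates to
// depth by 1/z = 1/zRef + d / (f * b). The sign convention assumes the emitter
// sits on the camera's +x side; real geometry and calibration will differ.

struct DotObservation {
    let expectedX: Double   // pixel where the dot lands at the reference depth
    let observedX: Double   // pixel where the dot actually landed
}

func estimateDepth(_ dot: DotObservation,
                   focalLengthPixels f: Double,
                   baselineMeters b: Double,
                   referenceDepthMeters zRef: Double) -> Double {
    let d = dot.observedX - dot.expectedX          // disparity vs. the reference plane
    let inverseDepth = 1.0 / zRef + d / (f * b)
    return 1.0 / inverseDepth
}

// Hypothetical numbers: a dot shifted 6.5 px reads as a surface about 0.39 m away.
let z = estimateDepth(DotObservation(expectedX: 412.0, observedX: 418.5),
                      focalLengthPixels: 600.0,
                      baselineMeters: 0.02,
                      referenceDepthMeters: 0.5)
print(String(format: "Estimated depth: %.2f m", z))
```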
This is the same type of system used by the original version of Microsoft’s Kinect, which was widely praised for its accuracy at the time. That shouldn’t be a surprise, since Apple acquired PrimeSense, which developed the structured light sensor for the original Kinect, in 2013. This type of system works well, but has typically required large, powerful emitters and sensors. It has been more suitable for the always-on Kinect, or for laptops, than for a battery-powered iPhone with a tiny area for sensors.
Face ID vs. Intel RealSense

Apple appears to have delivered on the mobile device promise Intel continues to make about its RealSense depth-aware cameras. Intel has shown them off on stage built into prototype mobile devices, but the units that have reached the market are still too large and power-hungry to find their way into a phone.
While RealSense also has an IR emitter, it uses it to paint the entire scene, and then relies on stereo disparity captured by two IR cameras to calculate depth. The result is a laptop module accurate enough to power facial recognition for Windows Hello and to handle gesture recognition. I’m sure Apple’s TrueDepth camera will give Intel even more impetus to build a version of RealSense for phones.
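Where structured light decodes a known dot pattern, a stereo system like RealSense first has to find the same point in both IR images before it can triangulate. A toy sketch of that matching step, with hypothetical values; this is the textbook block-matching approach, not Intel’s actual pipeline:

```swift
import Foundation

// Toy sketch of the two-IR-camera (stereo) step, with hypothetical values.
// Find the horizontal shift (disparity) that best aligns a small patch from
// the left image with the same row of the right image, then convert disparity
// to depth with z = f * b / d.

func bestDisparity(leftRow: [Double], rightRow: [Double],
                   patchStart: Int, patchWidth: Int, maxDisparity: Int) -> Int {
    let patch = Array(leftRow[patchStart..<patchStart + patchWidth])
    var best = 0
    var bestCost = Double.greatestFiniteMagnitude
    for d in 0...maxDisparity {
        let start = patchStart - d
        guard start >= 0 else { break }
        let candidate = Array(rightRow[start..<start + patchWidth])
        // Sum of absolute differences: lower means a better match.
        let cost = zip(patch, candidate).reduce(0.0) { $0 + abs($1.0 - $1.1) }
        if cost < bestCost { bestCost = cost; best = d }
    }
    return best
}

// Disparity (pixels) to depth (meters) for a hypothetical calibration.
func depth(fromDisparityPixels d: Int,
           focalLengthPixels f: Double,
           baselineMeters b: Double) -> Double {
    return d > 0 ? f * b / Double(d) : .infinity
}
```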
Intel’s recent purchase of vision processing chip startup Movidius will definitely help. Movidius has already been tapped by industry leaders like Google to provide low-power vision and generalized AI processing for mobile devices, and could certainly replace the custom processor in the RealSense modules over time.

Getting a depth estimate for portions of a scene is only the beginning of what’s required for Apple’s implementation of secure facial recognition and Animoji. For example, a mask could be used to hack a facial recognition system that relied solely on the shape of the face.
So Apple is using processing power to learn and recognize 50 different facial motions that are much harder to forge. They also provide the basis for making Animoji figures seem to mimic the phone’s owner.
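Apple doesn’t document the internal Face ID model, but the same face-tracking output surfaces to developers through ARKit as roughly 50 “blend shape” coefficients, one per facial motion, which is what Animoji-style characters are driven by. A minimal sketch assuming the ARKit face-tracking API; the class name and the handful of coefficients read here are just examples:

```swift
import ARKit

// Minimal sketch assuming ARKit face tracking, which exposes the TrueDepth
// face data to apps as ~50 blend-shape coefficients (0.0–1.0 per facial
// motion). Apple's internal Face ID model is not public; the coefficients
// read below are just a few examples.
final class FaceMotionReader: NSObject, ARSessionDelegate {
    private let session = ARSession()

    func start() {
        guard ARFaceTrackingConfiguration.isSupported else { return }
        session.delegate = self
        session.run(ARFaceTrackingConfiguration())
    }

    func session(_ session: ARSession, didUpdate anchors: [ARAnchor]) {
        for case let faceAnchor as ARFaceAnchor in anchors {
            let shapes = faceAnchor.blendShapes
            let jawOpen   = shapes[.jawOpen]?.floatValue ?? 0
            let smile     = shapes[.mouthSmileLeft]?.floatValue ?? 0
            let browRaise = shapes[.browInnerUp]?.floatValue ?? 0
            // An Animoji-style renderer would drive its character rig from these values.
            print("jawOpen \(jawOpen), smileLeft \(smile), browInnerUp \(browRaise)")
        }
    }
}
```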
How Secure is Face ID?

Early facial recognition systems got a bad name, as they could be fooled with simple photographs. Even second-generation systems that added motion detection could be fooled by videos. Modern versions like Windows Hello go beyond that by building and recognizing 3D models of the user’s face. They can also rely on some properties of light and skin to ensure that whatever they are looking at is skin-like.

