2D versus 3D Cameras for Occupant Monitoring – Insights from Prof. Jürgen Beyerer, Karlsruhe Institute of Technology

An optimal representation of vehicle interiors and occupants should describe all relevant aspects and their relationships.

Question: Prof. Beyerer, what is the benefit of using optical sensors in the vehicle interior?

Prof. Beyerer: Optical sensors for different wavelengths provide information about depth, surface reflectance and temperature. This information is unique and not available in cars without cameras. 3D methods such as time-of-flight (ToF), stereo systems or multi-camera systems with triangulation capture the shape and geometry of the scene. 2D methods, with or without active illumination (e.g. RGB, NIR), enable deductions about the scene's surface reflectance. And thermal imaging sensors (FIR) measure surface temperatures. Such information can be helpful for a large variety of safety and comfort functions in future private, public and commercial vehicles.

Question: How do you assess the estimation of depth data by monocular cameras?

Prof. Beyerer: Cameras in stereo and multi-camera systems can measure depth accurately and reliably based on triangulation, a geometric principle. Stereo systems, similar to human binocular vision, perform passive triangulation to create a depth map of the scene. With more cameras, triangulation achieves even higher depth accuracy and greater robustness to occlusion.
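
The core of passive stereo triangulation can be sketched in a few lines. For a rectified stereo pair, depth follows from the focal length, the baseline between the two cameras and the disparity (the pixel shift of the same point between the two images). The numbers below are purely illustrative, not taken from the interview:

```python
# Minimal sketch of passive stereo triangulation: depth from disparity.
# All values (focal length, baseline, disparity) are illustrative examples.

def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth Z = f * B / d for a rectified stereo pair.

    focal_px     -- focal length in pixels
    baseline_m   -- distance between the two camera centers in meters
    disparity_px -- horizontal pixel shift of the same point between the images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the rig")
    return focal_px * baseline_m / disparity_px

# Example: 800 px focal length, 10 cm baseline, 40 px disparity -> 2.0 m depth
z = depth_from_disparity(800.0, 0.10, 40.0)
print(f"depth = {z:.2f} m")
```

The formula also shows why a longer baseline or higher image resolution improves depth accuracy: both increase the disparity measured for the same distance.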

Time-of-flight cameras are based on measuring the travel time of light to an object and back to the sensor. Knowing the speed of light, the distance to the object can be calculated. Because the time differences involved are extremely short, the spatial measurement uncertainty is greater than that of triangulation-based measurements, but it is still small enough for vehicle-interior monitoring tasks.
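
The ranging principle itself is simple; a minimal sketch makes clear why the timing must be so precise (the function and values below are illustrative, not a real ToF sensor API):

```python
# Minimal sketch of time-of-flight ranging: distance from round-trip travel time.

C = 299_792_458.0  # speed of light in m/s

def tof_distance(round_trip_s: float) -> float:
    """Light travels to the object and back, so the distance is half the path length."""
    return C * round_trip_s / 2.0

# A target 1 m away produces a round trip of only about 6.67 nanoseconds,
# which is why even tiny timing errors translate into centimeters of depth error.
t = 2.0 / C  # round-trip time for a 1 m target
print(tof_distance(t))
```

In practice, ToF cameras do not time individual pulses this way but typically measure the phase shift of modulated light; the distance relation above still underlies the measurement.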

Both principles, triangulation as well as time-of-flight, measure depths almost directly.

Pure 2D images from a single monocular camera, however, cannot measure depth directly. There are nonetheless methods to derive depth information from a single 2D image. These are indirect estimations that require extra computing capacity and rely on cues similar to those humans use to estimate depth with one eye – relatively accurate, but prone to errors. Known object sizes in the vehicle interior allow depth to be inferred from apparent dimensions, and structure from motion combines stereoscopy and motion analysis principles. Moreover, end-to-end neural networks and hybrid approaches estimate depth from 2D images. All these methods involve indirect derivations and estimations, with associated error susceptibility, and their adequacy depends on the application, the required measurement quality and the available computing capacity.
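
One of the monocular cues mentioned above, inferring depth from a known real-world object size, can be sketched with the pinhole camera model. The object, its assumed size and the camera parameters below are hypothetical examples:

```python
# Illustrative sketch of a monocular depth cue: depth from a known object size.
# Pinhole model: image_height_px = focal_px * real_height_m / Z
# All numbers are hypothetical examples, not measured in-cabin data.

def depth_from_known_size(focal_px: float, real_height_m: float, image_height_px: float) -> float:
    """Estimate depth Z of an object whose real-world size is known."""
    return focal_px * real_height_m / image_height_px

# A headrest assumed to be 0.25 m tall that spans 200 px in an image taken
# with an 800 px focal length would be estimated at 1.0 m distance.
print(depth_from_known_size(800.0, 0.25, 200.0))
```

The sketch also shows the error susceptibility the interview points out: any error in the assumed object size propagates proportionally into the depth estimate.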

For reliable depth measures, an adequate 3D sensing setup, based on triangulation or time-of-flight, is in most cases the first choice.

Question: What role will camera arrays play in the future?

Prof. Beyerer: Camera arrays will become increasingly important. Depending on the setup, they can cover a larger field of view than single cameras. And if they cover the same area, they can create 3D data via triangulation, which is crucial for applications that need object distances or absolute 3D positions. Camera arrays can also comprise different sensor types (e.g. NIR + ToF, NIR + thermal), combining their strengths, alleviating their weaknesses and merging information from various sources into a more accurate representation of reality. Combining the advantages of different optical sensors is a key focus.
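
Recovering an absolute 3D position from two overlapping cameras can be sketched with linear (DLT) triangulation. The camera matrices and the point below are synthetic examples constructed for illustration:

```python
import numpy as np

# Hedged sketch: linear (DLT) triangulation of one 3D point from two calibrated
# cameras -- the principle a camera array uses to recover absolute 3D positions.
# The intrinsics, baseline and test point are synthetic illustrative values.

def triangulate(P1, P2, uv1, uv2):
    """Solve the stacked linear system u x (P X) = 0 from both views via SVD."""
    A = np.array([
        uv1[0] * P1[2] - P1[0],
        uv1[1] * P1[2] - P1[1],
        uv2[0] * P2[2] - P2[0],
        uv2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenize

# Two identical cameras; the second is shifted 0.1 m along x (a small baseline).
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.1], [0.0], [0.0]])])

X_true = np.array([0.05, -0.02, 1.5, 1.0])  # a point 1.5 m in front of the rig
uv1 = (P1 @ X_true)[:2] / (P1 @ X_true)[2]  # project it into each view
uv2 = (P2 @ X_true)[:2] / (P2 @ X_true)[2]
print(triangulate(P1, P2, uv1, uv2))  # recovers approximately [0.05, -0.02, 1.5]
```

With more than two cameras, additional row pairs are stacked into the same system, which is what makes larger arrays both more accurate and more robust to occlusion.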

Question: What is the future of visual representation in vehicle interiors?

Prof. Beyerer: An optimal representation of vehicle interiors and occupants should describe all relevant aspects and their relationships. An object-oriented world model of the vehicle interior, with abstract representations of humans, can meet this requirement. Such a model inevitably has gaps due to sensor limitations and abstraction, so the necessary features, their temporal and spatial resolution, and the required capture quality must be specified. Scientific institutions like Fraunhofer can contribute models and architectures to this endeavour. Such models also enable simulation and prediction of changes in the interior, which is critical for applications like airbag control that must predict body movements during a crash. The goal is to develop a cyber-physical model of the vehicle interior that can answer inquiries about past, present and future states.

Fraunhofer IOSB researchers are also working with the latest neural network-based methods to answer these questions. Large visual foundation models are pivotal here: current AI models are being tested and tuned for interior applications at Fraunhofer IOSB, combining established measuring methods with generative, transformer-driven AI capabilities.

Question: What contribution does Fraunhofer IOSB offer to the supply chain of interior monitoring systems?

Prof. Beyerer: As an institution for applied research, Fraunhofer IOSB focuses on systems that can reach production readiness in vehicles within 3–5 years. We concentrate on technologies that are ready for application – or close to it. Our contribution includes testing and developing methods, implementing proofs of concept, and sharing knowledge through publications, consulting and development work with clients.

Our research rests on a long-term foundation, and we are a reliable partner for our clients. Long-term research projects enable us to dive deep into technologies, and our excellent laboratories are always up to date. A Level 3 automated Mercedes EQS data-collection vehicle for public roads, a driving simulator with a mid-sized Audi A3 chassis, and a portable interior monitoring environment are equipped with a variety of cameras and sensors. This comes with a still-growing database of in-cabin monitoring data, sufficient computing power and the Fraunhofer IOSB Advanced Occupant Monitoring System for research and demonstration.

Thank you, Prof. Beyerer, for this insightful interview!

About Prof. Beyerer

Jürgen Beyerer is a professor at the Faculty of Computer Science at the Karlsruhe Institute of Technology, where he heads the Vision and Fusion Laboratory. He is also managing director of the Fraunhofer Institute for Optronics, System Technologies, and Image Exploitation IOSB. He teaches and publishes in the field of computer vision, received his PhD in engineering with a thesis on image processing, and habilitated in measurement technology. His research interests cover automated visual inspection, image processing and advanced signal processing, environment modeling for intelligent technical systems, and human-machine interaction. Prof. Dr. Beyerer supervises research in the field of visual perceptual user interfaces and driver assistance systems and consults academic and industry scientists on computer vision, measurement technology, environment representation and sensor fusion.

More Information

Fraunhofer IOSB is a key player in advancing research on occupant monitoring systems. We focus on recognizing all passengers within the car and classifying up to 35 different activities in real time. Read more about IOSB’s Advanced Occupant Monitoring System here if you want to find out how you can benefit from it today!