Two studies give driverless cars more ‘human’ vision
Autonomous vehicles depend on a sense as crucial to them as vision is to human beings. It is not just about the machine being able to see, which it already does, but about looking, analyzing, discriminating and acting in milliseconds. The challenge is to match this human ability within the time available to make the necessary decision. For a machine, seeing a tree next to the road is easy; the hard part is knowing that it is not an object that will move into its path, and recognizing the opposite when it is a pedestrian. The scientific journal Nature publishes this Wednesday two advances in this direction: a processor that responds quickly to an event with minimal information, and a new algorithm that improves the precision of machine vision with lower latency (response time).
These lines of research, fundamental to the development of autonomous vehicles and robotics, already have advanced counterparts at the Institute of Microelectronics of Seville (Imse), a joint center of the Spanish National Research Council (CSIC) and the University of Seville. Multinationals such as Samsung and Sony already use patents commercialized by the company Prophesee.
The two papers published by Nature are innovations built on foveation, the human mechanism that maximizes resolution in the area where vision is focused, while the loss of detail in the peripheral areas is not relevant. In this way, the amount of information is reduced, but the capacity to visually recognize the data essential for decision-making in milliseconds is preserved.
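As a rough illustration of that idea, here is a minimal Python sketch of foveated sampling: full detail is kept near a fixation point, and pixels are averaged more and more coarsely toward the periphery. The function and its parameters are hypothetical, chosen only to show how the amount of information shrinks.

```python
import numpy as np

# Toy foveation: keep full resolution near a fixation point (cy, cx) and
# replace peripheral pixels with block averages that grow coarser with
# distance, shrinking the information that must be processed downstream.
def foveate(img: np.ndarray, cy: int, cx: int, radius: int) -> np.ndarray:
    out = img.astype(float).copy()
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            d = max(abs(y - cy), abs(x - cx))    # distance from fixation
            if d > radius:                       # peripheral pixel
                block = 2 ** min(d - radius, 3)  # coarser with distance
                y0, x0 = (y // block) * block, (x // block) * block
                out[y, x] = img[y0:y0 + block, x0:x0 + block].mean()
    return out

img = np.arange(64.0).reshape(8, 8)   # stand-in for a camera frame
print(foveate(img, cy=4, cx=4, radius=2).round(1))
```

In this toy, the central 5×5 region keeps every pixel while the periphery collapses into shared block averages, so far fewer distinct values need to be processed.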
The key is accurate interpretation of the scene and rapid motion detection to allow immediate reactions. Conventional cameras can capture a frame and reproduce it at very high resolution, but all that information has to be processed and filtered, which takes time and resources, incompatible with the instantaneous decisions required by autonomous driving or advanced robotics.
One of the advances is signed by Daniel Gehrig, researcher at the University of Pennsylvania (USA), and Davide Scaramuzza, professor of robotics at the University of Zurich (Switzerland). Both have addressed the difficulty of making decisions from high-resolution color images: these need a large bandwidth to be processed with the necessary fluidity, and reducing that bandwidth comes at the cost of greater latency, that is, more time to respond. The alternative is to use an event camera, which processes continuous streams of pulses, but sacrifices precision.
To address these limitations, the authors have developed a hybrid system that achieves effective object detection with minimal latency. The algorithm combines information from two cameras: a conventional color camera running at a reduced frame rate to cut the bandwidth required, and an event camera that compensates for the resulting latency, ensuring that fast-moving objects, such as pedestrians and cars, can still be detected. “The results pave the way toward efficient and accurate object detection, especially in extreme scenarios,” the researchers say.
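The following toy Python sketch conveys that division of labor under stated assumptions; the helpers, numbers and one-dimensional world are hypothetical stand-ins, not the published method. A slow color camera yields accurate but infrequent detections, and the event stream updates the object's position between frames.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    x: float      # object position in a toy 1-D world (metres)
    t_ms: float   # timestamp of the last frame-based measurement

def detect_from_frame(true_x: float, t_ms: float) -> Detection:
    """Stand-in for an accurate but slow frame-based detector."""
    return Detection(x=true_x, t_ms=t_ms)

def velocity_from_events(events: list) -> float:
    """Stand-in for motion estimated from the event stream (m per ms)."""
    if len(events) < 2:
        return 0.0
    (t0, x0), (t1, x1) = events[0], events[-1]
    return (x1 - x0) / (t1 - t0)

FRAME_PERIOD_MS = 200          # the color camera runs at only 5 fps
SPEED_M_PER_MS = 0.002         # simulated pedestrian moving at 2 m/s

detection = detect_from_frame(0.0, 0.0)
events = []

for t in range(0, 401, 10):    # event measurements arrive every 10 ms
    true_x = SPEED_M_PER_MS * t
    events.append((t, true_x))                    # event camera sees motion
    if t % FRAME_PERIOD_MS == 0:
        detection = detect_from_frame(true_x, t)  # slow, accurate update
        events = [(t, true_x)]
    # Between frames, propagate the last detection with event-based velocity.
    hybrid_x = detection.x + velocity_from_events(events) * (t - detection.t_ms)
    print(f"t={t:3d} ms  frame-only={detection.x:.2f}  "
          f"hybrid={hybrid_x:.2f}  true={true_x:.2f}")
```

In this toy, the frame-only estimate goes stale for up to 200 ms, while the hybrid estimate tracks the moving object between the sparse frames, which is the effect the authors exploit to keep bandwidth low without losing fast-moving objects.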
“It’s a breakthrough. Current driver-assistance systems such as those from Mobileye, which are built into more than 140 million cars worldwide, work with standard cameras that take 30 frames per second, or one image every 33 milliseconds. In addition, a minimum of three frames is required to reliably detect a pedestrian or car. This brings the total time to initiate the braking maneuver to 100 milliseconds. Our system allows us to reduce this time to below one millisecond without the need to use a high-speed camera, which would entail an enormous computational cost,” explains Scaramuzza.
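The arithmetic behind those figures is easy to verify:

```python
# Latency budget of a conventional 30 fps driver-assistance camera,
# using the figures Scaramuzza quotes above.
FPS = 30
frame_period_ms = 1000 / FPS       # ~33.3 ms per image
frames_needed = 3                  # minimum frames to detect reliably
print(f"{frame_period_ms:.1f} ms per frame, "
      f"~{frames_needed * frame_period_ms:.0f} ms before braking can start")
# -> 33.3 ms per frame, ~100 ms before braking can start
```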
The technology has been “transferred to a top-level company,” according to the researcher. “If it is approved, it can typically take many years to go from proof of concept to real impact and final implementation,” he adds.
For his part, Luping Shi, director of the Center for Brain-Inspired Computing Research (CBICR) at Tsinghua University (China), has developed the Tianmouc chip with his team. Inspired by the way the human visual system works, it combines fast but imprecise perception, like that of human peripheral vision, with higher-resolution perception that is slower to process.
In this way, the chip also functions as an event camera, which instead of full frames processes continuous flows of electrical impulses (events or spikes) recorded by each photosensor when it detects a sufficient change in light. “Tianmouc has an array of hybrid pixels: some with low precision but fast, event-based detection, to allow quick responses to changes without the need for too much detail, and others with slow processing to produce an accurate visualization of the scene,” explains the researcher. The chip has been tested in scenarios such as a dark tunnel suddenly illuminated by a dazzling light, or a road crossed by a pedestrian.
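A minimal Python sketch of how an event pixel behaves, assuming a simple logarithmic-change threshold; this is the generic event-camera principle, illustrative only and not the actual Tianmouc circuit:

```python
import math

# Toy event pixel: it emits a signed "spike" whenever the log-intensity it
# sees changes by more than a threshold since its last event, and stays
# silent while the scene is static.
THRESHOLD = 0.2   # log-intensity change needed to trigger an event

def event_stream(samples):
    """Yield (time, +1/-1) events from a sequence of (time, intensity)."""
    last_log = math.log(samples[0][1])
    for t, intensity in samples[1:]:
        level = math.log(intensity)
        while abs(level - last_log) >= THRESHOLD:
            polarity = 1 if level > last_log else -1
            yield t, polarity
            last_log += polarity * THRESHOLD

# A dark tunnel suddenly lit by a dazzling light: intensity jumps 100x.
samples = [(0, 1.0), (1, 1.0), (2, 100.0), (3, 100.0)]
for t, pol in event_stream(samples):
    print(f"t={t}: event {'+' if pol > 0 else '-'}")
# Prints a burst of '+' events at t=2 and nothing while the scene is static.
```

The burst at the moment of change, and the silence elsewhere, is what lets such pixels respond in microseconds while producing very little data.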
Bernabé Linares, research professor at Imse and head of the team behind the highest-resolution commercial event camera, notes that Scaramuzza’s group uses drones to collect images both conventionally and with event cameras. “The advance is the algorithm used to recognize objects, and the result is interesting,” he says.
Linares, who works mainly with processors, points out that algorithmic developments such as those at the University of Zurich are an essential complement to the chips, especially for robotic applications: being very compact technologies, they require lightweight computing systems that consume little energy. “For drones it is an important development. This type of event camera is very good for them,” he says.
Luping Shi’s work is closer to the developments of the Imse Neuromorphic Systems Group. In this case it is a hybrid processor. “The pixels alternate and spatial differences are calculated. It stores the light from one image to the next and computes the change; if there is no modification, the difference is zero. Data leaves the sensor very infrequently, which makes it quite easy to handle,” explains Linares.
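A minimal sketch of the temporal-difference idea Linares describes, in Python with NumPy; the array sizes and values are illustrative assumptions, not the Tianmouc implementation:

```python
import numpy as np

# Temporal-difference readout: each pixel stores the light from the
# previous image and only the pixels whose value changed need to be sent,
# so a mostly static scene produces almost no output data.
rng = np.random.default_rng(0)

prev = rng.integers(0, 256, size=(4, 4))   # stored light from last image
curr = prev.copy()
curr[1, 2] += 40                           # a single pixel changes

diff = curr - prev                         # per-pixel change
changed = np.argwhere(diff != 0)           # only these need to be sent

print(f"{len(changed)} of {diff.size} pixels changed")   # -> 1 of 16
for y, x in changed:
    print(f"pixel ({y},{x}) changed by {diff[y, x]}")
```

When nothing moves, the difference is zero everywhere and the sensor transmits essentially nothing, which is why Linares describes it as carrying data very infrequently.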
Although the uses highlighted by Nature are aimed at autonomous driving, these advances in vision are highly relevant to robotics, which also requires the ability to discriminate information in order to make decisions at high speed. This is the case in industrial automation processes. “But car manufacturers are very interested because they look for all kinds of developments, since this is safer and they can get the best out of each technology,” explains Linares, who notes that Renault is one of Prophesee’s investors.