Voice Control

Voice and Gesture Control for the Digitized Industry


In industry, new forms of interaction with technical devices and user interfaces through gestures and speech have great potential for making processes more efficient and intuitive. Speech assistants can support and simplify existing processes, for example in visual inspection during quality assurance. At the Machine Learning Research Center, experts from the Fraunhofer CCIT are working on combining speech recognition and gesture control in a multimodal speech assistant that can be used to identify and mark defects. To demonstrate the technology, the research team extended a surface-inspection system developed by Fraunhofer IOSB, supplementing it with a speech dialog system from Fraunhofer IAIS for voice control and a microphone device from Fraunhofer IIS for targeted speech recording in loud environments.

Quality assurance is an important part of today's production processes. For surface inspection, automatic visual inspection is often complex, laborious and cost-intensive, and cannot be carried out in real time for the entire surface. Corners and angles, for example, still cannot be inspected fully automatically today. Many companies therefore continue to rely on checks by experienced personnel, who feel and visually inspect the surfaces and document defects for reworking and for statistical purposes.

Often, only a few seconds per component remain for quality assurance. Since the documentation workload can be high depending on the software used, this results in inaccurate or even incomplete documentation of existing defects. Defect accumulations then cannot be recognized and rectified early on, which increases both the probability of an undetected defect and the rectification costs. Furthermore, imprecise documentation makes it harder to locate defects during rectification and hampers rapid and complete repairs.

Multimodal dialog assistant

The multimodal dialog assistant (MuDA), developed by Fraunhofer IOSB and modified in cooperation with Fraunhofer IAIS, allows complete digital defect documentation during production, from marking to repair, in a very short time. Users can choose between intuitive pointing gestures and a laser pointer as input methods to mark defective areas on a component quickly, precisely and intuitively. The pointing gestures, and thus the location of the defect, are recorded by a camera unit mounted above the component being inspected. A projector shows the operator the documented defect directly via a marking projected onto the component. Metadata, such as the type of a marked defect, is entered via a projected menu or via the speech dialog system. In the rectification phase, the defect marking can be precisely reproduced at another station and projected onto the component, so that the defect is quickly found again during rectification. Completion of the repair can also be confirmed directly on the component with little effort and noted in the system.
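To make the workflow concrete, the sketch below models one pass through it: a defect is first recorded at a gesture-derived position, then classified via a (here drastically simplified) speech step. All class, function, and field names are assumptions for illustration only and do not reflect the actual MuDA implementation.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical data model: one record per marked defect, combining the
# gesture-derived position with speech-derived metadata.
@dataclass
class DefectRecord:
    component_id: str
    position_mm: tuple          # (x, y) on the component surface, from the camera unit
    defect_type: str = "unknown"  # later filled in via speech dialog or projected menu
    repaired: bool = False
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

def mark_defect(log: list, component_id: str, position_mm: tuple) -> DefectRecord:
    """Record a defect at the pointed-at position; metadata is added afterwards."""
    record = DefectRecord(component_id, position_mm)
    log.append(record)
    return record

def annotate_by_speech(record: DefectRecord, transcript: str) -> None:
    """Toy stand-in for the speech dialog system: keyword-match the defect type."""
    for keyword in ("scratch", "dent", "crack"):
        if keyword in transcript.lower():
            record.defect_type = keyword
            break

# Usage: mark a defect by pointing, classify it by voice, confirm the repair later.
log: list = []
rec = mark_defect(log, "door-panel-0042", (123.4, 56.7))
annotate_by_speech(rec, "That is a scratch near the upper edge")
rec.repaired = True  # confirmed directly on the component at the repair station
```

The key design point mirrored from the article is that each record keeps the marking, the metadata, and the repair status together, so the same record can be re-projected at the rectification station.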