Responsive smart speaker designs
Smart speakers that hear better, sound better and react more intuitively
A smart speaker is a speaker and voice command device with an integrated virtual assistant that offers intuitive interaction and hands-free activation by means of a key word, also called a wake word.
Once a novelty in households, smart speakers are increasingly becoming the norm. With this rapid adoption, user expectations are rising, and frustration with devices that fail to hear or understand commands leads to lower usage rates and keeps the segment from growing to its full potential. Components such as MEMS microphones and touch controllers, along with new technologies such as radar, are key to improving the user experience in the smart speaker segment.
Infineon has long-standing expertise in sensor, connectivity, and power solutions that fulfill the consumer market requirements in terms of outstanding performance, reliability, and energy efficiency.
The sections below look at the design considerations throughout the smart speaker architecture.
Infineon 60 GHz radar technology enables a new form of interaction between users and devices. It senses the presence and movement of people with high precision. The chip is very small (5 x 6.5 mm) and has extremely low power consumption. The technology has already been integrated into a smartphone, for the first time ever, to enable gesture control. With products launching onto the market this year, contact us today to learn how radar can make your speaker context-aware and enable true speaker innovation.
WICED Wi-Fi + Bluetooth combos integrate IEEE 802.11a/b/g/n/ac WLAN and Bluetooth in a single-chip solution to enable small-form-factor IoT designs. Combo solutions are available for both 1x1 SISO with PHY data rates of up to 433 Mbps and 2x2 MIMO with PHY data rates of up to 867 Mbps. These solutions can be coupled with external MCUs running an RTOS, or with application processors running Linux, to implement a complete Wi-Fi + Bluetooth system. Our Wi-Fi + Bluetooth combos are supported by Infineon’s WICED Wi-Fi and ModusToolbox Software Development Kits (SDKs), which provide code examples, tools, and development support. We support our Wi-Fi + Bluetooth customers through our global network of IoT partners, who offer production-ready, fully certified Wi-Fi + Bluetooth combo modules for both Linux-based and RTOS-based platforms.
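As a rough illustration of where PHY data rates of 433 Mbps and 867 Mbps can come from, the sketch below applies the standard OFDM rate arithmetic for IEEE 802.11ac. The channel width (80 MHz), modulation (256-QAM, rate 5/6) and short guard interval are assumptions; the text above does not state which configuration the quoted figures refer to.

```python
# Assumed 802.11ac (VHT) parameters; none of these appear in the text above.
DATA_SUBCARRIERS_80MHZ = 234   # data subcarriers in an 80 MHz channel
BITS_PER_SUBCARRIER = 8        # 256-QAM carries 8 bits per subcarrier
CODING_RATE = 5 / 6            # coding rate of the highest VHT MCS
SYMBOL_TIME_US = 3.6           # 3.2 us symbol + 0.4 us short guard interval


def phy_rate_mbps(spatial_streams: int) -> float:
    """Peak PHY data rate in Mbps for the assumed channel and modulation."""
    # Data bits carried by one OFDM symbol on one spatial stream.
    bits_per_symbol = DATA_SUBCARRIERS_80MHZ * BITS_PER_SUBCARRIER * CODING_RATE
    # Bits per microsecond is numerically equal to Mbps.
    return spatial_streams * bits_per_symbol / SYMBOL_TIME_US


if __name__ == "__main__":
    for streams, label in [(1, "1x1 SISO"), (2, "2x2 MIMO")]:
        print(f"{label}: {phy_rate_mbps(streams):.1f} Mbps")
```

Under these assumptions, one spatial stream rounds to 433 Mbps and two streams to 867 Mbps. These are peak PHY rates; application throughput is lower.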
Recommended Wi-Fi + Bluetooth combos for use in smart speakers:
XENSIV™ MEMS microphones set a new benchmark in performance and help overcome current audio chain limitations. A product needs reliable, sensitive microphones that suit the system. For a voice system to work well, good raw data from the microphone must be fed to the voice processing algorithms running on the main system. Infineon helps you integrate the microphones into your design quickly and easily so that you get your product to market fast. Testing schemes can be fulfilled without any delays.
If you want to integrate a voice-controlled user interface (VUI) into a product that previously lacked an audio interface, or implement an appliance design, e.g. in smart speakers, TVs, and home appliances, you know it is not without roadblocks.
Infineon helps you at the start of your concept and during your development process to engineer a successful digital audio assistant design with devices such as XENSIV™ MEMS microphones.
Voice user interfaces need to be designed right from the beginning: it is critical for both consumer acceptance and integration of value-added services down the road. Interaction between a human and a digital assistant must become as intuitive and natural as possible.
Design products and devices that provide users with a best-in-class voice experience. One example is the IM69D130 MEMS mic, which is ideal for applications that require low self-noise (high SNR), low distortion, a wide dynamic range, and a high acoustic overload point.
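The relationship between SNR, self-noise, acoustic overload point (AOP), and dynamic range can be sketched in a few lines. Microphone SNR is conventionally specified against a 94 dB SPL reference tone, so the figures follow from simple subtraction. The example values of 69 dB(A) SNR and 130 dB SPL AOP are assumptions used for illustration and are not stated in the text above; check the device datasheet for the actual ratings.

```python
REFERENCE_SPL_DB = 94.0  # 1 kHz, 1 Pa reference level used for microphone SNR specs


def equivalent_input_noise_db_spl(snr_db: float) -> float:
    """Self-noise expressed as an equivalent acoustic level in dB SPL."""
    return REFERENCE_SPL_DB - snr_db


def dynamic_range_db(aop_db_spl: float, snr_db: float) -> float:
    """Span between the acoustic overload point and the self-noise floor."""
    return aop_db_spl - equivalent_input_noise_db_spl(snr_db)


if __name__ == "__main__":
    # Assumed example ratings (illustrative, not taken from the text above).
    snr_db, aop_db_spl = 69.0, 130.0
    print("Equivalent input noise:", equivalent_input_noise_db_spl(snr_db), "dB SPL")
    print("Dynamic range:", dynamic_range_db(aop_db_spl, snr_db), "dB")
```

A higher SNR lowers the noise floor, which helps with whispered or distant speech, while a higher AOP keeps the signal undistorted when the speaker itself is playing loudly.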
Discover exceptional system benefits
Depending on the precise use case, highly sensitive Infineon MEMS microphones provide a range of system benefits.
- Whisper mode: Even softly spoken or whispered voice commands are picked up in more cases in systems with the Infineon mic. This improves user convenience and maximizes interaction.
- Multi-room and distance scenarios: As users become more at ease with voice technology, they expect more of their devices and seek natural interaction from across the room or in a different room.
- Regular voice: For devices with voice control to add significant convenience to everyday life, the technology must live up to its promise. Infineon mics lay a solid foundation for a great voice user experience.
Save space and costs with Infineon design solutions
Once voice control is used, product design changes too, as buttons on consumer products are reduced or even eliminated.
Our solutions also allow you to reduce the number of microphones in certain applications, saving BOM and space, and reducing the overall system cost.
The PSoC™ 4000 family delivers the industry’s best capacitive-sensing technology, CapSense™, to implement buttons, sliders, and proximity sensors.
We recommend using either the PSoC™ 4000S or the PSoC™ 4100S, depending on how many I/Os and touch buttons are needed. The PSoC™ 4000 family is a cost-optimized, entry-level family of Arm® Cortex®-M0 and M0+ microcontrollers. The PSoC™ 4100 family adds intelligent analog integration through programmable analog blocks. Programmable analog blocks include analog-to-digital converters (ADCs), digital-to-analog converters (DACs), low-power comparators, and operational amplifiers (opamps). The PSoC™ 4100BL includes an integrated Bluetooth Low Energy radio and subsystem.
If you require a metallic overlay for an HMI solution, the PSoC™ 4700S, with its noise-immune inductive-sensing and capacitive-sensing user interface, is the best choice. The PSoC™ 4700 family adds advanced sensing technologies to sense anything, including advanced capacitive sensing, inductive sensing, heart-rate sensing, and more. These advanced sensing technologies leverage specialized analog blocks in the PSoC™ 4 portfolio to deliver innovative solutions for next-generation designs.
In addition, speakers need to be powered in an efficient and space-saving manner. No consumer is happy with a speaker power supply that distracts from the elegantly designed main device or, even worse, with a power supply that destroys the main speaker due to overvoltage. Efficient and robust MOSFETs and drivers are therefore a key requirement, one that can easily be met with Infineon’s long-standing expertise in power systems design.
Also at an early stage, a new technology is making its way into speakers: Infineon radar sensors have already made their first appearances as speaker manufacturers look to increase the amount of intelligence within the speaker. For example, knowing whether someone is nearby can help to turn additional functionalities on or off, enabling more intuitive interaction with the speaker, higher energy efficiency, and enhanced privacy. Learn more about presence detection.
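The idea of switching functionality on and off based on presence can be sketched as a small gating helper. This is a minimal, hypothetical illustration: `PresenceGate`, its `hold_seconds` parameter, and the boolean presence input are assumed names, not part of any Infineon radar API. The hold time keeps features active for a while after the last detection so that brief gaps in detection do not toggle the speaker on and off.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PresenceGate:
    """Hypothetical helper: keeps features enabled while someone is nearby."""

    hold_seconds: float = 30.0          # assumed grace period after last detection
    _last_seen: Optional[float] = None  # timestamp of the most recent detection

    def update(self, present: bool, now: float) -> bool:
        """Feed one presence reading; return True if features should be on."""
        if present:
            self._last_seen = now
        if self._last_seen is None:
            # Nobody has been detected yet, so stay in the low-power state.
            return False
        # Stay enabled until the hold time since the last detection expires.
        return (now - self._last_seen) <= self.hold_seconds


if __name__ == "__main__":
    gate = PresenceGate(hold_seconds=10.0)
    readings = [(0.0, False), (1.0, True), (5.0, False), (12.0, False)]
    for timestamp, present in readings:
        print(timestamp, gate.update(present, timestamp))
```

In a real design, the returned flag might control display brightness, microphone power, or wake-word processing, depending on the privacy and power goals of the product.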
Smart speakers are often used to listen to music, but the audio quality of today’s speakers is often unsatisfactory. Infineon MERUS™ amplifier solutions enable outstanding sound quality. Consumers are also looking for more flexibility in moving their speakers, a requirement that MERUS™ amplifier solutions meet by enabling up to twice as much battery playback time. On top of that, designers have the option of heatsink-free, filterless, high-performance designs, which simplifies their work.
Next Generation of VUI Applications with XENSIV™ MEMS microphones
Infineon's MERUS™ audio solutions | new benchmark for class D amplifiers
How radar can make your home intuitively smart
Premium MEMS microphones and cutting-edge audio processing are the key elements for making voice-controlled devices truly ready for everyday situations. Features such as turning off a TV from a different room or whispering to Alexa to dim the lights will be key differentiators of next-generation voice-user interfaces. That is why Infineon and its voice-user interface ecosystem partners are leveraging their technological expertise to provide innovative reference platforms and ready-to-use next-generation voice-user interface solutions.
Customers looking for a reference design containing Infineon’s XENSIV™ MEMS microphones can contact one of our partners listed below. This section provides an overview of and introduction to our partners and their offerings, as well as a relevant distributor or contact person for purchasing support. Kindly refer to the links used in the texts, company logo, and partner signet to navigate directly to the respective website for further information.
Company | Solution | Description | Details | Purchase |
VocalFusion Dev Kit for Amazon AVS
The VocalFusion Dev Kit for Amazon AVS (XK-VF3510-L71-AVS) enables developers of smart home products to evaluate and prototype far-field voice interfaces using the XVF3510 voice processor with the Amazon Alexa Voice Service. The VocalFusion XVF3510 enables far-field voice capture with close-range precision. It’s a turnkey solution for developers who want to embed far-field voice control into smart TVs and set-top boxes. XMOS algorithms are purpose-designed to deliver accurate voice capture from across the room, even in noisy environments, and when content is streaming through the device.
VocalFusion stereo-AEC voice processor for Amazon Alexa Voice Service
XK-VF3500-L33-AVS Development Kit
This Amazon-qualified development kit uses the XMOS stereo-AEC voice processor, which delivers up-close voice capture quality and processing accuracy at far-field range. The rich optimization parameters of the XVF3500 voice processor enable you to adjust noise attenuation, gain control, and residual echo to deliver the best voice capture performance for your product.
VocalFusion mono-AEC voice processor for Amazon Alexa Voice Service
XK-VF3000-L33-AVS Development Kit
This Amazon-qualified development kit uses the XMOS mono-AEC voice processor, which delivers up-close voice capture quality and processing accuracy at far-field range. The rich optimization parameters of the XVF3000 voice processor enable you to adjust noise attenuation, gain control, and residual echo to deliver the best voice capture performance for your product.
The Aaware Embedded Voice Platform, using the signal from our high-performance digital XENSIV™ MEMS microphones, provides a complete development environment for VUI applications. It captures voice within loud interfering noise and interfaces to popular wake-word and automatic speech recognition (ASR) technologies on the edge from key partners such as Picovoice and Sensory. The combination of Aaware voice capture and these partner technologies enables voice-controlled digital products with a single-chip solution that is private, secure, reliable, and robust because it does not require a cloud infrastructure. The demo of the Aaware Embedded Voice Platform is powered by the Xilinx Zynq 7010 All Programmable SoC, enabling DSP and AI acceleration, all within the Avnet MiniZed processing board.
Purchase: Zedboard
This integrated audio processing and sound sensing reference platform enables application development and prototyping using real-time, full-speed CEVA DSP silicon combined with the Infineon MEMS microphone IM69D130. The reference platform lets customers enter the market quickly with the highest-quality front-end sound processing and the lowest power consumption.
The Amazon-qualified Far-Field Voice Solution provides a fast path to prototyping and product development of Alexa-enabled devices.
This is a VUI turnkey solution (module, algorithm, and software) that includes AI noise reduction, echo cancellation, and voice activation algorithms.
The ENS-2 module is a noise reduction module specially developed for communication devices, which enables users to have a noise-free conversation. With its robust and powerful deep learning-based algorithms, ENS-2 can effectively separate speech from background environmental noise and deliver unparalleled speech quality through an incredibly small form factor solution.
The Edge Voice Solution is an offline voice wake-up module that uses Infineon’s IM69D130 MEMS mic. Aside from being highly flexible and low power (<1 mA), this solution offers voice identification in multi-speaker scenarios. Thanks to its sophisticated AI capabilities, customers can also tailor the device to respond only to certain people.
Key features:
2. Biometrics: enables IoT devices with voice print
Applications: Hearable and wearable devices, such as TWS earphones and Type-C and USB earphones
The Sugr Sense VUI Downlight Solution is an Amazon Alexa Voice Service (AVS)-certified voice-user interface turnkey solution (module, algorithm, and software). It supports Alexa Call & Message (ACM) and Multi-room Music (MRM), and can pick up far-field voice at up to 10 meters in a quiet home environment.
Additional features include:
1. Supports multiple languages, such as Mandarin, English, etc.
2. Ready to integrate in multiple processing (e.g., ARM) & ASR platforms (e.g., Tencent)
3. Voice-user interface front-end hardware design (software/algorithm)
4. SoC control platform & system design
The NDP9101 development platform contains the NDP101 along with the components required to prototype applications such as keyword spotting, wake word processing, and speaker identification, consuming only 140 μW. The Syntiant NDP100 ultra-low-power Neural Decision Processor can support local voice commands together with Infineon XENSIV™ MEMS microphones and also comes with the Amazon Alexa Voice Service keyword model, which enables a close-talk built-in Alexa experience.
The NDP9101 consists of two boards connected together:
1. The NDP9101 board contains the NDP101 device along with power circuitry, audio microphones, clock generation, flash memory, and an Arm debug port.
2. The Raspberry Pi 3 Model B+, a single-board computer, runs Raspbian Linux and the Syntiant NDP10x device control SDK software and allows C and Python interaction with the NDP101 chips.
Unisound’s solution is a standardized modular communication solution for smart home appliances. It also provides an all-in-one turnkey solution from voice-user interface (VUI) front-end signal processing to cloud service. Its exclusive Swift AI chip and Infineon's IM69D130 MEMS microphone (linear + ring integrated) can be customized according to customer needs and serve both online and offline applications.
Major benefits include:
1. No need for mobile phones as the user is the control center
2. Natural voice interaction
3. Far-field voice activation and recognition
4. Low power consumption