Automatic Speech Recognition (ASR) in payment by telephone

July 20, 2022

In contrast to flying cars, which (for the moment) are not part of our daily lives, another dream of a whole generation has come true: talking to machines. And having them understand us, of course. The latter sounds like a truism, but it is not as obvious as it seems…

Something that seemed unthinkable just two decades ago is now possible thanks to automatic speech recognition (ASR) technology. It has a strong and growing presence across a wide range of industries and businesses; in credit card payments by telephone, it has become an essential feature.

What is automatic speech recognition?

Simply put, automatic speech recognition (hereafter ASR) is a technology that enables spoken communication between a person and a computer. It processes a voice, recognizes the words and their meaning, and acts accordingly, for example by executing commands to carry out a predefined process.

This simple explanation hides complex challenges that have required considerable technological development. Getting a machine to really “understand” what a person says involves acoustic, linguistic and computational problems, and (again, for the moment) its capacity, impressive as it may seem, is limited.

How does ASR work?

Although there are various models or generations of ASR, current speech recognition systems rely on artificial intelligence, specifically machine learning trained on one or more corpora of acoustic, phonetic and lexical data. Algorithms process these corpora to build data models that can be further refined.

We can point out four basic phases in the operation of a speech recognition system.

Audio capture: the system detects voice activity, distinguishes it from silence and noise, and converts it into ones and zeros.
Analysis and parameterization: the system differentiates units of meaning (phonetic, lexical, semantic).
Identification: in a reverse process, the system identifies these units and produces a result (executes a command).
Response: the system is able to give a response based on its programming and machine learning.
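As an illustration of the first phase only, here is a minimal sketch of how captured audio might be digitized and split into "voice" and "silence" frames using a simple energy threshold. It is not how any particular ASR product works: the function names, the 8 kHz sampling rate, the 20 ms frame size and the threshold value are all assumptions chosen for the example, and real systems use far more robust detectors.

```python
import math

def to_pcm16(samples):
    """Quantize floats in [-1.0, 1.0] to signed 16-bit integers (the 'ones and zeros')."""
    return [max(-32768, min(32767, int(round(s * 32767)))) for s in samples]

def frame_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(x * x for x in frame) / len(frame)

def detect_voice(samples, frame_size=160, threshold=1e6):
    """Return one boolean per full frame: True where energy exceeds the threshold.

    frame_size=160 is 20 ms at 8 kHz; threshold is an arbitrary value
    picked for this demo, not a recommended setting.
    """
    flags = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        flags.append(frame_energy(frame) > threshold)
    return flags

# Demo signal: 20 ms of silence followed by 20 ms of a 440 Hz tone
# standing in for speech (assumed 8 kHz sampling rate).
RATE = 8000
silence = [0.0] * 160
tone = [0.5 * math.sin(2 * math.pi * 440 * n / RATE) for n in range(160)]
pcm = to_pcm16(silence + tone)

print(detect_voice(pcm))  # [False, True]
```

Only the frames flagged True would be passed on to the analysis and identification phases.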

What are its applications?

Voice assistants. Perhaps the most refined use of speech recognition, supported by the latest advances in natural language processing (NLP). Siri and company are the quintessential success story of this technology.
Command control. This ranges from controlling a computer (or a cell phone, or a wearable) simply by voice to automatically transcribing audio to text: two forms of “communication” with different levels of difficulty.
Accessible systems for people with disabilities. The inability to perform certain tasks or manual gestures can be made up for by spoken commands, for example with connected objects in the IoT.
Telephone communication. Intelligent PBXs, of course, execute orders, provide information, automate processes, optimize customer service, improve companies’ operational efficiency, and so on.

What are the advantages of secure card payment over the phone?

The first automated telephone answering services used the telephone’s keypad (DTMF, or tone dialing), a very reliable method that is still in use today. However, today’s users demand more agile, accessible and natural ways to perform their daily activities without human intervention.
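To see why keypad entry is so reliable, it helps to recall that each DTMF key sends two fixed tones at once, one for its keypad row and one for its column, so the receiver only has to decide which of a handful of known frequencies are present. The sketch below synthesizes a key press and decodes it with the Goertzel algorithm. The row and column frequencies are the standard DTMF values; the function names, the 8 kHz sampling rate and the 40 ms tone length are assumptions made for the example, and a production decoder would add noise and timing checks.

```python
import math

RATE = 8000  # assumed telephone-quality sampling rate in Hz
ROWS = [697, 770, 852, 941]        # standard DTMF row frequencies (Hz)
COLS = [1209, 1336, 1477, 1633]    # standard DTMF column frequencies (Hz)
KEYS = ["123A", "456B", "789C", "*0#D"]

def dtmf_tone(key, duration=0.04):
    """Synthesize the two-tone signal for a single key press."""
    if len(key) != 1:
        raise ValueError("key must be a single character")
    for r, row in enumerate(KEYS):
        c = row.find(key)
        if c >= 0:
            f_row, f_col = ROWS[r], COLS[c]
            break
    else:
        raise ValueError(f"not a DTMF key: {key!r}")
    n = int(RATE * duration)
    return [0.5 * math.sin(2 * math.pi * f_row * t / RATE)
            + 0.5 * math.sin(2 * math.pi * f_col * t / RATE)
            for t in range(n)]

def goertzel_power(samples, freq):
    """Signal power at one target frequency (Goertzel algorithm)."""
    coeff = 2 * math.cos(2 * math.pi * freq / RATE)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2

def decode_key(samples):
    """Pick the strongest row and column frequency and map them to a key."""
    r = max(range(4), key=lambda i: goertzel_power(samples, ROWS[i]))
    c = max(range(4), key=lambda i: goertzel_power(samples, COLS[i]))
    return KEYS[r][c]

print(decode_key(dtmf_tone("5")))  # 5
```

Recognizing a spoken digit is a much harder problem than this, which is part of why keypad entry remains in use alongside voice.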

Among these everyday activities, issuing payment orders by voice, when we buy a product or pay a municipal tax, is perhaps the one where human intervention is least desirable, because of the security of our financial data. The use of automated systems is therefore highly recommended throughout the payment card industry.