Security
Despite the many benefits of speech and voice recognition, the technology carries some risk. Setting conspiracy theories aside, it is at least possible for someone to hack a virtual home assistant such as Google Home or Amazon Alexa and listen in on people's private conversations. This was demonstrated in August 2017 by British security researcher Mark Barnes, who installed malware on an Amazon Echo purchased before 2017 that turned its microphone into a wiretap capable of listening to any conversation. However, the major corporations behind virtual assistant technology encrypt this data, making such attacks unlikely.
An article by USA Today offers suggestions for protecting the data collected through speech recognition, along with instructions for disabling the microphone. It points out that most virtual assistants that use speech recognition record interactions with the user, and that these recordings can be replayed. It further notes: "According to Amazon, there is also a fraction of a second of audio before the wake word that is stored along with each recording. That fraction of a second gets saved along with your main command, and the recording ends after the command has been processed." This means that if private information was exchanged just before a "wake word" or "activation phrase" such as "Siri," "OK Google" or "Alexa" was spoken, that information may be stored on the company's servers. Fortunately, many companies, including Google, offer a way to delete recordings of voice commands, and some assistants, such as Siri, do not catalog these interactions at all. Companies such as Nuance also claim to make privacy a top priority. According to Wired, queries saved to servers are used to further improve these technologies for future updates and releases.
Bias
As Ada Lovelace famously said, "The Analytical Engine has no pretensions whatever to originate anything. It can do whatever we know how to order it to perform." The same holds for speech recognition technology: it is constantly improving, but it is only as good as its training data. If the software is not trained on a range of accents and other speech patterns, it will perform well only for specific demographics, and is therefore biased.
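This kind of demographic bias can be made measurable. A common metric is word error rate (WER), the word-level edit distance between what a speaker said and what the recognizer transcribed, divided by the length of the reference. As a minimal sketch (the `word_error_rate` helper and the sample transcripts below are invented for illustration, not real evaluation data), one could compute WER separately for each accent group; a large gap between groups would suggest some accents are under-represented in the training data:

```python
# Hypothetical sketch: measuring recognition bias by comparing word error
# rate (WER) across speaker groups. The transcripts are made up.

def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # Standard dynamic-programming edit distance, computed over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

# Illustrative output for the same utterance spoken with two accents:
samples = {
    "accent_a": ("turn on the kitchen lights", "turn on the kitchen lights"),
    "accent_b": ("turn on the kitchen lights", "turn on the chicken flights"),
}
for group, (ref, hyp) in samples.items():
    print(group, round(word_error_rate(ref, hyp), 2))
# accent_a scores 0.0; accent_b scores 0.4 (2 substitutions / 5 words)
```

In a real evaluation, each group would need many utterances from many speakers; the point of the sketch is only that per-group error rates, not a single aggregate score, are what reveal this bias.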
This is a significant problem, as users understandably become frustrated when speech recognition software does not understand what is said. While some areas, such as video games, have workarounds like VoiceAttack, other areas do not yet have similar solutions. One area where this type of bias plays a detrimental role is in virtual assistants, such as Siri, Alexa and Google Assistant, when interpreting accents from around the world. Video comparisons of the accuracy of the most popular virtual assistants on the market across accents show that, although the technology is getting closer, there is still room for growth; another video demonstrates the problems that arise when an elevator offers speech recognition as its only method of interaction with the user.
Jobs
Some worry that technological advances in artificial intelligence, including speech recognition, are destroying jobs without creating new ones, leading to mass unemployment; after all, artificial intelligence needs little supervision. A documentary shows how AT&T's automated operator system, introduced in 1991, affected telephone operators, presenting a collection of their thoughts and feelings about the transition. However, some optimists claim that there will be an increase in jobs that require social interaction, such as psychiatry, social work, bartending, and hospitality.