Here Is How Amazon, Apple, and Google Handle Your Voice Recordings

Over the past few years, voice assistants have become very popular. According to Canalys, the market grew 187% in Q2 2018. But when Amazon introduced the Echo in 2014, it raised a lot of concerns and questions about privacy. Just think: this is a device that not only helps you search the Internet or handle small tasks, but also listens attentively to your home. In spite of that, it is now normal for an average American to have Alexa in several rooms of their home, so the privacy debate seemed to have subsided. That held until journalistic investigations by Bloomberg and VRT NWS revealed that thousands of Amazon workers manually listen to and transcribe the conversations people have with Alexa. What's more, Amazon is not alone: Google does the same.

Who Is Listening to Your Home Conversations?

Amazon Echo device

A voice assistant is a very useful feature. Suppose you need to look something up quickly while driving. Everybody knows that it is not a good idea to pick up a phone and type a query into a search engine. Fortunately, you can just say: “Hey Google, find me some data on …”. Sounds convenient and safe, doesn't it?

The problem is that not everyone is aware that what you say to Google Assistant or Alexa is recorded and saved on Google or Amazon servers. What happens next? You may think it is normal for the data to be passed to programs that analyze the voice recordings. But the fact is that people also listen to your home conversations manually. Amazon hires thousands of people across the world to listen to these recordings. The recordings are transcribed and annotated, and only after that are they fed back into the software for further analysis.

What is the purpose of listening to somebody's voice recordings? Current speech analysis software still has a lot of gaps in understanding human speech. It cannot yet accurately interpret some words and expressions. Manual review of your home conversations therefore helps the software analyze speech more accurately. Based on the results, Alexa's developers improve their product and help it respond to commands better.

These Amazon employees work around the world, from Boston to Costa Rica and from Spain to India. They are required to sign a nondisclosure agreement, and above all they are prohibited from talking about the program publicly. Their working day lasts nine hours. As Bloomberg wrote, each reviewer parses around 1,000 audio recordings per day. There is also a private chat where employees can ask for help with a muddled word or share an amusing expression.

Is Amazon the Only Company That Listens to Your Recordings?

No, Google uses a similar scheme to eavesdrop. Unlike Amazon, it denied this for a long time. It even filmed a YouTube video in which its employees answered the question: “Does Google review the home conversations?” All of them clearly stated that the information is stored only on Google servers and analyzed by software.

But the recent journalistic investigations showed that both Google and Amazon hire people to help their software analyze human speech. Google says that these steps are essential to support the development of Google Assistant and to improve speech technologies. It confirmed that it contracts language experts to improve its understanding of different accents, dialects, and languages such as Hindi. According to its blog post published in July, these experts analyze only a small amount of data: around 0.2% of all stored voice recordings. Also, according to its privacy policy, audio snippets are not associated with user accounts, so, as Google stated, it is impossible to de-anonymize the speaker. In practice, the picture is different. Although the user name is replaced by a serial number, it doesn't take much time to work out who a given snippet belongs to. You just have to listen carefully to what the speaker says. People often mention addresses, names, and other personal information that reveals who they are.

Another big corporation with its own voice technology is Apple. It also confirmed that it subcontracts people around the world to help the software understand human speech and, as a result, to train Siri. According to Apple's security policy, the company stores voice recordings for six months. After that, it keeps an anonymized copy of the data for up to two years to train Siri. Apple says that it takes data anonymization very seriously, and that all voice recordings sent for grading are randomized and not tied to the user's identity. By comparison, Amazon's reviewers can see the user's first name, account number, and device serial number.

Alexa eavesdropping picture

What information is tapped? We all assume that the companies record conversations only after you say a wake command such as “Ok, Google” or “Hey Siri”. However, not so long ago, VRT NWS discovered that Google records not only the phrases that follow the “Ok, Google” command but also private conversations. This information was shared by one of the contractors hired by Google to review voice snippets. He even showed the software the reviewers use to analyze recordings. The journalists from VRT NWS listened to more than a thousand voice snippets and found that 153 of them were private conversations that should never have been recorded and stored by Google, because there was no “Ok, Google” command.

What does it mean? First of all, Google, and probably other companies, record conversations between members of your household even when they are not interacting with the voice assistant. Secondly, judging by that sample (153 of more than a thousand snippets), up to about 15% of the 0.2% of recordings that are reviewed manually are private conversations. At first, the 0.2% of voice data that Google gives to experts for manual analysis may seem like a very small number. But consider that around 1 billion devices can query Google Assistant.
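To put these percentages into absolute numbers, here is a rough back-of-the-envelope sketch. It uses only the figures quoted in this article; the total number of recordings is not public, so the volume passed to the function (one recording per device) is purely a hypothetical assumption, and the function name is made up for illustration.

```python
# Illustrative arithmetic only; inputs are the figures quoted in this article.
REVIEWED_SHARE = 0.002       # Google: about 0.2% of recordings go to human reviewers
SAMPLE_SIZE = 1000           # VRT NWS heard "more than a thousand" snippets (lower bound)
ACCIDENTAL_IN_SAMPLE = 153   # snippets in that sample recorded without a wake command


def estimate_reviewed(total_recordings: int) -> tuple[int, int]:
    """Return (recordings reviewed by humans, how many of those are accidental)."""
    reviewed = round(total_recordings * REVIEWED_SHARE)
    accidental = round(reviewed * ACCIDENTAL_IN_SAMPLE / SAMPLE_SIZE)
    return reviewed, accidental


# Hypothetical volume: a single recording from each of 1 billion devices.
reviewed, accidental = estimate_reviewed(1_000_000_000)
print(f"reviewed by humans: {reviewed:,}")
print(f"of which accidental: {accidental:,}")
```

Under that assumption, even a single recording per device would mean about 2,000,000 snippets heard by people, roughly 306,000 of them captured without any wake command. Because the sample was larger than 1,000 snippets, the second figure is an upper estimate.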

Google said that the contractor's behavior violates the company's data security policy. The incident is under investigation, and once the company identifies the leaker, it will take action. Google is also going to review its working conditions and policies for transcribers to prevent such situations in the future.

What Can You Do to Protect Your Personal Information?

You could argue that such actions by the corporations are illegal and violate personal rights. But the truth is that the data collection is described in each company's terms and conditions, which you have to accept before using the app. The one thing that isn't mentioned there is that your voice recordings will be reviewed by humans. If you have a voice assistant at home, here are some tips:

  • Watch what you say. Try not to mention names, surnames, addresses, and other personal information that could help someone work out your identity;
  • Tune your settings and permissions. For example, in your Google account, you can go to the Web & App Activity tab and either turn off the option that allows Google to store and use your voice data, or enable the auto-delete feature, which removes this data after 3 or 18 months;
  • Turn on the “Siri and Dictation” option. When you use Apple devices, a random number is generated automatically to serve as your unique identifier. It means that the data sent to Apple servers is not associated with your Apple ID. But this works only when the feature mentioned above is turned on.

Final Thoughts

Questions about privacy and the use of users' personal information are raised very often. At the same time, machine learning algorithms cannot be improved without huge amounts of data. Does this mean that the disappearance of privacy is a prerequisite for technological progress? Do you have a voice assistant at home? How do you feel about what these companies are doing? Feel free to share your thoughts with us in the comment field below.