Automated Lip Reading May Threaten Data Privacy (But Not for a While)

Think about the intricacies of lip reading, and you'll understand.


Believe it or not, "professional" lip reading has been around for quite a while: as early as the 16th century, in fact. Meet a lovely little Spanish Benedictine monk by the name of Pietro Ponce, and you might have just met the pioneer of professional lip reading, as he was the first officially successful lip reading teacher. Of course, the whole point of lip reading was to give the hearing impaired a way to understand what was being said to them, since sign language only works as a 2-way street where both people know it. But who knew that lip reading could be automated via machine learning?

That's What Is Being Discussed Today in Technology -- Automated Lip Reading

Facial recognition makes it possible, of course, as you'll need an algorithm to accurately record mouth movements and then connect those movements to the correct pronunciations of words with close-to-precise measurement. The idea that a computer can practically record your lips is a novel one, but it raises questions about data privacy and hacking potential!

Here's the thing, though: we're still a very long way from this being an accurate technology. For good reason.

Just Watch This Admittedly BAD Lip Reading of the Popular "Shallow" By Bradley Cooper and Lady Gaga

Phenomenal, isn't it? Funny, yes.... But eye-opening. Those are, after all, not. The. Lyrics. To. The. SONG! That's a given. But when you watch the lips and you hear the words, it passes as pretty convincing.

Let's say you don't know the words to the song. What then? Try this out: mute the video, and try to follow along.

If you hadn't listened to the bad lip reading, there's no telling what you'd hear.

Long story short, the art of lip reading has tremendous merit, but by itself it's far from perfect. The question is, can we expect any machine to do any better? It's doubtful.

Here Is the Fundamental Inherently Technological Aspect of Lip Reading in a Nutshell

The science behind it involves recognizing a sequence of shapes and matching them to a word or a group of words. Easy enough, but it becomes a challenge when the mouth itself forms only between 10 and 14 different shapes, and there's an actual term for them: visemes. Add to that the fact that just about every human being has a differently shaped mouth (try lip reading Mick Jagger or Steven Tyler for a change), and if you ever do meet a professional lip reader, get that person's autograph.

It's hard to do. You'd have to be a certified genius.

The science also incorporates the fact that speech encompasses approximately 50 specific sounds; they're called phonemes. Because of that number, you can expect one single viseme to potentially represent several unique phonemes.
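To see why one viseme standing in for several phonemes is such a headache, here's a minimal Python sketch. Everything in it is an assumption made up for illustration: the viseme class names, the grouping of sounds, and the use of single letters as crude stand-ins for phonemes. Real systems use their own groupings.

```python
# Hypothetical, heavily simplified viseme classes (illustration only).
# Single letters stand in for phonemes here; that is an assumption.
PHONEME_TO_VISEME = {
    "p": "lips_closed", "b": "lips_closed", "m": "lips_closed",
    "f": "lip_to_teeth", "v": "lip_to_teeth",
    "t": "tongue_ridge", "d": "tongue_ridge", "n": "tongue_ridge",
    "a": "open_vowel",
}

# A tiny made-up vocabulary, spelled out as stand-in phonemes.
WORDS = {
    "pat": ["p", "a", "t"], "bat": ["b", "a", "t"], "mat": ["m", "a", "t"],
    "pad": ["p", "a", "d"], "bad": ["b", "a", "d"], "mad": ["m", "a", "d"],
    "fan": ["f", "a", "n"], "van": ["v", "a", "n"],
}


def viseme_sequence(phonemes):
    """Translate a list of phonemes into the lip shapes a viewer would see."""
    return tuple(PHONEME_TO_VISEME[p] for p in phonemes)


def group_by_viseme(words):
    """Group words that look identical on the lips."""
    groups = {}
    for word, phonemes in words.items():
        groups.setdefault(viseme_sequence(phonemes), []).append(word)
    return groups


if __name__ == "__main__":
    for shapes, lookalikes in group_by_viseme(WORDS).items():
        print(shapes, "->", sorted(lookalikes))
```

In this toy vocabulary, "pat," "bat," "mat," "pad," "bad" and "mad" all collapse into the same sequence of lip shapes, so the lips alone can't tell them apart.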

This is why it's so easy to hear a variety of words associated with the movement of lips and be completely and totally sure that it's what you're hearing. Ever listen to a song and hear a different set of lyrics even knowing that they're wrong? It happens to everyone. The same science applies here.

Imagine What It Would Take for a Machine to Automate Lip Reading....


Enter: facial recognition. That's the crux behind the technology. With this innovation, it's possible for an algorithm to take into account a large number of permutations based on the movement of the lips, extrapolate, and then come up with the best possible solution: the one that actually makes sense. That's the key.
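Here's a rough Python sketch of that "try the permutations, keep the one that makes sense" idea. It's only a toy under stated assumptions: the candidate words for each lip movement and the word-pair scores are invented for this example, and a real system would lean on a trained language model rather than a hand-written table.

```python
from itertools import product

# Hypothetical candidates: for each lip movement, the words it could be.
CANDIDATES = [
    ["I"],
    ["bet", "met", "pet"],
    ["my", "by", "pie"],
    ["mom", "bob", "pop"],
]

# Made-up scores for how naturally one word follows another.
# Any pair not listed here scores 0.
PAIR_SCORE = {
    ("I", "met"): 3, ("met", "my"): 3, ("my", "mom"): 4,
    ("I", "bet"): 2, ("bet", "my"): 1,
    ("I", "pet"): 1, ("pet", "my"): 1,
}


def score(sentence):
    """Add up the scores of every pair of neighboring words."""
    return sum(PAIR_SCORE.get(pair, 0) for pair in zip(sentence, sentence[1:]))


def best_sentence(candidates):
    """Try every combination of candidate words and keep the top scorer."""
    return max(product(*candidates), key=score)


if __name__ == "__main__":
    print(" ".join(best_sentence(CANDIDATES)))
```

Brute-forcing every combination is fine for four words, but the count multiplies with each word added, which is one reason real systems need smarter search.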

A machine does this by measuring the height and width of the lips and detailing features of the ellipse they form, the teeth, redness (the tongue that shows when a person's speaking) and contouring. Through learning, a machine can determine what's being said with pretty decent accuracy, but it's still far from perfect.

Case in point: mustaches, beards, and even gender can affect accuracy. Findings suggest that it's easier for a machine to read the lips of a woman than the lips of a man. And truly, not everyone is "readable." Some people are hardly expressive with their lips, making it very difficult for even a computer to determine the correct words.
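The height and width measurements mentioned above can be sketched in a few lines of Python. This is a bare-bones illustration, assuming the lip outline already arrives as a list of (x, y) points from some face-tracking step that isn't shown here. The function name and the sample points are hypothetical.

```python
def lip_features(points):
    """Measure the box around a lip outline given as (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    # Height relative to width: small when the mouth is shut, larger when open.
    openness = height / width if width else 0.0
    return {"width": width, "height": height, "openness": openness}


if __name__ == "__main__":
    # Made-up outlines: mouth corners plus the top and bottom of the lips.
    closed_mouth = [(0, 5), (10, 4), (20, 5), (10, 6)]
    open_mouth = [(2, 5), (10, 0), (18, 5), (10, 10)]
    print(lip_features(closed_mouth))
    print(lip_features(open_mouth))
```

Numbers like these, taken frame after frame, are the raw material a learning system would work from. A beard or mustache that hides part of the outline throws the measurements off, which fits the accuracy problems described above.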

One scientist by the name of Hassanat actually developed an automated system of visual speech recognition, achieving a 76% success rate. Not too bad. Of course, the experimentation was done under very controlled conditions, not accounting for unforeseen variables, but we're not asking for much.

Can it ever possibly be so good that it could be close to perfect? Possibly.

What a Computer Will Need to Do to Accurately Recognize Speech Visually

The best lip readers succeed by drawing on other factors, such as context, body movement and a rich understanding of the language that goes beyond correct grammar. When a computer accomplishes that, which in this day and age of technological innovation is a possibility, we might be seeing a lip-reading robot very soon.

Is that necessarily a great thing to behold, though? Will we be so readable from afar that nothing will ever be private in our lives? A system would only have to zoom in from a distance and record our lips, have an algorithm determine what was just said, and guess what: passwords are never private. Scary thought. We are, however, pretty sure of this: by the time hackers and ID thieves figure out how to turn a technology that's meant to benefit us against us, we'll most likely have measures in place to prevent it.

That being said.... Time for some seagulls:

