  • Robert Farago

META's Voice-to-Text-to-Voice AI Is Here!

But you can't have it!

You get a call from your grandson. He’s been arrested! He can’t talk now, but he needs cash for bail, stat. He tells you where to send the money. You do. Only it’s not your grandson.

The perpetrators of this scam can scrape audio from your grandbaby’s online video, feed it to an AI voice-to-text-to-voice app and convince you it’s your beloved. True story!

Just ask the Federal Trade Commission. They’ve issued advice on how to know if it’s real or it’s Memorex.

Don’t trust the voice. Call the person who supposedly contacted you and verify the story. Use a phone number you know is theirs. If you can’t reach your loved one, try to get in touch with them through another family member or their friends.

Use a phone number you know is theirs? Like a scammer can’t explain an unknown caller ID, or doesn’t know how to put a mark into panic mode. How about pre-establishing a secret code word with your loved ones? That would work!

If the FTC went down that route, it would be a tacit admission that you can no longer trust anyone on any electronic device ever. Which you can’t, of course. That’s not the kind of message the feds want to send, for reasons both commercial and political (a distinction without a difference).

META – the $720b company that changed its name to proclaim its intention to dominate a technology no one wants – has officially announced it’s got a toehold in the white hot field of AI. Got a minute (literally)? Introducing Voicebox!

Facebook supremo Mark Zuckerberg narrates the video above. Or does he? Either way, you gotta love the fact that the inventor of Harvard University’s Rate-A-Babe (i.e., Facebook) inserts the words “may be” before the headline claiming “the most versatile AI for speech generative model ever.”

Also weird: the Voicebox announcement emphasizes that it “could give natural-sounding voices to virtual assistants and non-player-characters in the metaverse.” The voices on the video are about as compelling as, uh, the metaverse.

META also touts the fact that its new software can edit out unwanted “background” audio. Conveniently enough, the canine interrupting the demo above barks between words. What audio editing software can’t fix that?

Facebook has a tendency to compete against innovation by crushing its competitors, either through buyouts or in-house imitation.

I suspect this strategy ain’t working no mo’; radical AI apps are hitting the market on an hourly basis, forcing META to keep shareholders happy by revealing AI software stuck in development hell. Hang on. Is it?

The video above – posted three weeks ago – is geek to me. (The research paper as well.) If I’m not mistaken, META’s text-to-speech program is out there, somewhere.

Anyway, META’s lawyers must be mega-aware that a consumer-friendly Voicebox is Pandora’s Box. Hence this disclaimer:

There are many exciting use cases for generative speech models, but because of the potential risks of misuse, we are not making the Voicebox model or code publicly available at this time. While we believe it is important to be open with the AI community and to share our research to advance the state of the art in AI, it’s also necessary to strike the right balance between openness with responsibility.

If that isn’t enough to allay concerns – and it isn’t – the presser insists that Voicebox will be a godsend to the visually impaired.

In the future, multipurpose generative AI models… could allow visually impaired people to hear written messages from friends read by AI in their voices…

The voices in the recipient’s head? I kid.

A little voice inside my head tells me you can’t stop the signal. Which isn’t META’s goal, obviously. Other than filling the metaverse void with something, other than steadying investors’ nerves, META simply wants to beat Microsoft’s VALL-E.

Voicebox isn’t the first. But it’s the best!

Is META’s entry into the text-to-voice AI field an error, especially when they could wait until the coast is clear and simply buy someone else’s work?

Time will tell. But believe me when I tell you it’s another example of META neglecting its core business to chase The Next Big Thing. Assuming it’s me telling you.
