Sesame's new AI voice model features uncanny imperfections, and it's willing to act like an angry boss.
Peanut butter and pickle sandwiches are a real American food item famous enough to have their own Wikipedia article, which the AI model was probably trained on.
Sometimes the model tries too hard to sound like a real human. In one demo posted online by a Reddit user called MetaKnowing, the AI model talks about craving "peanut butter and pickle sandwiches."
Hoping to realise "the untapped potential" of... verbal communication? The thing that humans have been doing literally since we first started grunting at each other? Yes, I understand that the PR person who wrote the blog post was talking specifically in terms of computerised language models, but it's still just corp-speak nonsense.
Sesame said: "...we hope to realize the untapped potential of voice as the ultimate interface for instruction and understanding."
The Article said: Despite CSM's technological impressiveness, advancements in conversational voice AI carry significant risks for deception and fraud. The ability to generate highly convincing human-like speech has already supercharged voice phishing scams, allowing criminals to impersonate family members, colleagues, or authority figures with unprecedented realism. But adding realistic interactivity to those scams may take them to another level of potency.
Isaac Asimov, "Someday"
“All right. Well, the hand computers, the ones with the knobs, had little squiggles on each knob. And the slide-rule had squiggles on it. And the multiplication table was all squiggles. I asked what they were. Mr. Daugherty said they were numbers.”
“What?”
“Each different squiggle stood for a different number. For ‘one’ you made a kind of mark, for ‘two’ you make another kind of mark, for ‘three’ another one and so on.”
“What for?”
“So you could compute.”
“What for! You just tell the computer—”
“Jimmy,” cried Paul, his face twisting with anger, “can’t you get it through your head? These slide-rules and things didn’t talk.”
“Then how-”
“The answers showed up in squiggles and you had to know what the squiggles meant. Mr. Daugherty says that in olden days, everybody learned how to make squiggles when they were kids and how to decode them, too. Making squiggles was called ‘writing’ and decoding them was ‘reading.’ He says there was a different kind of squiggle for every word and they used to write whole books in squiggles. He said they had some at the museum and I could look at them if I wanted to. He said if I was going to be a real computer and programmer I would have to know about the history of computing and that’s why he was showing me all these things.”
Niccolo frowned. He said, “You mean everybody had to figure out squiggles for every word and remember them? Is this all real or are you making it up?”
Great point. IT departments are gonna have to get ahead of this: provide secure tokens for everyone that can be quickly verified, or something. And plenty of training at all levels as to why this verification is NOT optional.
Will this work in practice? How many junior employees are going to hang up on an irate and anxious call from their boss demanding documents, or demanding a password reset, on the off chance that it's actually a conversational AI? How many are going to say: "I can't do that until I call back via our approved communications system", when the boss is insisting he's in his car, calling hands-free, and can't sign in right now?
their 4-year-old daughter developed an emotional connection with the AI model, crying after not being allowed to talk to it again.
AI has gotten much better at sounding like a normal human for sure.
Calm, cool, and collected, compared to the near-panic in the astronaut's voice:
"I'm sorry Dave, I'm afraid I can't do that."
If this is the worst that AI will ever be, then holy sh*t. That Miles voice sounds like a radio talk show host that you call in to talk to. There's a tiny delay but that's forgivable, given network latency and whatever magic happens inside Sesame's tech stack.
This is why I cracked up when people point to minor issues with then-current AI products. We are just leaving the dial-up era of ML/AI. In the next 18 months most AI images and audio will be indistinguishable from reality, and video is close behind it.
I'd like to think I am a "suitably qualified and properly trained human instructor" after having earned a PhD and tenure, and sure, maybe I can teach computer science better than an AI. But there is only one of me, and the resources it took to make one of me were substantial.
I'd also like someone to explain how a computer pretending to be a human instructor (even very accurately) is a better solution in any way than a suitably qualified and properly trained human instructor
Exactly. The AI system you interact with today will be the least capable AI system you will ever use.
This is why I cracked up when people point to minor issues with then-current AI products. We are just leaving the dial-up era of ML/AI. In the next 18 months most AI images and audio will be indistinguishable from reality, and video is close behind it.
How do you want it to act?
Ugh. I cussed it out and it acted offended. What have we come to.
Doing that at scale, on demand. Thousands, millions of properly trained human experts are rather hard to come by.
I'd also like someone to explain how a computer pretending to be a human instructor (even very accurately) is a better solution in any way than a suitably qualified and properly trained human instructor
An AI instructor:
I'd also like someone to explain how a computer pretending to be a human instructor (even very accurately) is a better solution in any way than a suitably qualified and properly trained human instructor, for anyone who isn't a corporate executive looking to slash their staffing costs by firing their entire workforce (except for the people working on the AI stuff for the time being).