Every now and then, something will shake up the very foundation of the audiobook world. Recently, it has been the news from a publisher that they are developing AI voices for audiobooks. Understandably, audiobook narrators were shocked and appalled because their very livelihood is being threatened by a company many have worked with.
But what do audiobook listeners and fans think?
I asked in my Facebook group and got a lot of very interesting feedback. Below, you will find the arguments for and against AI audiobook voices from the point of view of avid audiobook fans. But first…
What are AI Audiobook Voices?
Well, it is basically a (more or less) advanced form of text-to-speech. Unlike having an app read an ebook to you, though, you are supposed to buy these audiobooks individually. Their big advantage for publishers is that they save the cost of an actual human voice actor. And AI audiobooks are faster to produce. There is still editing and post-production happening though. And the AI voices are based on human voice actors. So, the cost isn’t actually that much lower. But we will get back to this topic later.
AI voices should be able to read the emotions in a scene from the context and adjust their narration accordingly. That’s where artificial intelligence comes in. The goal of AI voice developers is to have them be indistinguishable from human narrators and they claim to already be there.
Concerns in regard to AI Voices for Audiobooks
Many common points of criticism we audiobook fans have when it comes to certain human narrators (e.g. because they aren’t experienced yet or don’t have enough training) translate into concerns about AI voices for audiobooks. Is it easy to keep different characters apart during dialogue? Can the listener recognize what a character is thinking versus what they are saying out loud? Are inflection and pronunciation right? Does the narrator convey the emotions in a scene and draw us into the story?
Many people who only consume their books in audio format (be it because of a disability or circumstance) have experience with text-to-speech apps. Sometimes an ebook doesn’t have an audio version and if we really want to read that story, text-to-speech might do in a pinch.
AI voices for audiobooks fall on a scale between experienced voice actors on the one side and text-to-speech on the other. When you think of a Fiction novel with lots of dialogue and many emotions, a narrator uses a range of inflections and different voices to convey what is happening in the story. That makes it easier for listeners to follow: they can tell right away who is talking or thinking something and how the characters are feeling. Text-to-speech can’t do that at all.
And AI voices are not there yet either (I listened to the DeepZen samples and… yeah… this might “astound” people who never listen to audiobooks, but to audio fans who listen to narrators every day these samples just sound robotic). While the AI can recognize emotions from the context and change pitch, it isn’t remotely comparable to what human voice actors do (just for fun, check out these samples by Joel Leslie). AI falls flat when listeners expect to be pulled into a story.
If you have ever listened to an audiobook read by a narrator who just reads the words and doesn’t do voices, you might know how difficult it is to follow a Fiction book or a memoir if there isn’t any acting and if the reader doesn’t bring out the emotions in a story!
However, AI voices might work out decently for Nonfiction and textbooks, which don’t require a lot of emoting. But why would we pay for that instead of just using a text-to-speech app in the first place?
Anger about AI Audiobook Voices
Many audiobook fans are “narrator-motivated”, meaning, they would happily buy an audiobook from an author they don’t know, or in a genre they don’t usually favor, as long as it is narrated by their favorite. “I would listen to him/her read the phonebook!” is a popular line. New authors trying to get into the audiobook business can get a strong start if they can hire a popular voice actor because the talent will bring their own fanbase and avenues of promotion.
Every audiobook listener I know, even the casual ones, has a favorite narrator. And that name will always be a draw for them even if they never go to Twitter, Facebook, Insta, and just browse Audible a bit every other month. There is a reason why Audible gives you suggestions for other titles read by the narrator you just listened to!
And the die-hard audiobook narrator fans are actively angry about AI voices for audiobooks! They fear that the market will be smaller for their favorite human voices. And they feel it is simply shitty to do this at all because the narrator is so important for our audiobook experience.
For new voice actors or lesser-known ones doing niche titles, this is an even bigger problem. They don’t have much of a fanbase and they still have to be more expensive than AI voices. It will be hard for them to compete and their livelihood is even more threatened by this development.
For those of us listening to audiobooks every day, loving our favorite narrators dearly, and basically living and breathing audiobooks, the whole discussion about AI voices for audiobooks can feel a bit like a slap in the face because an audiobook is a very intimate experience. You have a person in your ear, telling you a story, eliciting emotions in you. The idea that this very human, very personal side of it gets ignored and we would have to listen to a robot voice feels foreign and worrying. It devalues our beloved voice actors as well as our own aural experience.
And that means some audiobook fans refuse to ever pay for an AI-generated audiobook on principle, no matter how good the AI voices get.
I can also see a vicious circle in the making here: Indie authors wanting to bring out their first audio, scared of the investment, choosing AI because it is a bit cheaper, then their audiobook won’t sell and they won’t bring out their books in audio format again “because it isn’t worth it”. The audiobook market is very distinct from the ebook market and needs its own marketing. Audiobooks as an afterthought hardly work. And they definitely won’t work if new authors try to save on the voice talent!
Benefits of AI Voices for Audiobooks
Money is important. Not just for authors trying to live off of their work (or at least not losing tons of money on it), but also for listeners – especially those on disability aid, those without library access, and those who can’t for whatever reason fall back on ebooks or second-hand print books. So, if AI-voiced audiobooks were to cut the cost considerably, and this would translate into noticeably lower costs for listeners, then that is a good thing.
If AI voices were so cheap that indie authors who don’t publish in audio format would offer their titles as audiobooks, that would be a great thing as well, especially for readers with disabilities.
However, as I said earlier, it is dangerous to assume that an audiobook should sell well just because the format is popular. Most often with indie books, it is the narrator who will first draw listeners to an author who is new to the audiobook market. If an author decides to go with a synthesized voice, they have to be aware that their target audience will be much smaller than it would be with a popular voice talent. They make their story more accessible though, which is appreciated!
Audible and ACX
Until recently, a lot of the AI audiobook discussion has been theoretical. DeepZen’s samples are bad. And Audible, the biggest audiobook marketplace in the world, doesn’t allow AI-narrated audiobooks.
So, what’s the fuss anyway, right?
Well, recently, it turned out that several AI audiobooks have been up for sale on Audible. The narration came from another company with much more advanced AI voices. Since Audible doesn’t allow non-human narration, the audiobook pages mentioned nowhere that it’s not a real narrator. On top of that, the voice had a name very similar to that of a human voice actor. In fact, if you clicked on the narrator link on the Audible page, you would get all the results from the real person!
So, in reality, publishers used the fact that Audible doesn’t allow AI voices to deliberately mislead customers. The narration was good enough for these audiobooks to pass ACX quality control, and they were then offered at full price on Audible. Instead of passing on the savings from using a synthesized voice, these publishers tried to earn more, at the expense of the customers who would get a sub-par narration, the authors who would see fewer sales of their audiobooks (and who would likely only be paid at all once a threshold is reached), and the voice actors who lost out on these jobs.
Several narrators and I kept contacting Audible and ACX until we got satisfying answers. Audible removed these books and will hopefully take steps to prepare their quality control team for the advanced AI voices they will have to weed out from now on. The quality control team listens to a sample of every audiobook before it goes up for sale.
That means, in general, you can feel pretty safe that Audible won’t sell you an AI-narrated audiobook. But mistakes can happen. AI narration is now so advanced that there will be breathing sounds and different intonations. If you casually listen to a sample, you likely won’t be able to tell that it’s AI. But if you listen to an audiobook whose narration you find oddly bad, it’s not impossible that it’s AI.
The Human Experience
AI voices for audiobooks are comparable to CGI actors in movies. It kind of works, it sometimes works surprisingly well to patch something up. And yet, overall, you want the human experience in art. AI can’t emulate that.
Personally, I would like to see an improved text-to-speech app come out of these AI voice developments. Instead of trying to make more money with audiobooks by cutting out the real voice actors who made audiobooks so popular in the first place, give screenreader users better options! Many books, especially older niche titles, don’t have an audio version. And if we could have that read to us in a reasonably acceptable way, that would be a really cool thing and wouldn’t take anything away from human narrators either.
But replacing human voice actors with AI as a way for publishers to earn more on audiobooks is a bad and shortsighted business strategy. Audiobook listeners care very much about the quality of narration of a book and will not buy (or keep) an audiobook that is not read with that magical spark that only a good voice actor has!
What about you? Would you buy audiobooks with AI voices? Please share your opinion in a comment below!
By the way, if you can’t afford as many audiobooks as you would like, and want to support audiobook narrators and authors, check out these fantastic places to get free audiobook review copies.
Eline Blackman (pronouns: she/they) fell in love with books as a child – with being read to and reading herself. 10 years ago, she bought her first Audible book. It was love at first listen! An average of 250 audiobooks per year has become the new normal and you will rarely see Eline without a wireless earbud. Romance and Fantasy are the go-to genres for this audiobook fan.
Not only might AI voices make more books accessible, they might also be used to make books accessible for those who are hard of hearing. Beautiful reading voices with a lot of dynamics unfortunately are often hard or impossible to understand. Also, we (the hard of hearing) would profit if we could choose a slower and clearer speaker (that works wonders) and if we could choose between different female and male voices. For those who get a cochlear implant, audiobooks are wonderful training material, but finding understandable voices, in particular when starting rehab, is a problem.
Thank you for sharing your opinion and experience! It’s good to know that a better text-to-speech app would also make audiobooks more accessible for people who are hard of hearing.
I can speak somewhat to the aged and disabled perspective. Born with a physical disability, holding an actual book has always been problematic, but it has become increasingly prohibitive in the last few years. Reading ebooks on my desktop was the next best go-to option, but with age and slowly deteriorating eyesight, this becomes a less appealing alternative.
I only started utilizing audiobooks in the last couple of years, and pretty quickly I found a strange thing that I haven’t seen many people mention. That is that, now, I have to maneuver two artists through the gauntlet of my finicky tastes. I must first be happy to read the author’s work and then, from among his or her works, find something read by a narrator who I enjoy listening to. To wit, for me, Stephen King read by Frank Muller is the apotheosis of a great audiobook experience.
Of course, to truly be a reader has always been an adventure of exploring a wide variety of literary voices, and “liking” them is not always the point, but traditionally, if you find a distasteful trend in the works of more than a book or two by an author, you can pretty safely attribute it to the author and whatever editing/production method they employ, and if it’s overly distracting, you move on to someone else. But in audiobooks, there’s the additional possibility that what turned you off was not the writing but the narrator’s delivery of the work.
I clearly digress here, but my not-so-quick side point is that unless you MUST consume a specific title’s content for some reason, reading is most often done for pleasure, and though no author sets out to write an unenjoyable book and no narrator intentionally mutilates the package they agreed to deliver, you do, with audios, already have two parties who may not meet the expectations or standards of your personal enjoyment goals. And, I believe, all three parties – author (including their entire publishing mechanism), narrator (including their entire production mechanism), and listener – inherently recognize and accept their respective roles. As such, audiobooks are a new and distinct medium, more so than ebooks, which, in my opinion, are primarily just a paperless format of books.
So, to my main point, if mainstream publishers were to decide to go to AI readers, simply to save money (to make more money, in other words), I believe that is a great and horrendous exploitation of the medium. Further, I fear that for those who have no other options or mediums with which to consume books, such a betrayal in the sacred act of conveying words for pleasure will constitute such a duplicitous blow that the entire endeavor of trying to find something “good” to read will no longer be worth the pursuit.
That may sound extreme, but in many cases, we’re talking about readers who have lost ability after ability to finally find themselves consuming a burger and fries through a straw, and now, straws of reasonable quality will be phased out for those consisting of barbed wire. I mean, robot voices are horrible! A whole book read by “someone” who doesn’t know to incorporate pauses or fluctuations except where the author has placed commas; where every sentence ending with a question mark will receive the same identical upturned voice as every other; where every typo or misspelling in the text will be read as though it’s correct; where 9 out of 10 prepositions like “of” and “about” sound out of place from the rest of the sentence? Hell, no! There are video ads on YouTube selling AI voices that, after talking for a while, say “you will probably be amazed to learn that what you’re hearing right now is not a real human!” Uhm, no, that was obvious (to me) in the first 5 seconds.
I suppose I should address the other sides of the debate. While this, for me, is largely connected to the issue of the death of customer service and reflects an irrefutable confirmation of profits being more important than people (in this case, I’m referring to customers or “end users”), it is also the theft of a trade and livelihood from those skilled vocal artisans who have no other way to do the job of narration except the hard way – research, reading, and retakes, and then all the splicing and doctoring. I’ll grieve for them as well, just as I do the service workers of the gas stations of long ago and the supermarkets of (literally) yesterday. Yet, as I mentioned a long, long time ago, I am often quite perturbed by the end product of many narrators. So, if I’m perfectly honest, if AI voices were ever advanced to where the issues I enumerated just above were overcome, I might very well put my listening pleasure above the needs of the struggling artisan. God knows, I’ve certainly tried to encourage Tami Hoag to some better form of writing by refusing to buy her horrible stuff, but I haven’t been too successful with that. What I’m trying to say, not very artfully, I admit, is that there are too many narrators who have no business doing the work already. If we could infuse the profession with a little healthy competition to drop the dregs, I’d be OK with that – after all, there will always be passion and voices and tempo and feeling that only voice actors will be able to rise to, regardless of how good AI tech gets. But the truth is, if we start buying (and buying into) AI, it won’t be long before no producers will pay for REAL narrators, despite the realism they can bring. So, readers of all ilks should stand firm with narrators and voice actors of all ilks now (even if that means praising Amazon and Audible for the stand they’re taking), or we’ll all be hurting tomorrow.
P.S. I think Erich Bailey is an AI and they have 35 titles on Audible. How do I confirm it so I don’t find myself reporting someone who just has a robotic voice? I’ve only listened to their samples so far, and I’m quite convinced, but I want to be sure. Is there a group tracking AI voices, where maybe I could give them his name?
Thank you so much for taking the time and sharing your perspective! I agree with most of what you said. The good news is that I wrote that blog post almost a year ago, and throughout 2022, I haven’t noticed any AI-read audiobooks. That’s not to say that there aren’t any. But at least it has not yet caught on as a trend – at least not in the genres I follow closely.
I looked into Erich Bailey and his audiobooks were published between 2013 and 2020, which is a strong indicator that it’s not an AI. The technology just wasn’t advanced enough back then. Funnily enough, he seems to adopt a certain melody for some books which reminds me a bit of AI voices too. His fiction books sound much more natural than the nonfiction ones.
I’m listening to Keep Her Safe by Sophie Hannah, and the AI voice is making me insane. I love her work, but the AI parts are AWFUL.