Every now and then, something will shake up the very foundation of the audiobook world. Recently, it has been the news from a publisher that they are developing AI voices for audiobooks. Understandably, audiobook narrators were shocked and appalled because their very livelihood is being threatened by a company many have worked with.
But what do audiobook listeners and fans think?
I asked in my Facebook group and got a lot of very interesting feedback. Below, you will find the arguments for and against AI audiobook voices from the point of view of avid audiobook fans. But first…
**The marked links and book covers on this page are affiliate links. If you use them to purchase something, I earn a fee at no additional cost for you. Disclosure**
What are AI Audiobook Voices?
Well, it is basically a (more or less) advanced form of text-to-speech. Instead of having an app read an ebook to you, though, you are supposed to buy these audiobooks individually. Their big advantage for publishers is that they save the cost of an actual human voice actor. And AI audiobooks are faster to produce. There is still editing and post-production happening though. And the AI voices are based on human voice actors. So, the cost isn’t actually that much lower. But we will get back to this topic later.
AI voices should be able to read the emotions in a scene from the context and adjust their narration accordingly. That’s where artificial intelligence comes in. The goal of AI voice developers is to have them be indistinguishable from human narrators and they claim to already be there.
Concerns in regard to AI Voices for Audiobooks
Many common points of criticism we audiobook fans have when it comes to certain human narrators (e.g. because they aren’t experienced yet or don’t have enough training) translate into concerns about AI voices for audiobooks. Is it easy to keep different characters apart during dialogue? Can the listener recognize what a character is thinking versus what they are saying out loud? Are inflection and pronunciation right? Does the narrator convey the emotions in a scene and draw us into the story?
Many people who only consume their books in audio format (be it because of a disability or circumstance) have experience with text-to-speech apps. Sometimes an ebook doesn’t have an audio version and if we really want to read that story, text-to-speech might do in a pinch.
AI voices for audiobooks fall on a scale between experienced voice actors on the one side and text-to-speech on the other. When you think of a Fiction novel with lots of dialogue and many emotions, a narrator uses a range of inflections and different voices to convey what is happening in the story. That makes it easier for listeners to follow: they can tell right away who is talking or thinking something and how the characters are feeling. Text-to-speech can’t do that at all.
And AI voices are not there yet either (I listened to the DeepZen samples and… yeah… this might “astound” people who never listen to audiobooks, but to audio fans who listen to narrators every day these samples just sound robotic). While the AI can recognize emotions from the context and change pitch, it isn’t remotely comparable to what human voice actors do (just for fun, check out these samples by Joel Leslie). AI falls flat when listeners expect to be pulled into a story.
If you have ever listened to an audiobook read by a narrator who just reads the words and doesn’t do voices, you might know how difficult it is to follow a Fiction book or a memoir if there isn’t any acting and if the reader doesn’t bring out the emotions in a story!
However, AI voices might work out decently for Nonfiction and textbooks, which don’t require a lot of emoting. But why would we pay for that instead of just using a text-to-speech app in the first place?
Related article: Audiobooks vs Reading Print/Ebooks
Anger about AI Audiobook Voices
Many audiobook fans are “narrator-motivated”, meaning, they would happily buy an audiobook from an author they don’t know, or in a genre they don’t usually favor, as long as it is narrated by their favorite. “I would listen to him/her read the phonebook!” is a popular line. New authors trying to get into the audiobook business can get a strong start if they can hire a popular voice actor because the talent will bring their own fanbase and avenues of promotion.
Every audiobook listener I know, even the casual ones, has a favorite narrator. And that name will always be a draw for them even if they never go to Twitter, Facebook, Insta, and just browse Audible a bit every other month. There is a reason why Audible gives you suggestions for other titles read by the narrator you just listened to!
And the die-hard audiobook narrator fans are actively angry about AI voices for audiobooks! They fear that the market will be smaller for their favorite human voices. And they feel it is simply shitty to do this at all because the narrator is so important for our audiobook experience.
For new voice actors or lesser-known ones doing niche titles, this is an even bigger problem. They don’t have much of a fanbase and they still have to be more expensive than AI voices. It will be hard for them to compete and their livelihood is even more threatened by this development.
For those of us listening to audiobooks every day, loving our favorite narrators dearly, and basically living and breathing audiobooks, the whole discussion about AI voices for audiobooks can feel a bit like a slap in the face because an audiobook is a very intimate experience. You have a person in your ear, telling you a story, eliciting emotions in you. The idea that this very human, very personal side of it gets ignored and we would have to listen to a robot voice feels foreign and worrying. It devalues our beloved voice actors as well as our own aural experience.
And that means some audiobook fans refuse to ever pay for an AI-generated audiobook on principle, no matter how good the AI voices get.
I can also see a vicious circle in the making here: Indie authors wanting to bring out their first audio, scared of the investment, choosing AI because it is a bit cheaper, then their audiobook won’t sell and they won’t bring out their books in audio format again “because it isn’t worth it”. The audiobook market is very distinct from the ebook market and needs its own marketing. Audiobooks as an afterthought hardly work. And they definitely won’t work if new authors try to save on the voice talent!
Related article: How do Authors get paid for Audiobooks from Libraries and Stores?
Benefits of AI Voices for Audiobooks
Money is important. Not just for authors trying to live off of their work (or at least not losing tons of money on it), but also for listeners – especially those on disability aid, those without library access, and those who can’t for whatever reason fall back on ebooks or second-hand print books. So, if AI-voiced audiobooks were to cut the cost considerably, and this would translate into noticeably lower costs for listeners, then that is a good thing.
If AI voices were so cheap that indie authors who don’t publish in audio format would offer their titles as audiobooks, that would be a great thing as well, especially for readers with disabilities.
However, as I said earlier, it is dangerous to assume that an audiobook should sell well just because the format is popular. Most often with indie books, it is the narrator who will first draw listeners to an author who is new to the audiobook market. If an author decides to go with a synthesized voice, they have to be aware that their target audience will be much smaller than it would be with a popular voice talent. They make their story more accessible though, which is appreciated!
Audible and ACX
Until recently, a lot of the AI audiobook discussion has been theoretical. DeepZen’s samples are bad. And Audible, the biggest audiobook marketplace in the world, doesn’t allow AI-narrated audiobooks.
So, what’s the fuss anyway, right?
Well, recently, it turned out that several AI audiobooks have been up for sale on Audible. The narration came from another company with much more advanced AI voices. Since Audible doesn’t allow non-human narration, the audiobook pages mentioned nowhere that it’s not a real narrator. On top of that, the voice had a name very similar to that of a human voice actor. In fact, if you clicked on the narrator link on the Audible page, you would get all the results from the real person!
So, in reality, publishers used the fact that Audible doesn’t allow AI voices to deliberately mislead customers. The narration was good enough for these audiobooks to pass ACX quality control, and they were then offered at full price on Audible. Instead of passing on the savings from using a synthesized voice, these publishers tried to earn more, at the expense of the customers who would get a sub-par narration, the authors who would see fewer sales of their audiobook (and who would likely only be paid at all once a threshold is reached), and the voice actors who lost out on these jobs.
Several narrators and I kept contacting Audible and ACX until we got satisfying answers. Audible removed these books and will hopefully take steps to prepare their quality control team for the advanced AI voices they will have to weed out from now on. The quality control team listens to a sample of every audiobook before it goes up for sale.
That means, in general, you can feel pretty safe that Audible won’t sell you an AI-narrated audiobook. But mistakes can happen. AI narration is now so advanced that there will be breathing sounds and different intonations. If you casually listen to a sample, you likely won’t be able to tell that it’s AI. But if you listen to an audiobook whose narration strikes you as oddly bad, it’s not impossible that it’s AI.
The Human Experience
AI voices for audiobooks are comparable to CGI actors in movies. It kind of works, and sometimes it works surprisingly well to patch something up. And yet, overall, you want the human experience in art. AI can’t emulate that.
Personally, I would like to see an improved text-to-speech app come out of these AI voice developments. Instead of trying to make more money with audiobooks by cutting out the real voice actors who made audiobooks so popular in the first place, give screenreader users better options! Many books, especially older niche titles, don’t have an audio version. And if we could have that read to us in a reasonably acceptable way, that would be a really cool thing and wouldn’t take anything away from human narrators either.
But replacing human voice actors with AI as a way for publishers to earn more on audiobooks is a bad and shortsighted business strategy. Audiobook listeners care very much about the quality of narration of a book and will not buy (or keep) an audiobook that is not read with that magical spark that only a good voice actor has!
What about you? Would you buy audiobooks with AI voices? Please share your opinion in a comment below!
By the way, if you can’t afford as many audiobooks as you would like, and want to support audiobook narrators and authors, check out these fantastic places to get free audiobook review copies.