I just listened to this AI generated audiobook and if it didn’t say it was AI, I’d have thought it was human-made. It has different voices, dramatization, sound effects… The last I’d heard about this tech was a post saying Stephen Fry’s voice was stolen and replicated by AI. But since then, nothing, even though it’s clearly advanced incredibly fast. You’d expect more buzz for something that went from detectable as AI to indistinguishable from humans so quickly. How is it that no one is talking about AI generated audiobooks and their rapid improvement? This seems like a huge deal to me.
simple ( @simple@lemm.ee ) English79•1 year agoA lot of people just aren’t aware of how fast AI is moving. AI voices were pretty meh earlier this year. A lot of people working on the audiobook/voice acting scene have been talking about this though.
driving_crooner ( @driving_crooner@lemmy.eco.br ) 30•1 year agoI recommend everyone check out the YouTube channel “Two Minute Papers”, which has been doing videos about papers on AI for the last 10 years or so, to see the accelerated progress AI has made. Like 5 years ago those AI-generated images looked like LSD-infused dreams and now they look almost perfect.
Magrath ( @Magrath@lemmy.ca ) 7•1 year agoI wish I could watch his videos but the way he talks is awful. It’s like some exaggerated evolution of YouTube talk.
Liempong_pagong ( @Liempong_pagong@beehaw.org ) 1•1 year agoIt’s great to be alive!
LadyLikesSpiders ( @LadyLikesSpiders@lemmy.ml ) 68•1 year agoAh yes, Audio AI. I can’t wait for this rapidly-approaching future where you literally won’t be able to trust the validity of anything your senses tell you anymore
AdmiralShat ( @AdmiralShat@programming.dev ) English22•1 year agoImagine the day when people post videos of the president saying literally anything with pitch perfect audio voice synth
Imagine going to prison for a generated clip of you confessing to a crime.
FaceDeer ( @FaceDeer@kbin.social ) 18•1 year agoOnce the tech is that good, a recording of your confession will be useless as evidence in court.
AdmiralShat ( @AdmiralShat@programming.dev ) English9•1 year ago…but it is already that good? The fact that celebrities are having to come out and say it wasn’t them in an ad is proof enough that it can fool people
You only need to fool a jury
FaceDeer ( @FaceDeer@kbin.social ) 8•1 year agoThen we’ll have to take more care with how jury trials are conducted. It’s always been possible to fool juries, that’s often a lawyer’s entire strategy.
Shyfer ( @Shyfer@ttrpg.network ) 15•1 year agoOr imagine politicians like Trump saying the most heinous stuff and then denying it saying it’s fake or AI. How will people know? You won’t even be able to trust your eyes or ears anymore.
Helix 🧬 ( @Helix@feddit.de ) 4•1 year agoGuess we’ll have to resort to digital watermarking with personal certificates then.
FaceDeer ( @FaceDeer@kbin.social ) 4•1 year agoHave you watched a movie, ever? There have always been special effects trickery.
LadyLikesSpiders ( @LadyLikesSpiders@lemmy.ml ) 6•1 year agoYes, but you could tell they weren’t real. They still needed real voice actors, real sound design, studios and stages and resources. Anyone with a halfway decent rig can fake shit to a very believable degree. Even with CGI you swear is fantastic, you see its fakeness once the novelty wears off
lol3droflxp ( @lol3droflxp@kbin.social ) 2•1 year agoI guess that AI generated stuff will also have some telltale signs of being fake for quite some time if you actually look for it.
LadyLikesSpiders ( @LadyLikesSpiders@lemmy.ml ) 1•1 year agoIt’s still not perfect, but it gets exponentially better every day
theskyisfalling ( @theskyisfalling@lemmy.dbzer0.com ) 23•1 year agoAs someone who only consumes books in audiobook form this is great news for me, I tried to listen to some automatically generated audio books around 2 years ago and I found them horrible to listen to just because they sounded so off.
I’d love to be able to copy in the text of a book and get an actually listenable (is that a proper word?) audiobook out of the other side for some books that will simply never be recorded by actual people due to being too old / obscure.
I’ve been wanting to be able to listen to the Pellucidar books for years but they just don’t exist in audio format. Is there somewhere publicly available that I can do this?
Just curious, but how come you only consume books in audio format? (Please forgive me if this was rude to ask.)
Bldck ( @Bldck@beehaw.org ) English7•1 year agoNot OP, but I almost exclusively read novels and non fiction via audiobooks. For context, I’m on pace for 70 books this year.
My main reason for audiobooks is that I have a driving commute. Two hours a day round trip. Audiobooks keep me sane in a way that podcasts or music do not. I also do audiobooks when doing chores around the house.
Second, I struggle to focus on reading a book on my phone. Too many distractions and I think the reading experience is subpar. I do have an eInk reader, but I haven’t charged it in years because it’s easier to do audiobooks.
Physical books are rare in my home, but that’s a self-reinforcing cycle since I enjoy audiobooks so much.
saigot ( @saigot@lemmy.ca ) 5•1 year agoI like to read books before bed, but need darkness for a while before I have any chance of going to sleep, so me and my wife listen to 45min of audio book a night before going to sleep. Plus when we listen together there is no need to worry about getting ahead of each other and spoiling stuff.
I read books in other scenarios but that ritual is by far the most time I have for reading and the most consistent as well.
Aww that seems sweet. :)
Catoblepas ( @Catoblepas@lemmy.blahaj.zone ) 4•1 year agoPersonally I mostly use audio books instead of reading because I get eye strain a lot easier than I used to. I go to an eye specialist for unrelated issues yearly, so it’s not an issue with a wrong lens prescription. It’s not a problem when I’m doing a low attention task where I can look away frequently, but for reading it sucks.
That makes sense. Yeah, eye issues are nothing to balk at. They can be really debilitating if handled incorrectly.
Not rude at all. Similar to the other responses people have given, but it is twofold really. Firstly I just don’t do well with sitting and reading a book: I get bored very quickly, can’t concentrate on what is happening and start re-reading sentences or pages over and over when I am not paying attention properly. Additionally, after only a couple of pages it will start putting me to sleep. I guess my attention span is just not sufficient for this form of media.
As a result I never read any books until I discovered audiobooks and my love for them. I honestly just disregarded books as a form of entertainment and thought they were a waste of time until discovering this way to consume them, which wasn’t until I was in my early 30s.
On top of that I now listen to them mostly at work, I work with industrial machines and the work is repetitive as fuck and having a book to listen to makes the time go a lot faster and in a lot more interesting manner. Consequently I now love books and will listen to between 6 and 10 hours a day and now listen to them when I’m doing things like cooking, cleaning or running when I am not at work.
crank ( @crank@beehaw.org ) English2•1 year agoBack in the 19th century when unions were powerful and innovative, a lot of people had jobs where they had to sit and do repetitive tasks in a room all day. A lot of it was handwork that didn’t have big loud machines.
So one of the demands made by workers in such situations was that the employer would pay someone to come in and provide entertainment such as reading a book or giving talks on subjects of interest. The book or lecturer of course being selected by the workers via the democratic process of the union. And then of course the workers became way more educated because they suddenly had 8-12 hours daily to read books together. Since knowledge is power, the workers became stronger and more decisive in their collective actions.
When you are listening to an audiobook at work you can know you are in a long tradition of workers exercising power over their job conditions. Although now it is individualized in the implementation. The desire to have your mind even though the job has your body and some concentration is universal.
I understand! Thank you for explaining! ^_^
crank ( @crank@beehaw.org ) English6•1 year agoWell you can always pay someone to read it for you. Blind people do that.
Are any of these books public domain? If so the print version could be eligible for inclusion at Project Gutenberg. PG has very specific docs about eligibility for this. You could probably get a scan from archive.org if you don’t have one. You would have to clean up the OCR by hand.
Then it would be eligible to be requested from the volunteer (human) readers who have been pumping out libre audiobooks for years at LibriVox.
Recently I saw Gutenberg has a collab. They are producing and distributing libre audiobooks generated by AI. I believe I read on one of the pages they have 4000 done. I haven’t tried it out but I guess I should.
Project Gutenberg, Microsoft, and MIT have worked together to create thousands of free and open audiobooks using new neural text-to-speech technology and Project Gutenberg’s large open-access collection of e-books. This project aims to make literature more accessible to (audio)book-lovers everywhere and democratize access to high quality audiobooks. Whether you are learning to read, looking for inclusive reading technology, or about to head out on a long drive, we hope you enjoy this audiobook collection.
I assume this is also a great benefit as fertilizer down at the old AI content farm which is otherwise totally run over with reddit shitposts.
If anyone tries it let me know how it goes.
The books I specifically mentioned are now public domain as they are old enough, and LibriVox is where I actually started my audiobook (and books in general) journey. One of them is on there but it is only the second book of what is a 5 or more book series, which is kinda frustrating.
The volunteer readers are very hit and miss however, and I find that more than half are just not listenable for me for different reasons, from poor recordings to poor reading ability by the reader, with excessive pauses and added “errs and ummms”, to constant mispronunciation of words. These are pedantic reasons maybe, and I throw no shade at the people that have volunteered their time to read these books, but I just can’t listen to them personally, for the same reason I could never get through any amount of time with a robotic text to speech program of the past.
I’ll look into the Project Gutenberg thing and see what is up with that, however. Thanks for making me aware of it :)
crank ( @crank@beehaw.org ) English2•1 year agoTotally true about the librivox readers. They are doing their best. :) There are some total gems in there. But I have definitely given up on a few of them. OTOH I have given up on professionally read audiobooks too for all sorts of reasons.
Absolutely, I love some of the LibriVox readers and have found new books I enjoyed immensely just from seeing what other things the ones I enjoyed had read. I found it a good way to find new books for a while, because usually they are reading other books they personally enjoy that are similar to the one I had looked for initially.
Likewise, just because they are “professionally read” doesn’t make them good by default. Some people’s voices or accents just don’t sit well with me trying to listen to them, which is no fault of their own and personal preference on my part, but some are just plain bad and I can’t believe someone paid them for that work and found it acceptable enough to release it into the wider world :D
WebTheWitted ( @WebTheWitted@beehaw.org ) 3•1 year agoI’m pretty sure that Amazon tried to do this with Kindle a few years ago and got sued by book publishers.
Ahh, it was audible.
It’s only a matter of time though before this sort of thing is ruled on and deals are inked. Open source is already getting pretty far too.
The books I mentioned and had in mind are old enough that they are now public domain and so this issue would not affect them. :)
Gamma ( @GammaGames@beehaw.org ) English2•1 year agoThis is a great use for it!
Terrasque ( @theterrasque@infosec.pub ) 1•1 year agoLook at the description of the video. It’s not automatically generated. He made several voices and narrator and applied it to each character.
While insanely cool, it’s not “put in book here, get audio book there”
Yes I realise that and was over simplifying in this response but as I stated in another comment I would be more than happy to work on prompts for myself if it could generate something satisfactory to listen to.
The video posted by OP still sounds a bit “dead” so I don’t think the tech is quite there yet but it is promising for the future the way it is headed.
Bebo ( @Bobo@lemm.ee ) English20•1 year agoI want TTS made better with AI so that I won’t need huge audiobooks filling up my phone. The epubs that I already have would serve as audiobooks when needed.
bionicjoey ( @bionicjoey@lemmy.ca ) 6•1 year agoIf your phone is rendering TTS on the fly that’s probably going to be a drain on battery.
Bebo ( @Bobo@lemm.ee ) English4•1 year agoI have frequently used TTS for listening to epubs. I have, however, not noticed much battery drain… And it’s not as enjoyable as listening to an audiobook read by a narrator you like but it kind of works to a certain extent. So I wish TTS would get better.
Gamma ( @GammaGames@beehaw.org ) English14•1 year agoBecause it has the potential to become actively harmful to the audiobook industry
Akrenion ( @Akrenion@programming.dev ) 8•1 year agoAnd great for accessibility for people who cannot read well.
GoldELox ( @Gold_E_Lox@lemmy.blahaj.zone ) 5•1 year agowhy should i care about the audio book industry? The biggest player is Amazon, it doesn’t add value to the art form, it’s just another way to become informed, and the more people who have that ability the better.
Gamma ( @GammaGames@beehaw.org ) English2•1 year agoBecause they are people. There are other options, you don’t have to support Amazon.
crank ( @crank@beehaw.org ) English3•1 year agoI think it is true because if they get the tech right the market could be saturated and voice actors will be in lower demand.
And the situation is already terrible for these workers. >90% of people buy and consume books via Audible, which is owned by Amazon. As I’m sure you can guess there is lots of shady stuff going on. Such as (but not limited to) the “Audiblegate” campaign, where workers discovered Amazon was engaging in massive systemic wage theft, a situation which is still ongoing to the best of my knowledge.
- Audiblegate - main website for “audiblegate” campaign
- Audiblegate Campaign: Fair Deal for Rights Holders - The Alliance of Independent Authors
- #Audiblegate: ALLi Campaign Update — Self-Publishing Advice Center from the Alliance of Independent Authors
- The Truth Behind Audible Subscription Earnings (2023) - this page has detailed information including correspondence, spreadsheets etc if anyone really wants to dive in - however it appears to have been republished from another source which I can’t identify
Some further context about Audible:
- Cory Doctorow is a Bestselling Author, but Audible Won’t Carry his Audiobooks - talking about the tech side, DRM and so-called Intellectual property
chicken ( @chicken@lemmy.dbzer0.com ) 8•1 year agoAudiobooks are offputting to me and I strongly prefer to read text, but this seems like a great thing overall for making books more accessible to people. More people experiencing a wider range of books is good.
Zikeji ( @Zikeji@programming.dev ) English3•1 year agoAudiobooks have been a great coping mechanism for my ADHD, they’ve also made me a better driver.
For the latter, if I listen to my music I definitely feel a bit more aggressive, whereas if it’s an audiobook (and I’ve given myself sufficient room), I’m much more forgiving.
For the former, I can mix them with menial tasks and it makes them so much more doable.
milicent_bystandr ( @milicent_bystandr@lemm.ee ) 8•1 year agoThat sounds pretty cool, though I’d be concerned it will suffer from the classic problem of current AI (…and humans, but that’s by the by) of confident incorrectness. Like an automatic transmission can miss meanings and types of context that a human will spot, programmatically generating speech can probably mess up punctuation and flow - even the way a human reader sometimes will get part way through a sentence and realise they need to start again for it to come out right.
That said, I can’t see it being a big problem for most works, just unfortunate here and there. For once it seems an AI application short on downsides! (Except for the usual economic ones for many people previously trained in the field.)
maxprime ( @maxprime@lemmy.ml ) 7•1 year agoI’ve been getting into audiobooks in a big way recently. This is interesting but somehow seems off to me. Maybe I’ll try listening to one and have my mind changed. We’ll see!
bonn2 ( @bonn2@lemm.ee ) 4•1 year agoThere are also a few AI sung songs out there that are pretty good. Most of them sound pretty Autotuny, but to some extent, that can be a style. Aura, by Ghost, is a good example. If I didn’t know it was ai, I would just think it was autotune.
BlazingFlames6073 ( @BlazingFlames6073@lemdro.id ) English3•1 year agoThis is amazing. In the future, I’d like to try this on old books I’ve read in the past just to check
lightnsfw ( @lightnsfw@reddthat.com ) 2•1 year agoPersonally I don’t consume audiobooks so this doesn’t affect me at all.
𝕸𝖔𝖘𝖘 ( @01189998819991197253@infosec.pub ) English1•1 year agoIt sounds like a generative model to me, but it’s probably the best one I’ve ever heard. Also, thanks for the link! I added it to my listen list!
ddh ( @DarkDarkHouse@lemmy.sdf.org ) English1•1 year agoBecause it’s not a new product.