I’m setting up a laptop with Mint (Cinnamon) for a person who needs text-to-speech software. It seems like most of the nice-sounding ones are proprietary. Any recommendations for FOSS alternatives? And any ideas why this is an underdeveloped area for open source?

  • It seems like most of the nice-sounding ones are proprietary.

    That’s pretty standard. Most FOSS projects don’t have corporations feeding them hundreds of thousands of dollars. Even when they do, people still say GIMP is far worse than Photoshop. Blender is one of the rare complex projects that can compete with proprietary alternatives.

    And any ideas why this is an underdeveloped area for open source?

    My best guess is that it’s really expensive and time-consuming. I’d be surprised if those really good proprietary models didn’t cost $100k+ just for training.

  •  h3ndrik   ( @h3ndrik@feddit.de ) 
    9 months ago

    It’s been an underdeveloped topic for some time. espeak-ng is available on most distros and has some integrations that tie it into the desktop to some degree. There are more modern solutions that sound much better, for example Coqui’s XTTS v2, or Piper, which is part of Home Assistant nowadays. If your language is English, you have quite a few more options to choose from. But it’s a mixed bag whether they sound nice, are easy to install (that also depends on which Linux distro you use and whether it’s available as a package), and tie into the rest of the system. I’m not an expert on this, but I’d also like to have TTS and STT available on my Linux desktop without putting too much effort into it.
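For anyone who wants to try espeak-ng from a script, here is a rough Python sketch. It assumes the `espeak-ng` binary is on the PATH and only uses the `-v` (voice), `-s` (words per minute) and `-w` (write WAV) options. The function names `espeak_command` and `speak` are made up for this example and are not part of any library.

```python
import shutil
import subprocess


def espeak_command(text, voice="en", wpm=160, wav_path=None):
    """Build the espeak-ng argument list; nothing is executed here."""
    cmd = ["espeak-ng", "-v", voice, "-s", str(wpm)]
    if wav_path is not None:
        # -w writes the audio to a WAV file instead of playing it
        cmd += ["-w", wav_path]
    cmd.append(text)
    return cmd


def speak(text, **options):
    """Run espeak-ng if it is installed; return False if it is missing."""
    if shutil.which("espeak-ng") is None:
        return False
    subprocess.run(espeak_command(text, **options), check=True)
    return True


if __name__ == "__main__":
    speak("Hello from espeak-ng")
```

Keeping the command building separate from the actual call makes it easy to check the arguments without having a sound device, and the same pattern should carry over to other command-line TTS tools.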

  • TTS with Coqui XTTS is fun to run with a known voice (a 10-second WAV file is enough). It requires some resources, but far fewer than STT like faster-whisper. I think the main issue is not running them but integrating them with the OS and other software.
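A minimal sketch of the "known voice" setup, assuming the Coqui `TTS` Python package is installed and the XTTS v2 model can be downloaded on first use. The helper names `clip_seconds` and `clone_voice` and the file paths are placeholders invented for this example. The import is kept inside the function because loading the model is the resource-heavy part.

```python
import wave


def clip_seconds(wav_path):
    """Length of a WAV file in seconds, to check the reference clip."""
    with wave.open(wav_path, "rb") as clip:
        return clip.getnframes() / clip.getframerate()


def clone_voice(text, speaker_wav, out_path="out.wav", language="en"):
    """Synthesize text in the voice of speaker_wav (assumes Coqui TTS is installed)."""
    from TTS.api import TTS  # imported lazily: heavy dependency

    tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
    tts.tts_to_file(
        text=text,
        speaker_wav=speaker_wav,
        language=language,
        file_path=out_path,
    )
    return out_path
```

Something like `clip_seconds("reference.wav")` can confirm the sample is around ten seconds before calling `clone_voice("Hello there", "reference.wav")`.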