Linux News
The world is talking about GNU/Linux and Free/Open Source Software

Machine Learning in Linux: Bark - Text-Prompted Generative Audio

Posted by sde on Jun 21, 2023 1:28 PM CST
LinuxLinks.com; By LinuxLinks

One of the standout machine learning apps is Stable Diffusion, a latent text-to-image diffusion model capable of generating photo-realistic images given any text input. We’ve explored quite a few hugely impressive web frontends such as Easy Diffusion, InvokeAI, and Stable Diffusion web UI.

Bark extends this theme to audio. It is a transformer-based text-to-audio model that can generate realistic multilingual speech from text, as well as other audio such as music, background noise, and simple sound effects. The model also generates nonverbal communication such as laughing, sighing, crying, and hesitations.

Bark follows a GPT-style architecture. It is not a conventional text-to-speech model but a fully generative text-to-audio model, so its output can deviate in unexpected ways from any given script.
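As a rough illustration of the text-to-audio workflow described above, the sketch below assumes the `bark` Python package from the Bark project and `scipy` are installed. The names `generate_audio`, `preload_models`, and `SAMPLE_RATE` follow the project's README at the time of writing. The helpers `build_prompt` and `synthesize` and the `NONVERBAL_CUES` table are hypothetical, introduced here only for illustration, and the bracketed cue tags come from the README's list rather than from this article.

```python
"""Sketch: text-to-audio with Bark (assumes `bark` and `scipy` are installed)."""

# Nonverbal cue tags as listed in the Bark README (an assumption, not from the article).
NONVERBAL_CUES = {"laugh": "[laughs]", "sigh": "[sighs]", "hesitate": "..."}


def build_prompt(sentences, cues=None):
    """Join sentences into one prompt, adding an optional cue tag after each.

    `cues` maps a sentence index to a key in NONVERBAL_CUES (hypothetical helper).
    """
    cues = cues or {}
    parts = []
    for i, sentence in enumerate(sentences):
        parts.append(sentence.strip())
        if i in cues:
            parts.append(NONVERBAL_CUES[cues[i]])
    return " ".join(parts)


def synthesize(prompt, out_path="bark_generation.wav"):
    """Generate audio for `prompt` and write it to a WAV file.

    Imports are deferred so build_prompt works without Bark installed.
    The first call downloads the model weights, which can take a while.
    """
    from bark import SAMPLE_RATE, generate_audio, preload_models
    from scipy.io.wavfile import write as write_wav

    preload_models()
    audio_array = generate_audio(prompt)  # NumPy array of audio samples
    write_wav(out_path, SAMPLE_RATE, audio_array)
    return out_path


if __name__ == "__main__":
    prompt = build_prompt(
        ["Well, that went fine.", "I think."],
        cues={0: "sigh"},
    )
    print(prompt)
    # synthesize(prompt)  # uncomment once Bark and its weights are available
```

Because the model is fully generative, the same prompt may produce different audio on each run, and the cue tags are hints rather than guaranteed effects.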

Story Type: Reviews; Groups: Linux, Multimedia, Python
