One of the things I found pretty confusing during my transition from Windows to GNU/Linux as my primary OS was how audio worked.
In Windows, you don’t really have to think about anything, or know how to configure any specific utilities for the most part; audio just works. You might need to install a driver for a new headset or sound card, but that’s about as heavy as things get.
Audio in GNU/Linux has come a long way and nowadays offers much of the simplicity that users migrating from Windows are accustomed to, but there are still some nuances and terms that new users may not be familiar with.
This article is not meant to delve too deeply into things, and most of it will be common knowledge for anyone with mild experience in the GNU/Linux world, but hopefully it will help clarify some things for the greenhorns.
The image below shows how sound works in GNU/Linux, which will be expanded upon:
ALSA stands for “Advanced Linux Sound Architecture” and is the root of all sound in modern GNU/Linux distributions. In short, ALSA is the framework that sound drivers communicate through; you could loosely think of it as a sound driver in its own right.
There was another somewhat similar system called OSS (Open Sound System) that some people still prefer, but it has mostly been phased out and is rarely used anymore.
ALSA is nowadays the basis for all sound in a GNU/Linux system. ALSA is part of the kernel (Linux itself) and talks to the sound hardware; it in turn communicates with an audio server such as PulseAudio, which then communicates with the applications on the system. You can still have audio without a server like PulseAudio, but you lose a lot of functionality and customization, as well as other features we will cover shortly.
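To make the layering a little more concrete, here is a minimal sketch in Python that checks which of these layers appear to be present on a machine. It rests on a couple of assumptions: that the kernel exposes ALSA under /proc/asound, and that the sound server is installed as a binary named pulseaudio. The function name detect_audio_stack is made up for this example, and finding the binary only tells you PulseAudio is installed, not that it is running.

```python
import os
import shutil
from typing import Callable, Dict, Optional


def detect_audio_stack(
    alsa_proc_dir: str = "/proc/asound",
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Dict[str, bool]:
    """Report which audio layers appear to be present.

    alsa_proc_dir: directory the kernel creates when ALSA is loaded
                   (assumed to be /proc/asound).
    which:         lookup function for executables, injectable for testing.
    """
    return {
        # Kernel side: ALSA publishes its state under /proc/asound.
        "alsa": os.path.isdir(alsa_proc_dir),
        # User side: is a PulseAudio binary installed on the PATH?
        "pulseaudio": which("pulseaudio") is not None,
    }


if __name__ == "__main__":
    for layer, present in detect_audio_stack().items():
        print(f"{layer}: {'found' if present else 'not found'}")
```

On a system without a sound server you would expect ALSA to show up on its own, which matches the point above that audio can work without PulseAudio.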
PulseAudio is included with practically every major pre-built GNU/Linux operating system. Ubuntu, openSUSE, Manjaro, Mageia and Linux Mint, for example, all use PulseAudio.
I don’t generally like referencing Wikipedia, but a great explanation of PulseAudio can be found there in better words than I might have used...
“PulseAudio acts as a sound server, where a background process accepting sound input from one or more sources (processes, capture devices, etc) is created. The background process then redirects mentioned sound sources to one or more sinks (sound cards, remote network PulseAudio servers, or other processes).”
Essentially, PulseAudio sits between your applications and ALSA: it takes the sound your programs produce and directs it, through ALSA, to your speakers, headphones, etc.
Without PulseAudio, ALSA can typically only send sound to one place at a time. PulseAudio, on the other hand, allows sound to come from multiple sources at once and be sent out to multiple places at the same time.
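As a rough illustration of the “many sources, many sinks” idea, here is a toy model in Python. This is not how PulseAudio is implemented; it only shows the concept of summing several sample streams into one and handing a copy to every output. The function names mix and route, and the sink names in the example, are invented for this sketch.

```python
from typing import Dict, List


def mix(sources: List[List[float]]) -> List[float]:
    """Sum several sample streams into one, clamping to [-1.0, 1.0]."""
    length = max((len(s) for s in sources), default=0)
    mixed = []
    for i in range(length):
        # Shorter streams simply contribute nothing once they run out.
        total = sum(s[i] for s in sources if i < len(s))
        mixed.append(max(-1.0, min(1.0, total)))
    return mixed


def route(sources: List[List[float]], sink_names: List[str]) -> Dict[str, List[float]]:
    """Mix all sources and give every sink its own copy of the result."""
    mixed = mix(sources)
    return {name: list(mixed) for name in sink_names}


if __name__ == "__main__":
    browser = [0.5, 0.25]
    music_player = [0.25, 0.25, 0.5]
    print(route([browser, music_player], ["speakers", "headphones"]))
```

Both made-up sinks receive the same mixed stream, which is the behaviour described above: several programs playing at once, heard on more than one device.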
Another feature of PulseAudio is the ability to control the volume of separate applications independently. You can turn up YouTube in your browser and turn down Spotify, for example, without having to adjust one master volume for everything.
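For the curious, per-application volume can also be set from a script. The sketch below builds the command for pactl, the command-line tool that ships with PulseAudio, using its set-sink-input-volume subcommand. The helper name sink_input_volume_cmd, the stream indexes 12 and 15, and the 0 to 100 limit are assumptions made for this example; the real indexes on your system are listed by running pactl list sink-inputs.

```python
from typing import List


def sink_input_volume_cmd(sink_input_index: int, percent: int) -> List[str]:
    """Build a pactl command that sets one application's stream volume.

    sink_input_index: index of the stream, as shown by `pactl list sink-inputs`.
    percent:          target volume; limited to 0-100 here as a cautious choice.
    """
    if sink_input_index < 0:
        raise ValueError("sink input index must not be negative")
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return ["pactl", "set-sink-input-volume", str(sink_input_index), f"{percent}%"]


if __name__ == "__main__":
    # Hypothetical indexes: 12 for the browser, 15 for the music player.
    for index, volume in [(12, 90), (15, 30)]:
        # Only print the command; pass the list to subprocess.run to apply it.
        print(" ".join(sink_input_volume_cmd(index, volume)))
```

Printing rather than running the commands keeps the example harmless on machines without PulseAudio.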
Most desktop environments have their own utilities or tray tools for changing volumes and listening devices through PulseAudio, but there is an application called ‘pavucontrol’ that can be installed if you want to mess with PulseAudio directly and see exactly what I’m referring to. It’s straightforward and easy to figure out, and the package is available in practically every distribution’s repositories.
PulseAudio has numerous other features, but we will move on. If you want more information on PulseAudio, you can get it here.
JACK stands for JACK Audio Connection Kit. JACK is another sound server similar to PulseAudio, but it is more commonly used among DJs and audio professionals. It’s quite a bit more technical; however, it does support things like lower latency between devices, and it is very useful for connecting multiple devices together (hardware mixers, turntables, speakers, etc.) for professional use. Most people will never need to use JACK; PulseAudio works quite fine unless you need JACK for something specific.
Audio on GNU/Linux ‘sounds’ more complicated than it really is (see what I did there), and hopefully this article will help things make a little more sense when you’re browsing the web and seeing names like ALSA or PulseAudio being thrown around!
Ghacks is a technology news blog that was founded in 2005 by Martin Brinkmann. It has since then become one of the most popular tech news sites on the Internet with five authors and regular contributions from freelance writers.