
\section{\module{alsaaudio}}

%\declaremodule{builtin}{alsaaudio}	% standard library, in C
\declaremodule{extension}{alsaaudio}	% not standard, in C

\platform{Linux}

\moduleauthor{Casper Wilstrup} % {cwi@aves.dk} % Author of the module code;


\modulesynopsis{ALSA sound support}


The \module{alsaaudio} module defines functions and classes for using
ALSA.

% ---- 3.1. ----
% For each function, use a ``funcdesc'' block.  This has exactly two
% parameters (each parameters is contained in a set of curly braces):
% the first parameter is the function name (this automatically
% generates an index entry); the second parameter is the function's
% argument list.  If there are no arguments, use an empty pair of
% curly braces.  If there is more than one argument, separate the
% arguments with backslash-comma.  Optional parts of the parameter
% list are contained in \optional{...} (this generates a set of square
% brackets around its parameter).  Arguments are automatically set in
% italics in the parameter list.  Each argument should be mentioned at
% least once in the description; each usage (even inside \code{...})
% should be enclosed in \var{...}.

\begin{funcdesc}{mixers}{\optional{cardname}}
List the available mixers. The optional \var{cardname} specifies which
card should be queried (this is only relevant if you have more than one
sound card). Omit to use the default sound card.
\end{funcdesc}

\begin{classdesc}{PCM}{\optional{type}, \optional{mode}, \optional{cardname}}
This class is used to represent a PCM device (both playback and capture devices).
The arguments are: \\
\var{type} - can be either PCM_CAPTURE or PCM_PLAYBACK (default). \\
\var{mode} - can be either PCM_NONBLOCK, PCM_ASYNC, or PCM_NORMAL (the default).\\
\var{cardname} - specifies which card should be used (this is only relevant
if you have more than one sound card). Omit to use the default sound card.
\end{classdesc}

\begin{classdesc}{Mixer}{\optional{control}, \optional{id}, \optional{cardname}}
This class is used to access a specific ALSA mixer.
The arguments are: \\
\var{control} - Name of the chosen mixer (default is Master). \\
\var{id} - id of mixer (default is 0) -- More explanation needed here\\
\var{cardname} - specifies which card should be used (this is only relevant
if you have more than one sound card). Omit to use the default sound card.
\end{classdesc}


\begin{excdesc}{ALSAAudioError}
Exception raised when an operation fails for an ALSA-specific reason.
The exception argument is a string describing the reason for the
failure.
\end{excdesc}

\subsection{PCM Terminology and Concepts}

In order to use PCM devices it is useful to be familiar with some concepts and
terminology.

\begin{description}
\item[Sample] PCM audio, whether it is input or output, consists at the lowest level
of a number of single samples. A sample represents the sound in a single channel in
a brief interval. If more than one channel is in use, more than one sample is required
for each interval to describe the sound. Samples can be of many different sizes, ranging
from 8 bit to 64 bit precision. The specific format of each sample can also vary - they
can be big endian byte order, little endian byte order, or even floats.

\item[Frame] A frame consists of exactly one sample per channel. If there is only one
channel (Mono sound) a frame is simply a single sample. If the sound is stereo, each frame
consists of two samples, etc.

\item[Frame size] This is the size in bytes of each frame. This can vary a lot: if each sample is
8 bits, and we're handling mono sound, the frame size is one byte. Similarly in 6 channel audio with
64 bit floating point samples, the frame size is 48 bytes.

\item[Rate] PCM sound consists of a flow of sound frames. The sound rate controls how often
the current frame is replaced. For example, a rate of 8000 Hz means that a new frame is played
or captured 8000 times per second.

\item[Data rate] This is the number of bytes that must be recorded or provided per second
at a certain frame size and rate.

8000 Hz mono sound with 8 bit (1 byte) samples has a data rate of 8000 * 1 * 1 = 8000 bytes/s (8 kB/s).

At the other end of the scale, 96000 Hz, 6 channel sound with 64 bit (8 bytes) samples
has a data rate of 96000 * 6 * 8 = 4608000 bytes/s (4608 kB/s, or about 4.6 MB of sound data per second).

\item[Period] The hardware processes data in chunks of frames. The time interval between two
successive processing steps (A/D or D/A conversions) is known as the period. The size of the period has
a direct impact on the latency of the sound input or output. For low latency the period size should
be very small, while low CPU usage would usually demand larger period sizes. With ALSA,
CPU utilization is not affected much by the period size: the kernel layer buffers multiple
periods internally, so each period generates an interrupt and a memory copy, but userspace can
run slower and read or write multiple periods at a time.

\item[Period size] This is the size of each period in frames. \emph{Not bytes, but frames!} In \module{alsaaudio}
the period size is set directly, and it is therefore important to understand the significance of this
number. If the period size is configured to, for example, 32, each write should contain exactly 32 frames
of sound data, and each read will return either 32 frames of data or nothing at all.

\end{description}
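The arithmetic behind these terms can be sketched in a few lines of Python. This is only an
illustration: the helper names \code{frame_size}, \code{data_rate} and \code{period_duration} are
made up for this example and are not part of \module{alsaaudio}.

```python
def frame_size(channels, sample_bits):
    """Size of one frame in bytes: one sample per channel."""
    return channels * (sample_bits // 8)


def data_rate(rate_hz, channels, sample_bits):
    """Number of bytes that must be recorded or provided per second."""
    return rate_hz * frame_size(channels, sample_bits)


def period_duration(period_frames, rate_hz):
    """Length of one period in seconds."""
    return period_frames / float(rate_hz)


# The two extremes used in the text above.
print(frame_size(1, 8))            # 1 byte per frame
print(frame_size(6, 64))           # 48 bytes per frame
print(data_rate(8000, 1, 8))       # 8000 bytes per second
print(data_rate(96000, 6, 64))     # 4608000 bytes per second
```

With a period size of 32 frames at 8000 Hz, \code{period_duration(32, 8000)} gives 0.004 seconds,
so such a device has to be serviced roughly every 4 milliseconds.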

Once you understand these concepts, you will be ready to use the PCM API. Read on.

\subsection{PCM Objects}
\label{pcm-objects}

The acronym PCM is short for Pulse Code Modulation and is the method used in ALSA
and many other places to handle playback and capture of sampled sound data.

PCM objects in \module{alsaaudio} are used to do exactly that, either play sample based
sound or capture sound from some input source (perhaps a microphone). The PCM object
constructor takes the following arguments:

\begin{classdesc}{PCM}{\optional{type}, \optional{mode}, \optional{cardname}}

\var{type} - can be either PCM_CAPTURE or PCM_PLAYBACK (default).

\var{mode} - can be either PCM_NONBLOCK, PCM_ASYNC, or PCM_NORMAL (the default).
In PCM_NONBLOCK mode, calls to read will return immediately regardless of whether
there is any actual data to read. Similarly, write calls will return immediately
without actually writing anything to the playout buffer if the buffer is full.

In the current version of \module{alsaaudio} PCM_ASYNC is useless, since it relies
on a callback procedure, which can't be specified from Python.

\var{cardname} - specifies which card should be used (this is only relevant
if you have more than one sound card). Omit to use the default sound card.

This will construct a PCM object with default settings:

Sample format: PCM_FORMAT_S16_LE \\
Rate: 8000 Hz \\
Channels: 2 \\
Period size: 32 frames \\
\end{classdesc}

PCM objects have the following methods:

\begin{methoddesc}[PCM]{pcmtype}{}
Returns the type of PCM object. Either PCM_CAPTURE or PCM_PLAYBACK.
\end{methoddesc}

\begin{methoddesc}[PCM]{pcmmode}{}
Return the mode of the PCM object. One of PCM_NONBLOCK, PCM_ASYNC, or PCM_NORMAL.
\end{methoddesc}

\begin{methoddesc}[PCM]{cardname}{}
Return the name of the sound card used by this PCM object.
\end{methoddesc}

\begin{methoddesc}[PCM]{setchannels}{nchannels}
Used to set the number of capture or playback channels. Common values are: 1 = mono, 2 = stereo,
and 6 = full 6 channel audio. Few sound cards support more than 2 channels.
\end{methoddesc}

\begin{methoddesc}[PCM]{setrate}{rate}
Set the sample rate in Hz for the device. Typical values are 8000 (poor sound), 16000, 44100 (CD quality),
and 96000.
\end{methoddesc}

\begin{methoddesc}[PCM]{setformat}{format}
Sets the sound format of the device to \var{format}. The sound format controls how the PCM device
interprets data for playback, and how data is encoded in captures.

The following formats are provided by ALSA:
\begin{tableii}{l|l}{Formats}{Format}{Description}
  \lineii{PCM_FORMAT_S8}{Signed 8 bit samples for each channel}
  \lineii{PCM_FORMAT_U8}{Unsigned 8 bit samples for each channel}
  \lineii{PCM_FORMAT_S16_LE}{Signed 16 bit samples for each channel (Little Endian byte order)}
  \lineii{PCM_FORMAT_S16_BE}{Signed 16 bit samples for each channel (Big Endian byte order)}
  \lineii{PCM_FORMAT_U16_LE}{Unsigned 16 bit samples for each channel (Little Endian byte order)}
  \lineii{PCM_FORMAT_U16_BE}{Unsigned 16 bit samples for each channel (Big Endian byte order)}
  \lineii{PCM_FORMAT_S24_LE}{Signed 24 bit samples for each channel (Little Endian byte order)}
  \lineii{PCM_FORMAT_S24_BE}{Signed 24 bit samples for each channel (Big Endian byte order)}
  \lineii{PCM_FORMAT_U24_LE}{Unsigned 24 bit samples for each channel (Little Endian byte order)}
  \lineii{PCM_FORMAT_U24_BE}{Unsigned 24 bit samples for each channel (Big Endian byte order)}
  \lineii{PCM_FORMAT_S32_LE}{Signed 32 bit samples for each channel (Little Endian byte order)}
  \lineii{PCM_FORMAT_S32_BE}{Signed 32 bit samples for each channel (Big Endian byte order)}
  \lineii{PCM_FORMAT_U32_LE}{Unsigned 32 bit samples for each channel (Little Endian byte order)}
  \lineii{PCM_FORMAT_U32_BE}{Unsigned 32 bit samples for each channel (Big Endian byte order)}
  \lineii{PCM_FORMAT_FLOAT_LE}{32 bit samples encoded as float (Little Endian byte order)}
  \lineii{PCM_FORMAT_FLOAT_BE}{32 bit samples encoded as float (Big Endian byte order)}
  \lineii{PCM_FORMAT_FLOAT64_LE}{64 bit samples encoded as float (Little Endian byte order)}
  \lineii{PCM_FORMAT_FLOAT64_BE}{64 bit samples encoded as float (Big Endian byte order)}
  \lineii{PCM_FORMAT_MU_LAW}{A logarithmic encoding (used by Sun .au files)}
  \lineii{PCM_FORMAT_A_LAW}{Another logarithmic encoding}
  \lineii{PCM_FORMAT_IMA_ADPCM}{A 4:1 compressed format defined by the Interactive Multimedia Association}
  \lineii{PCM_FORMAT_MPEG}{MPEG encoded audio?}
  \lineii{PCM_FORMAT_GSM}{9600 constant rate encoding well suited for speech}
\end{tableii}

\end{methoddesc}

\begin{methoddesc}[PCM]{setperiodsize}{period}
Sets the actual period size in frames. Each write should consist of exactly this number of frames, and
each read will return this number of frames (unless the device is in PCM_NONBLOCK mode, in which case
it may return nothing at all).
\end{methoddesc}
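The four setter methods are typically called together, right after the PCM object has been
constructed. The following is a minimal sketch of that sequence. It assumes that \method{setformat}
takes the format constant as its only argument. \code{configure_pcm} and \code{RecordingPCM} are
hypothetical names introduced for this example; the stand-in class lets the sketch run without
sound hardware.

```python
class RecordingPCM(object):
    """Hypothetical stand-in for a PCM object; it only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def setchannels(self, nchannels):
        self.calls.append(("setchannels", nchannels))

    def setrate(self, rate):
        self.calls.append(("setrate", rate))

    def setformat(self, sample_format):
        self.calls.append(("setformat", sample_format))

    def setperiodsize(self, period):
        self.calls.append(("setperiodsize", period))


def configure_pcm(pcm, nchannels, rate, sample_format, period):
    """Apply the four basic settings to an already constructed PCM-like object."""
    pcm.setchannels(nchannels)
    pcm.setrate(rate)
    pcm.setformat(sample_format)
    pcm.setperiodsize(period)
    return pcm


# With real hardware one would pass alsaaudio.PCM(alsaaudio.PCM_PLAYBACK) and
# alsaaudio.PCM_FORMAT_S16_LE; the string below is only a placeholder.
demo = configure_pcm(RecordingPCM(), 2, 44100, "PCM_FORMAT_S16_LE", 160)
print(demo.calls)
```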

\begin{methoddesc}[PCM]{read}{}
In PCM_NORMAL mode, this function blocks until a full period is available, and then returns a
tuple (length,data) where \emph{length} is the size in bytes of the captured data, and \emph{data}
is the captured sound frames as a string. The length of the returned data will be periodsize*framesize
bytes.

In PCM_NONBLOCK mode, the call will not block, but will return \code{(0,'')} if no new period
has become available since the last call to read.
\end{methoddesc}
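In PCM_NONBLOCK mode the caller has to cope with empty reads. One possible polling loop is
sketched below. \code{capture_periods} is a hypothetical helper, and \code{ScriptedPCM} is a
stand-in that replays canned results so that the example runs without a capture device. A real
program would pass a PCM object opened with PCM_CAPTURE and PCM_NONBLOCK instead.

```python
import time


class ScriptedPCM(object):
    """Hypothetical stand-in whose read() replays a fixed list of results."""

    def __init__(self, results):
        self.results = list(results)

    def read(self):
        return self.results.pop(0)


def capture_periods(pcm, wanted, poll_interval=0.001, sleep=time.sleep):
    """Collect `wanted` non-empty periods and return them joined together."""
    chunks = []
    while len(chunks) < wanted:
        length, data = pcm.read()
        if length == 0:
            # No new period since the last call: wait briefly, then poll again.
            sleep(poll_interval)
            continue
        chunks.append(data)
    return b"".join(chunks)


fake = ScriptedPCM([(0, b""), (4, b"abcd"), (0, b""), (4, b"efgh")])
captured = capture_periods(fake, 2, sleep=lambda seconds: None)
```

The short sleep keeps the loop from spinning at full CPU speed while no data is available. A
suitable interval is probably some fraction of the period duration.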

\begin{methoddesc}[PCM]{write}{data}
Writes (plays) the sound in data. The length of data \emph{must} be a multiple of the frame size, and
\emph{should} be exactly the size of a period. If fewer than 'period size' frames are provided, the actual
playout will not happen until more data is written.

If the device is not in PCM_NONBLOCK mode, this call will block if the kernel buffer is full,
until enough sound has been played to allow the sound data to be buffered. The call always returns
the size of the data provided.

In PCM_NONBLOCK mode, the call will return immediately, with a return value of zero, if the buffer is
full. In this case, the data should be written at a later time.

\end{methoddesc}
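Since each write should carry exactly one period, longer sound data has to be cut into
period-sized pieces first. The sketch below shows one way to do that and to retry when a
non-blocking device reports a full buffer by returning zero. \code{split_periods},
\code{write_all} and \code{BusyPCM} are hypothetical names. Padding the last piece with zero
bytes is an assumption that only amounts to silence for signed sample formats.

```python
import time


def split_periods(data, period_bytes):
    """Cut data into pieces of period_bytes; the last piece is padded with zero bytes."""
    chunks = []
    for start in range(0, len(data), period_bytes):
        chunk = data[start:start + period_bytes]
        if len(chunk) < period_bytes:
            chunk = chunk + b"\x00" * (period_bytes - len(chunk))
        chunks.append(chunk)
    return chunks


def write_all(pcm, data, period_bytes, retry_interval=0.001, sleep=time.sleep):
    """Write data one period at a time, retrying whenever write() returns zero."""
    for chunk in split_periods(data, period_bytes):
        while pcm.write(chunk) == 0:
            # Buffer full (PCM_NONBLOCK mode): try the same period again later.
            sleep(retry_interval)


class BusyPCM(object):
    """Hypothetical stand-in that rejects every other write, like a full buffer."""

    def __init__(self):
        self.accepted = []
        self.full = True

    def write(self, data):
        if self.full:
            self.full = False
            return 0
        self.full = True
        self.accepted.append(data)
        return len(data)


busy = BusyPCM()
write_all(busy, b"abcdefg", 3, sleep=lambda seconds: None)
```

Here \var{period_bytes} stands for the period size multiplied by the frame size. In PCM_NORMAL
mode the retry loop never triggers, because the call blocks instead of returning zero.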

\strong{A few hints on using PCM devices for playback}

The most common cause of problems with playback of PCM audio is a failure to understand
that writes to PCM devices must match \emph{exactly} the data rate of the device.

If too little data is written to the device, it will underrun, and ugly clicking sounds will occur. Conversely,
if too much data is written to the device, the write function will either block (PCM_NORMAL mode) or return zero
(PCM_NONBLOCK mode).

If your program does nothing but play sound, the easiest way is to put the device in PCM_NORMAL mode, and just
write as much data to the device as possible. The same strategy can be used in a larger program by dedicating
a separate thread to the sole task of playing out sound.

In GUI programs, however, it may be a better strategy to set up the device, preload the buffer with a few
periods by calling write a couple of times, and then use some timer method to write one period size of data to
the device every period. The purpose of the preloading is to avoid underrun clicks if the timer
doesn't expire exactly on time.

Also note that most timer APIs that you can find for Python will accumulate time delays: if you set the timer
to expire after 1/10th of a second, the actual timeout will happen slightly later, and these delays add up to
quite a lot after a few seconds. Hint: use time.time() to check how much time has really passed, and add
extra writes as necessary.
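The time.time() hint can be turned into a small calculation: instead of writing one period
per timer tick, work out from the elapsed wall-clock time how many periods should have been
written by now. The function below is only a sketch of that idea. \code{periods_due} and its
\var{preload} parameter are hypothetical and not part of \module{alsaaudio}.

```python
def periods_due(elapsed_seconds, rate_hz, period_frames, periods_written, preload=2):
    """Number of periods to write now so that the buffer stays `preload` periods ahead.

    elapsed_seconds is the wall-clock time since playback started, for example
    time.time() - start_time measured inside the timer callback.
    """
    periods_played = int(elapsed_seconds * rate_hz / period_frames)
    target = periods_played + preload
    return max(0, target - periods_written)


# Half a second into an 8000 Hz stream with 32 frame periods, 125 periods have
# been played. If only 120 were written so far, 7 more are needed to be 2 ahead.
print(periods_due(0.5, 8000, 32, 120))
```

A timer callback would call this function on every tick and issue that many writes, so a late
tick is followed by extra writes rather than by an underrun.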

\subsection{Mixer Objects}
\label{mixer-objects}

Mixer objects provide access to the ALSA mixer API.

\begin{classdesc}{Mixer}{\optional{control}, \optional{id}, \optional{cardname}}
\var{control} - specifies which control to manipulate using this mixer object. The list
of available controls can be found with the \module{alsaaudio}.\function{mixers} function.
The default value is 'Master' - other common controls include 'Master Mono', 'PCM', 'Line', etc.

\var{id} - the id of the mixer control. Default is 0.

\var{cardname} - specifies which card should be used (this is only relevant
if you have more than one sound card). Omit to use the default sound card.
\end{classdesc}

Mixer objects have the following methods:

\begin{methoddesc}[Mixer]{cardname}{}
Return the name of the sound card used by this Mixer object.
\end{methoddesc}

\begin{methoddesc}[Mixer]{mixer}{}
Return the name of the specific mixer controlled by this object, for example 'Master'
or 'PCM'.
\end{methoddesc}

\begin{methoddesc}[Mixer]{mixerid}{}
Return the ID of the ALSA mixer controlled by this object.
\end{methoddesc}

\begin{methoddesc}[Mixer]{switchcap}{}
Returns a list of the switches which are defined by this specific mixer. Possible values in
this list are:

\begin{tableii}{l|l}{Switches}{Switch}{Description}
  \lineii{'Mute'}{This mixer can be muted}
  \lineii{'Joined Mute'}{This mixer can mute all channels at the same time}
  \lineii{'Playback Mute'}{This mixer can mute the playback output}
  \lineii{'Joined Playback Mute'}{Mute playback for all channels at the same time}
  \lineii{'Capture Mute'}{Mute sound capture}
  \lineii{'Joined Capture Mute'}{Mute sound capture for all channels at a time}
  \lineii{'Capture Exclusive'}{Not quite sure what this is}
\end{tableii}

To manipulate these switches use the \method{setrec} or \method{setmute} methods.
\end{methoddesc}

\begin{methoddesc}[Mixer]{volumecap}{}
Returns a list of the volume control capabilities of this mixer. Possible values in
the list are:

\begin{tableii}{l|l}{Volume Capabilities}{Capability}{Description}
  \lineii{'Volume'}{This mixer can control volume}
  \lineii{'Joined Volume'}{This mixer can control volume for all channels at the same time}
  \lineii{'Playback Volume'}{This mixer can manipulate the playback volume}
  \lineii{'Joined Playback Volume'}{Manipulate playback volume for all channels at the same time}
  \lineii{'Capture Volume'}{Manipulate sound capture volume}
  \lineii{'Joined Capture Volume'}{Manipulate sound capture volume for all channels at a time}
\end{tableii}

\end{methoddesc}

\begin{methoddesc}[Mixer]{getvolume}{\optional{direction}}
Returns a list with the current volume settings for each channel. The list elements
are integer percentages.

The optional \var{direction} argument can be either 'playback' or 'capture', which is relevant
if the mixer can control both playback and capture volume. The default value is 'playback'
if the mixer has this capability, otherwise 'capture'.

\end{methoddesc}

\begin{methoddesc}[Mixer]{getmute}{}
Return a list indicating the current mute setting for each channel. 0 means not muted, 1 means muted.

This method will fail if the mixer has no playback switch capabilities.
\end{methoddesc}

\begin{methoddesc}[Mixer]{getrec}{}
Return a list indicating the current record setting for each channel. 0 means not recording, 1
means recording.

This method will fail if the mixer has no capture switch capabilities.
\end{methoddesc}

\begin{methoddesc}[Mixer]{setvolume}{volume,\optional{channel},\optional{direction}}
Change the current volume settings for this mixer. The \var{volume} argument controls
the new volume setting as an integer percentage.

If the optional argument \var{channel} is present, the volume is set only for this channel. This
assumes that the mixer can control the volume for the channels independently.

The optional \var{direction} argument can be either 'playback' or 'capture'. It is relevant if the mixer
has independent playback and capture volume capabilities, and controls which of the volumes
is changed. The default is 'playback' if the mixer has this capability, otherwise 'capture'.
\end{methoddesc}
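\method{getvolume} and \method{setvolume} combine naturally into a relative volume change. The
sketch below raises or lowers every channel by a number of percentage points and keeps the result
within 0 to 100. It assumes that \var{channel} is the zero-based position of the channel in the
list returned by \method{getvolume}. \code{nudge_volume} and \code{FakeMixer} are hypothetical
names used so that the example runs without a sound card.

```python
class FakeMixer(object):
    """Hypothetical stand-in that stores one volume percentage per channel."""

    def __init__(self, levels):
        self.levels = list(levels)

    def getvolume(self):
        return list(self.levels)

    def setvolume(self, volume, channel):
        self.levels[channel] = volume


def nudge_volume(mixer, delta):
    """Shift the volume of each channel by delta points, clamped to 0..100."""
    new_levels = []
    for channel, level in enumerate(mixer.getvolume()):
        new_level = max(0, min(100, level + delta))
        mixer.setvolume(new_level, channel)
        new_levels.append(new_level)
    return new_levels


mixer = FakeMixer([95, 50])
print(nudge_volume(mixer, 10))
```

Setting each channel separately preserves the balance between channels, which a single call
without the \var{channel} argument would not.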

\begin{methoddesc}[Mixer]{setmute}{mute, \optional{channel}}
Sets the mute flag to a new value. The \var{mute} argument is either 0 for not muted, or 1 for muted.

The optional \var{channel} argument controls which channel is muted. The default is to set the mute flag
for all channels.

This method will fail if the mixer has no playback mute capabilities.
\end{methoddesc}
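A mute toggle is a small example of using \method{getmute} and \method{setmute} together. The
following sketch mutes all channels unless every channel is already muted, in which case it
unmutes them. \code{toggle_mute} and \code{FakeSwitchMixer} are hypothetical names, and the
stand-in class only imitates the two methods used here.

```python
class FakeSwitchMixer(object):
    """Hypothetical stand-in that stores one mute flag (0 or 1) per channel."""

    def __init__(self, states):
        self.states = list(states)

    def getmute(self):
        return list(self.states)

    def setmute(self, mute):
        # Without a channel argument the flag is set for all channels.
        self.states = [mute] * len(self.states)


def toggle_mute(mixer):
    """Unmute if every channel is muted, otherwise mute all channels."""
    all_muted = all(state == 1 for state in mixer.getmute())
    new_state = 0 if all_muted else 1
    mixer.setmute(new_state)
    return new_state


switch = FakeSwitchMixer([0, 1])
print(toggle_mute(switch))
```

On a real mixer both calls fail if the mixer has no playback mute capability, so a program
would probably want to handle \exception{ALSAAudioError} around them.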

\begin{methoddesc}[Mixer]{setrec}{capture,\optional{channel}}
Sets the capture mute flag to a new value. The \var{capture} argument is either 0 for no capture,
or 1 for capture.

The optional \var{channel} argument controls which channel is changed. The default is to set the capture flag
for all channels.

This method will fail if the mixer has no capture switch capabilities.
\end{methoddesc}


\textbf{A Note on the ALSA Mixer API}

The ALSA mixer API is extremely complicated - and hardly documented at all. \module{alsaaudio} implements
a much simplified way to access this API. In designing the API I've had to make some choices which
may limit what can and cannot be controlled through the API. However, if I had chosen to implement the
full API, I would have reexposed the horrible complexity/documentation ratio of the underlying API.
At least the \module{alsaaudio} API is easy to understand and use.

If my design choices prevent you from doing something that the underlying API would have allowed,
please let me know, so I can incorporate these needs into future versions.

If the current state of affairs annoys you, the best you can do is to write a HOWTO on the API and
make this available on the net. Until somebody does this, the availability of ALSA mixer capable
devices will stay quite limited.

Unfortunately, I'm not able to create such a HOWTO myself, since I only understand half of the API,
and that which I do understand has come from a painful trial and error process.


% ==== 4. ====
\subsection{ALSA Examples \label{pcm-example}}

For now, the only examples available are the 'playbacktest.py' and the 'recordtest.py' programs included.
This will change in a future version.