Streaming/Pipeline/Plugin Options for Standards

Streaming Options

 

Audio or Media Application Programmer Interfaces (APIs)

OpenMax
http://en.wikipedia.org/wiki/OpenMAX
OpenMAX provides a programming interface with abstractions for routines that are especially useful for audio, video, and still images. It is intended for devices that process large amounts of multimedia data and is specifically designed for embedded and mobile devices. The API has three layers. The middle layer is described as being for connecting GStreamer and similar media frameworks to codecs and other components. This is a new standard and is still being developed. It is part of the Khronos Group of standards.

Audiere
http://audiere.sourceforge.net/
http://en.wikipedia.org/wiki/Audiere
An older high-level audio API that runs on Linux, Windows, and other operating systems.

PortAudio
http://www.portaudio.com/
PortAudio is a free, cross-platform, open-source audio I/O library. It lets you write simple audio programs in C that will compile and run on many platforms, including Windows, Macintosh (8, 9, X), Unix (OSS), SGI, and BeOS. PortAudio is intended to promote the exchange of audio synthesis software between developers on different platforms.

Open Media Library (OpenML)
http://en.wikipedia.org/wiki/OpenML
A programming environment designed for capturing, transporting, processing, displaying, and synchronizing digital media (2D and 3D graphics, audio and video processing, I/O, and networking).
OpenML did not achieve wide adoption in the industry, and there is little current development. It is part of the Khronos Group of standards.

OpenAL
http://en.wikipedia.org/wiki/OpenAL
An audio API designed for efficient rendering of multichannel, three-dimensional positional audio. It seems to be concerned only with rendering three-dimensional sound and would add very little to the Humanise.org project.

OpenSL_ES
http://en.wikipedia.org/wiki/OpenSL_ES
This standard seems to be similar to OpenAL, but it targets embedded systems. It is a new standard and is still being developed. It is part of the Khronos Group of standards. It is unlikely to add much to the Humanise.org project.

Streaming Frameworks

GStreamer
http://www.gstreamer.net/
http://en.wikipedia.org/wiki/GStreamer
GStreamer is a pipeline-based multimedia framework written in the C programming language, with a type system based on GObject. GStreamer allows a programmer to create a variety of media-handling components, including simple audio playback, audio and video playback, recording, streaming, and editing.
Designed to be cross-platform, it is known to work on Linux (x86, PowerPC and ARM), Solaris (x86 and SPARC), Mac OS X, Microsoft Windows and OS/400. GStreamer has bindings for programming languages such as Python, C++, Perl, GNU Guile and Ruby. GStreamer is free software, licensed under the GNU LGPL.
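
The pipeline idea (elements linked from a source, through filters, to a sink) can be illustrated in plain Python. This is only a conceptual sketch and does not use the GStreamer API; the element names sine_source, volume and collect_sink are hypothetical and chosen for illustration.

```python
import math
from typing import Iterable, Iterator, List


def sine_source(freq_hz: float, rate_hz: int, count: int) -> Iterator[float]:
    """Source element: yield `count` samples of a sine tone."""
    for n in range(count):
        yield math.sin(2 * math.pi * freq_hz * n / rate_hz)


def volume(samples: Iterable[float], gain: float) -> Iterator[float]:
    """Filter element: scale every sample by `gain`."""
    for s in samples:
        yield s * gain


def collect_sink(samples: Iterable[float]) -> List[float]:
    """Sink element: pull samples through the chain and store them."""
    return list(samples)


# Link the elements: source -> filter -> sink.
out = collect_sink(volume(sine_source(440.0, 8000, 8), gain=0.5))
print(len(out), "samples processed")
```

In a real GStreamer application the elements would be plugins linked inside a pipeline object; the sketch only shows how data flows from one stage to the next.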

GStreamer WinBuilds
http://www.gstreamer-winbuild.ylatuya.es/doku.php?id=start
The goal of the GStreamer WinBuilds project is to provide precompiled GStreamer binary packages for Microsoft Windows, including a large collection of encoding and decoding plugins.

Phonon
http://en.wikipedia.org/wiki/Phonon_(KDE)

Phonon is the multimedia API for KDE 4. Phonon was created to allow KDE 4 to be independent of any single multimedia framework, such as GStreamer or xine, and to provide a stable API for KDE 4's lifetime. Its aims were to avoid the problems of frameworks becoming unmaintained and of API instability, and to provide a simple multimedia API. Phonon is not Unix-specific, and backends can be written for it in order to provide the same functionality on other platforms such as Microsoft Windows. Phonon is not designed to have every conceivable multimedia feature, but rather to be a simple way to perform the common functions of media players.
Supported backends on Unix-like systems are xine, GStreamer, VLC and MPlayer.

Xine
http://en.wikipedia.org/wiki/Xine
xine is a multimedia playback engine for Unix-like operating systems, released under the GPL. It is sometimes used as a library (xine-lib) instead of GStreamer. GStreamer has superior extensibility and supports a larger variety of media formats. xine-lib currently has better encrypted DVD playback support and can play some files that GStreamer cannot handle.

eXtended Linux Video (XLV)
http://xlv.sourceforge.net/
XLV is intended to be a middleware implementation of multimedia streams for Linux. It is extensively based on plugins and dynamic linking. Features include an OS-independent core which is completely multi-threaded, so that the complexity of streams is spread across multiple processes. The core is highly synchronised and designed specifically for combined audio and video. For the moment, XLV is in active development.

Audio Communications Control Protocols

Note that these are just control protocols. They don't transmit audio signals or media. They transmit "event messages" such as the pitch and intensity of musical notes to play, control signals for parameters such as volume, vibrato and panning, cues, and clock signals to set the tempo.

Open Sound Control (OSC)
http://en.wikipedia.org/wiki/OpenSound_Control
OpenSound Control (OSC) is a communication protocol which allows musical instruments (especially electronic musical instruments such as synthesizers), computers, and other multimedia devices to share music performance data in realtime over a network. OSC is meant to supersede the MIDI standard, which was defined in 1983 and which many consider inadequate for modern multimedia purposes. Because it is a networking protocol, OSC allows musical instruments, controllers, and multimedia devices to communicate via a standard home or studio network (TCP/IP, Ethernet) or via the internet. OSC operates at broadband network speeds, allowing new types of realtime interactions which were not possible because of MIDI "lag", although this is usually attributable to factors other than the inherent speed of MIDI propagation. OSC also gives musicians and developers more flexibility in the kinds of data they can send over the wire, enabling new applications which communicate with each other at a higher level.
OSC is also used at the heart of the DSSI plugin API (see the next section), where it lets a plugin's GUI interact with the core of the plugin by sending messages through the plugin host.
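
As a rough illustration of the kind of event message OSC carries, the sketch below encodes a single OSC message by hand, following the OSC 1.0 layout: an address pattern, a type tag string, then big-endian arguments, with each string null-terminated and padded to a multiple of 4 bytes. The address /synth/volume and the helper names are assumptions made for this example, not part of any particular application.

```python
import struct


def osc_pad(data: bytes) -> bytes:
    """Null-terminate and pad to a multiple of 4 bytes, as OSC strings require."""
    data += b"\x00"
    return data + b"\x00" * (-len(data) % 4)


def osc_message(address: str, *values: float) -> bytes:
    """Encode an OSC message whose arguments are all 32-bit floats."""
    type_tags = "," + "f" * len(values)
    payload = b"".join(struct.pack(">f", v) for v in values)  # big-endian floats
    return (osc_pad(address.encode("ascii"))
            + osc_pad(type_tags.encode("ascii"))
            + payload)


# Hypothetical control message: set a synthesizer's volume to 0.5.
packet = osc_message("/synth/volume", 0.5)
print(len(packet), "bytes:", packet.hex())
```

A packet like this would typically be sent as the payload of a UDP datagram to whatever host and port the receiving application listens on.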

MIDI
http://en.wikipedia.org/wiki/MIDI
MIDI (Musical Instrument Digital Interface) is an electronic protocol. Although it is now an old standard, it is notable for its widespread adoption throughout the industry.
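
For comparison, MIDI event messages are only a few bytes long. The sketch below builds three common channel messages (Note On, Note Off, and Control Change) as raw bytes. The function names are hypothetical; the status bytes 0x90, 0x80 and 0xB0 and controller number 7 for channel volume come from the MIDI 1.0 standard.

```python
def _check(value: int, upper: int, name: str) -> int:
    """Validate that a field fits its allowed range."""
    if not 0 <= value <= upper:
        raise ValueError(f"{name} must be in 0..{upper}, got {value}")
    return value


def note_on(channel: int, note: int, velocity: int) -> bytes:
    """Note On: status 0x90 plus channel, then pitch and intensity."""
    return bytes([0x90 | _check(channel, 15, "channel"),
                  _check(note, 127, "note"),
                  _check(velocity, 127, "velocity")])


def note_off(channel: int, note: int, velocity: int = 0) -> bytes:
    """Note Off: status 0x80 plus channel."""
    return bytes([0x80 | _check(channel, 15, "channel"),
                  _check(note, 127, "note"),
                  _check(velocity, 127, "velocity")])


def control_change(channel: int, controller: int, value: int) -> bytes:
    """Control Change: status 0xB0 plus channel, e.g. controller 7 = volume."""
    return bytes([0xB0 | _check(channel, 15, "channel"),
                  _check(controller, 127, "controller"),
                  _check(value, 127, "value")])


# Middle C (note 60) on the first channel, then full channel volume.
print(note_on(0, 60, 100).hex(), control_change(0, 7, 127).hex())
```

Note that none of these messages contains audio; the receiving instrument decides what sound to produce.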

 

Application Program Interface (API) for streaming plugins

LADSPA plugins
http://www.ladspa.org/
The Linux Audio Developer's Simple Plugin API (LADSPA). A number of LADSPA plugins have been converted to GStreamer plugins.

DSSI
http://en.wikipedia.org/wiki/DSSI
An extension of the LADSPA plugin API that allows it to handle virtual instruments (software synthesizers).
DSSI was designed specifically for instrument plugins that generate sound from note events. DSSI extends LADSPA by adding note event delivery, but it also adds predefined program selections and a method for plugins to provide their own user interfaces, both of which may also be used by effects plugins.

LV2 ("LADSPA Version 2")
http://en.wikipedia.org/wiki/LV2
http://lv2plug.in/
LV2 is a simple but extensible successor of LADSPA, intended to address the limitations of LADSPA which many applications have outgrown.

Generalized Music Plug-in Interface (GMPI)
http://en.wikipedia.org/wiki/Generalized_Music_Plug-in_Interface
An umbrella standard that LADSPA, DSSI and similar plugin APIs were meant to become part of. Progress on it has stalled.

 

Lower-Level Device Driver Interfaces

JACK Audio Connection Kit
http://en.wikipedia.org/wiki/JACK_Audio_Connection_Kit
A sound server daemon that provides low latency connections between so-called jackified applications, for both audio and MIDI data. JACK can use ALSA, PortAudio (see Audio APIs), CoreAudio (for Mac OS), FFADO (Free FireWire Audio Drivers) and (still experimental) OSS (Open Sound System) as its back-end. The server is licensed under the GNU GPL, while the library is licensed under the GNU LGPL.

Ecasound
http://en.wikipedia.org/wiki/Ecasound
A hard-disk recording and audio processing tool for Unix-like operating systems. Ecasound allows flexible interconnection of audio inputs, files, outputs, and effects algorithms, which can be controlled in real time by built-in oscillators, MIDI, or interprocess communication via a GUI front-end. Ecasound supports JACK and LADSPA effects plugins. It is licensed under the GPL.
Ecasound is a command-line tool: it does not include a native graphical interface. Major tasks (recording, mixdown) can be easily performed directly from the command line interface, or by scripts. Several GUI front-ends have been written for it.

 

Device Drivers or Sets of Device Drivers for connecting to Audio Cards

 

ALSA
http://www.alsa-project.org/main/index.php/Main_Page
http://en.wikipedia.org/wiki/Advanced_Linux_Sound_Architecture
The Advanced Linux Sound Architecture (ALSA) provides audio and MIDI functionality to the Linux operating system. It is intended to replace the original Open Sound System (OSS). Goals of the ALSA project include automatic configuration of sound-card hardware and graceful handling of multiple sound devices in a system. It already supports a large number of sound cards.

Open Sound System (OSS)
http://www.opensound.com/oss.html
OSS is a set of device drivers that provide a uniform API across all the major UNIX architectures. OSS brings MIDI and electronic music to the workstation environment. It also provides the synchronized audio capabilities required for desktop video and animation playback. A large number of sound cards are supported.

FFADO (Free FireWire Audio Drivers)
http://www.ffado.org/
An open-source project that is the successor of the FreeBoB project. As the name says, it provides free drivers for FireWire audio devices.

 

Codecs

The requirement here is a high-end lossless codec that supports high sampling rates, so that we can store the input from the sound card as a file and either process it later or send it to someone else for processing. Comparisons of codecs can be found at:
http://wiki.hydrogenaudio.org/index.php?title=Lossless_comparison
http://en.wikipedia.org/wiki/Comparison_of_audio_codecs
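
To get a feel for why compression matters at high sampling rates, the uncompressed PCM size of a recording can be estimated from the sample rate, bit depth, and channel count. The sketch below is only arithmetic; the function name and the example figures (192 kHz, 24-bit, stereo, one minute) are illustrative assumptions, not project requirements.

```python
def pcm_size_bytes(sample_rate_hz: int, bits_per_sample: int,
                   channels: int, seconds: float) -> int:
    """Return the size of uncompressed PCM audio in bytes (no file header).

    Assumes the bit depth is a whole number of bytes (8, 16, 24, 32).
    """
    bytes_per_frame = (bits_per_sample // 8) * channels
    return int(sample_rate_hz * bytes_per_frame * seconds)


# Assumed example: one minute of 192 kHz, 24-bit stereo input.
size = pcm_size_bytes(192_000, 24, 2, 60)
print(f"{size / 1_000_000:.2f} MB per minute before compression")
```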

FLAC
http://flac.sourceforge.net/
http://en.wikipedia.org/wiki/Free_Lossless_Audio_Codec
Free Lossless Audio Codec (FLAC) is a file format for lossless audio data compression. It is considered the fastest and most widely supported lossless audio codec. FLAC supports only fixed-point samples, not floating-point. It can handle any PCM bit resolution from 4 to 32 bits per sample, any sampling rate from 1 Hz to 655,350 Hz in 1 Hz increments, and any number of channels from 1 to 8. Channels can be grouped, in cases such as stereo and 5.1 channel surround, to take advantage of interchannel correlations and increase compression. GStreamer includes a plugin for FLAC.
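
The limits quoted above can be turned into a quick check of whether a given capture format can be stored as FLAC. This is a minimal sketch based only on the figures in the paragraph above; the function name fits_flac is hypothetical, and floating-point input would need converting to fixed-point before such a check applies.

```python
def fits_flac(bits_per_sample: int, sample_rate_hz: int, channels: int) -> bool:
    """Check PCM parameters against the FLAC limits quoted in the text."""
    return (4 <= bits_per_sample <= 32          # fixed-point bit resolution
            and 1 <= sample_rate_hz <= 655_350  # in 1 Hz increments
            and 1 <= channels <= 8)


# Assumed example formats a sound card might deliver.
for fmt in [(24, 192_000, 2), (16, 44_100, 9), (24, 768_000, 2)]:
    print(fmt, "OK" if fits_flac(*fmt) else "outside FLAC limits")
```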

WavPack
http://www.wavpack.com/
http://en.wikipedia.org/wiki/WavPack
WavPack is a lossless audio codec that also incorporates a "hybrid" mode. This mode still provides the features of lossless compression, but it creates two files: a relatively small, high-quality, lossy file (.wv) that can be used by itself, and a "correction" file (.wvc) that, when combined with the lossy file, provides full lossless restoration. This allows lossy and lossless encoding to be used together. GStreamer includes a plugin for WavPack.
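
The principle behind the hybrid mode, a lossy stream plus a correction stream that together restore the original exactly, can be shown with a toy example. This is not WavPack's actual algorithm; it simply drops the low bits of each integer sample to form the "lossy" part and keeps the dropped bits as the "correction" part. The function names and the 4-bit shift are assumptions for illustration.

```python
from typing import List, Tuple


def split_hybrid(samples: List[int], shift: int = 4) -> Tuple[List[int], List[int]]:
    """Split samples into a coarse 'lossy' list and a 'correction' list."""
    lossy = [(s >> shift) << shift for s in samples]      # low bits zeroed
    correction = [s - l for s, l in zip(samples, lossy)]  # what was dropped
    return lossy, correction


def restore(lossy: List[int], correction: List[int]) -> List[int]:
    """Recombine both parts to recover the original samples exactly."""
    return [l + c for l, c in zip(lossy, correction)]


original = [0, 1000, -1234, 32767, -32768]  # 16-bit PCM sample values
lossy, correction = split_hybrid(original)
print("lossless restore:", restore(lossy, correction) == original)
```

The lossy list is usable on its own as an approximation, while the correction list is meaningless without it, which mirrors the relationship between the .wv and .wvc files.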

TTA
http://true-audio.com/Codec.project
True Audio (abbreviated TTA) is a free, real-time lossless audio codec based on adaptive prognostic filters. Files created by the True Audio codec use the .tta filename extension. TTA performs lossless compression on multichannel 8, 16 and 24 bit data from uncompressed WAV input files. The TTA format supports both ID3v1 and ID3v2 information tags. GStreamer includes a plugin for TTA.