octave: Audio Data Processing

 
 33.5 Audio Data Processing
 ==========================
 
 Octave provides a few functions for dealing with audio data.  An audio
 ‘sample’ is a single output value from an A/D converter, i.e., a small
 integer number (usually 8 or 16 bits), and audio data is just a series
 of such samples.  It can be characterized by three parameters: the
 sampling rate (measured in samples per second or Hz, e.g., 8000 or
 44100), the number of bits per sample (e.g., 8 or 16), and the number of
 channels (1 for mono, 2 for stereo, etc.).
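
    As a rough illustration of how these three parameters fit together,
 the following sketch builds a short test tone and computes its raw size.
 The variable names and the 440 Hz tone are assumptions chosen for this
 example only.

```octave
# Illustrative parameters (assumed values, not Octave defaults)
fs        = 8000;    # sampling rate in samples per second
nbits     = 16;      # bits per sample
nchannels = 1;       # mono
dur       = 2;       # duration in seconds

nsamples = fs * dur;                          # samples per channel
nbytes   = nsamples * nchannels * nbits / 8;  # size of the raw data

# A 440 Hz tone quantized to signed NBITS-bit integers
t = (0:nsamples-1)' / fs;
y = round (0.5 * (2^(nbits-1) - 1) * sin (2*pi*440*t));

printf ("%d samples, %d bytes of raw audio\n", nsamples, nbytes);
```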
 
    There are many different formats for representing such data.
 Currently, only the two most popular, _linear encoding_ and _mu-law
 encoding_, are supported by Octave.  There is an excellent FAQ on audio
 formats by Guido van Rossum <guido@cwi.nl> which can be found at any FAQ
 ftp site, in particular in the directory
 ‘/pub/usenet/news.answers/audio-fmts’ of the archive site
 ‘rtfm.mit.edu’.
 
    Octave simply treats audio data as vectors of samples (non-mono data
 are not supported yet).  It is assumed that audio files using linear
 encoding have one of the extensions ‘lin’ or ‘raw’, and that files
 holding data in mu-law encoding end in ‘au’, ‘mu’, or ‘snd’.
 
  -- : Y = lin2mu (X, N)
      Convert audio data from linear to mu-law.
 
      Mu-law values are 8-bit unsigned integers.  Linear values are N-bit
      signed integers or, if N is 0, floating point values in the range
      -1 ≤ X ≤ 1.
 
      If N is not specified it defaults to 0, 8, or 16 depending on the
      range of values in X.
 
      See also: mu2lin.
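
      A minimal sketch of encoding a floating point signal, assuming a
      synthetic 440 Hz test tone (the tone and variable names are
      illustrative only).  N is passed explicitly so that the result
      does not depend on the automatic range detection.

```octave
fs = 8000;                     # sampling rate in Hz (illustrative)
t  = (0:fs-1)' / fs;           # one second of sample times
x  = 0.8 * sin (2*pi*440*t);   # linear samples within [-1, 1]

mu = lin2mu (x, 0);            # N = 0: X holds floating point values

printf ("mu-law codes span %d to %d\n", min (mu), max (mu));
```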
 
  -- : Y = mu2lin (X, N)
      Convert audio data from mu-law to linear.
 
      Mu-law values are 8-bit unsigned integers.  Linear values are N-bit
      signed integers or, if N is 0, floating point values in the range
      -1 ≤ Y ≤ 1.
 
      If N is not specified it defaults to 0.
 
      See also: lin2mu.
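
      The following sketch runs a round trip through both conversions
      to show that mu-law coding is lossy but close.  The test signal is
      an assumption made for the example; the size of the error depends
      on the signal.

```octave
fs = 8000;                     # sampling rate in Hz (illustrative)
t  = (0:fs-1)' / fs;
x  = 0.5 * sin (2*pi*440*t);   # floating point samples within [-1, 1]

mu = lin2mu (x, 0);            # encode to mu-law codes
xr = mu2lin (mu, 0);           # decode back to floating point

err = max (abs (x - xr));      # largest quantization error
printf ("largest round-trip error: %g\n", err);
```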
 
  -- : record (SEC)
  -- : record (SEC, FS)
      Record SEC seconds of audio from the system’s default audio input.

      If the optional argument FS is given, it specifies the sampling
      rate for recording; otherwise a default of 8000 samples per second
      is used.
 
      For more control over audio recording, use the ‘audiorecorder’
      class.
 
      See also: sound, soundsc.
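
      A hedged usage sketch: it assumes that ‘record’ returns the
      captured samples and raises an error when no input device is
      available.  The ‘try’ block keeps the script usable on machines
      without audio hardware.

```octave
fs  = 8000;   # sampling rate in Hz
sec = 1;      # recording length in seconds

try
  x = record (sec, fs);        # blocks while recording
catch err
  x = [];                      # no data captured
  printf ("record: input not available (%s)\n", err.message);
end_try_catch

if (! isempty (x))
  printf ("captured %d samples\n", numel (x));
endif
```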
 
  -- : sound (Y)
  -- : sound (Y, FS)
  -- : sound (Y, FS, NBITS)
      Play audio data Y at sample rate FS to the default audio device.
 
      The audio signal Y can be a vector or a two-column array,
      representing mono or stereo audio, respectively.
 
      If FS is not given, a default sample rate of 8000 samples per
      second is used.
 
      The optional argument NBITS specifies the bit depth used for
      playback on the audio device and defaults to 8 bits.
 
      For more control over audio playback, use the ‘audioplayer’ class.
 
      See also: soundsc, record.
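
      A sketch of stereo playback using a two-column array.  The tone
      frequencies and the 16-bit depth are illustrative choices, and the
      ‘try’ block is there on the assumption that playback raises an
      error on systems without an output device.

```octave
fs = 8000;                          # sample rate in Hz
t  = (0:fs/2-1)' / fs;              # half a second of sample times

left  = 0.5 * sin (2*pi*440*t);     # left channel
right = 0.5 * sin (2*pi*660*t);     # right channel
y     = [left, right];              # two columns: stereo

try
  sound (y, fs, 16);                # play at 16 bits per sample
catch err
  printf ("sound: playback not available (%s)\n", err.message);
end_try_catch
```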
 
  -- : soundsc (Y)
  -- : soundsc (Y, FS)
  -- : soundsc (Y, FS, NBITS)
  -- : soundsc (..., [YMIN, YMAX])
      Scale the audio data Y and play it at sample rate FS to the default
      audio device.
 
      The audio signal Y can be a vector or a two-column array,
      representing mono or stereo audio, respectively.
 
      If FS is not given, a default sample rate of 8000 samples per
      second is used.
 
      The optional argument NBITS specifies the bit depth used for
      playback on the audio device and defaults to 8 bits.
 
      By default, Y is automatically normalized to the range [-1, 1].  If
      the range [YMIN, YMAX] is given, then elements of Y that fall
      within the range YMIN ≤ Y ≤ YMAX are scaled to the range [-1, 1]
      instead.
 
      For more control over audio playback, use the ‘audioplayer’ class.
 
      See also: sound, record.
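
      The scaling described above can be pictured as a linear map from
      [YMIN, YMAX] to [-1, 1].  The sketch below performs that map by
      hand; it is an assumption about what the scaling amounts to, not a
      copy of the ‘soundsc’ implementation, and the sample values are
      made up for illustration.

```octave
y = [-3; 0; 1; 5];               # arbitrary unscaled samples

ymin = min (y);                  # default range: extremes of the data
ymax = max (y);

# Linear map sending YMIN to -1 and YMAX to 1
ys = 2 * (y - ymin) / (ymax - ymin) - 1;

disp (ys');
```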