pipemath-1.2 Keith M Briggs 2005 Sep 13

Introduction

  This package is intended for processing very large data sets
  via shell pipelines.   The programs read standard input and do
  not store the data.
  They are responses to the challenge: can one perform some of
  the standard computations of statistical data analysis 
  (autocorrelation of a scalar time-series, covariance matrix
  of a set of vectors, and least-squares polynomials) if one
  receives the data points one at a time, and must process them 
  and throw them away before receiving the next data point?
  Of course, all this must be done while preserving numerical
  stability.   The three C programs I provide seem to achieve these
  aims for the three specific problems mentioned.
  The ideas could be relevant more generally to stream computing
  and distributed data analysis; see e.g.
 
     Suresh Venkatasubramanian <suresh@research.att.com>
     The Graphics Card as a Streaming Computer
     arxiv.org/abs/cs.GR/0310002

  Version 1.2 is 64-bit clean.   The covariance program now takes no
  arguments; the vector dimension is read from the first input line.

Quick start

  tar zvxf pipemath-1.2.tgz; cd pipemath-1.2; make

Programs

  autocorrelation

     Computes the autocorrelation function of a scalar time series.
     Usage: cat datafile | autocorrelation [maxlag=20 [stride=1 [dt=1]]]
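
     As an illustration of how lagged products can be accumulated
     without storing the series, here is a minimal sketch that keeps
     only the last maxlag+1 samples in a ring buffer.   It is not taken
     from the pipemath sources: the names acf_state, acf_init,
     acf_update, acf_mean_product and acf_free are hypothetical, and
     the sketch omits mean removal, normalisation, and the stride and
     dt arguments.

```c
#include <stdlib.h>

/* Illustrative state only; not the layout used by pipemath. */
typedef struct {
    size_t maxlag;  /* largest lag tracked */
    size_t seen;    /* number of samples consumed so far */
    double *ring;   /* the most recent maxlag+1 samples */
    double *sum;    /* sum[k] accumulates x[t] * x[t-k] */
} acf_state;

/* Returns 0 on success, -1 if allocation fails. */
int acf_init(acf_state *s, size_t maxlag) {
    s->maxlag = maxlag;
    s->seen = 0;
    s->ring = calloc(maxlag + 1, sizeof *s->ring);
    s->sum = calloc(maxlag + 1, sizeof *s->sum);
    if (!s->ring || !s->sum) {
        free(s->ring);
        free(s->sum);
        s->ring = s->sum = NULL;
        return -1;
    }
    return 0;
}

/* Consume one sample; memory use stays O(maxlag). */
void acf_update(acf_state *s, double x) {
    size_t size = s->maxlag + 1;
    size_t t = s->seen;
    size_t kmax = t < s->maxlag ? t : s->maxlag;

    s->ring[t % size] = x;  /* overwrite the oldest sample */
    for (size_t k = 0; k <= kmax; k++)
        s->sum[k] += x * s->ring[(t - k) % size];
    s->seen = t + 1;
}

/* Average of x[t] * x[t-k] over the (seen - k) available pairs.
   Returns 0.0 when no pair exists at lag k. */
double acf_mean_product(const acf_state *s, size_t k) {
    if (k > s->maxlag || k >= s->seen)
        return 0.0;
    return s->sum[k] / (double)(s->seen - k);
}

void acf_free(acf_state *s) {
    free(s->ring);
    free(s->sum);
    s->ring = s->sum = NULL;
}
```

     A caller would invoke acf_update once per input value and read the
     results after end of input.   Raw product sums like these lose
     accuracy when the mean is large compared with the fluctuations,
     so a production version would need a centred update.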

  covariance

     Computes the covariance matrix of a set of n-vectors.
     Usage: cat datafile | covariance
     Each line of datafile has an n-vector.   n is determined by the number
     of columns in the first line.   All lines must have the same number of 
     columns.
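
     One well-known way to update a covariance matrix one vector at a
     time, without the cancellation that affects raw sums of products,
     is a Welford-style running update.   The sketch below shows that
     technique under stated assumptions; it is not necessarily what
     the covariance program does.   The names cov_state, cov_init,
     cov_update, cov_get and cov_free are hypothetical, and the
     division by count-1 (sample covariance) is an assumption.

```c
#include <stdlib.h>

/* Illustrative state only; not the layout used by pipemath. */
typedef struct {
    size_t n;       /* vector dimension */
    size_t count;   /* number of vectors consumed so far */
    double *mean;   /* running mean, length n */
    double *comom;  /* running co-moment sums, n*n, row-major */
    double *delta;  /* scratch: x minus the previous mean */
} cov_state;

/* Returns 0 on success, -1 if allocation fails. */
int cov_init(cov_state *s, size_t n) {
    s->n = n;
    s->count = 0;
    s->mean = calloc(n, sizeof *s->mean);
    s->comom = calloc(n * n, sizeof *s->comom);
    s->delta = calloc(n, sizeof *s->delta);
    if (!s->mean || !s->comom || !s->delta) {
        free(s->mean);
        free(s->comom);
        free(s->delta);
        s->mean = s->comom = s->delta = NULL;
        return -1;
    }
    return 0;
}

/* Consume one n-vector x; it can be discarded afterwards. */
void cov_update(cov_state *s, const double *x) {
    size_t n = s->n;

    s->count++;
    for (size_t i = 0; i < n; i++) {
        s->delta[i] = x[i] - s->mean[i];           /* uses old mean */
        s->mean[i] += s->delta[i] / (double)s->count;
    }
    /* Old-mean deviation times new-mean residual. */
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            s->comom[i * n + j] += s->delta[i] * (x[j] - s->mean[j]);
}

/* Sample covariance of components i and j (assumes count >= 2). */
double cov_get(const cov_state *s, size_t i, size_t j) {
    return s->comom[i * s->n + j] / (double)(s->count - 1);
}

void cov_free(cov_state *s) {
    free(s->mean);
    free(s->comom);
    free(s->delta);
    s->mean = s->comom = s->delta = NULL;
}
```

     Storage is n*n + 2n doubles regardless of how many lines are
     read, which is the property the package aims for.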

  lsqpoly
     
     Fits a least-squares polynomial.
     Usage: cat datafile | lsqpoly [degree=1]
     Each line of datafile has an x, y pair and an optional weight.
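
     For the default degree of 1, a weighted straight-line fit can be
     accumulated in one pass from running weighted means and centred
     sums.   The sketch below illustrates that idea only; how lsqpoly
     handles general degrees is not described here.   The names
     line_fit, line_fit_init, line_fit_update and line_fit_result are
     hypothetical, and skipping non-positive weights is an assumption.

```c
#include <stddef.h>

/* Illustrative state only; not the layout used by pipemath. */
typedef struct {
    double W;    /* total weight so far */
    double mx;   /* weighted running mean of x */
    double my;   /* weighted running mean of y */
    double Sxx;  /* weighted sum of (x - mx)^2 */
    double Sxy;  /* weighted sum of (x - mx)(y - my) */
} line_fit;

void line_fit_init(line_fit *s) {
    s->W = s->mx = s->my = s->Sxx = s->Sxy = 0.0;
}

/* Consume one point; pass w = 1.0 when the line has no weight. */
void line_fit_update(line_fit *s, double x, double y, double w) {
    if (w <= 0.0)
        return;  /* assumption: ignore non-positive weights */

    s->W += w;
    double dx = x - s->mx;  /* deviation from the old mean */
    double dy = y - s->my;
    s->mx += w * dx / s->W;
    s->my += w * dy / s->W;
    /* Old-mean deviation times new-mean residual. */
    s->Sxx += w * dx * (x - s->mx);
    s->Sxy += w * dx * (y - s->my);
}

/* Writes the fitted line y = intercept + slope * x.
   Returns -1 if the x values seen so far do not vary. */
int line_fit_result(const line_fit *s, double *slope, double *intercept) {
    if (s->Sxx == 0.0)
        return -1;
    *slope = s->Sxy / s->Sxx;
    *intercept = s->my - *slope * s->mx;
    return 0;
}
```

     Only five doubles are kept, whatever the length of the input.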

Installation

  make
  make test
  sudo make install

