Mercurial > hg > ventivac / man/2/rabin

revision 144: 207beea1188c
child 152: 350e04c9d1e8
     1.1--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
     1.2+++ b/man/2/rabin	Thu Aug 30 14:13:56 2007 +0200
     1.3@@ -0,0 +1,52 @@
     1.4+.TH RABIN 2
     1.5+.SH NAME
     1.6+rabin \- Rabin fingerprinting
     1.7+.SH SYNOPSIS
     1.8+.EX
     1.9+include "rabin.m";
    1.10+rabin := load Rabin Rabin->PATH;
    1.11+Rcfg, Rfile: import rabin;
    1.12+
    1.13+init:		fn(bufio: Bufio);
    1.14+open:		fn(rcfg: ref Rcfg, b: ref Iobuf, min, max: int): (ref Rfile, string);
    1.15+
    1.16+Rcfg: adt {
    1.17+	mk:     fn(prime, width, mod: int): (ref Rcfg, string);
    1.18+};
    1.19+
    1.20+Rfile: adt {
    1.21+	read:   fn(r: self ref Rfile): (array of byte, big, string);
    1.22+};
    1.23+.EE
    1.24+.SH DESCRIPTION
    1.25+.B Rabin
    1.26+implements a data fingerprinting algorithm.  A rolling checksum is calculated while reading data.  Certain checksum values are taken to be data boundaries and used for splitting the data into chunks.
    1.27+.PP
    1.28+.B Rcfg
    1.29+represents the parameters to the algorithm;
    1.30+.B Rcfg.mk
    1.31+creates a new instance.
    1.32+.I Prime
    1.33+should be a prime number.
    1.34+.I Width
    1.35+is the width of the rolling checksum window in bytes.  A wider window results in more diverse boundary patterns.  A window of 30 bytes should be reasonable for most uses.
    1.36+.I Mod
    1.37+effectively sets the desired mean chunk size.  The rolling checksum is calculated modulo
    1.38+.IR mod .
    1.39+All three parameters influence where chunk boundaries will be found.
    1.40+.PP
    1.41+.B Rfile
    1.42+represents a file to read chunks from.
    1.43+.B Open
    1.44+returns an initialised Rfile or an error string.
    1.45+.I Min
    1.46+and
    1.47+.I max
    1.48+are the minimum and maximum sizes, in bytes, of the chunks that will be returned.  Only the last chunk in a file can be smaller than the minimum chunk size.  Note that these limits may cause the actual mean chunk size to differ from the desired one.
    1.49+Data is read from
    1.50+.B Iobuf
    1.51+.IR b .
    1.52+.B Rfile.read
    1.53+returns the next chunk of data and the file offset at which it was found, or an error message.  Once end of file has been reached, the returned chunk is zero bytes long.
    1.54+.SH SOURCE
    1.55+.B /appl/lib/rabin.b
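
The interface in the SYNOPSIS could be used roughly as follows. This is a minimal sketch and not part of the changeset: the module name Chunks, the helper fail, and all parameter values (prime 269, mod 4096, min 1024, max 65536) are assumptions chosen for illustration. Only the 30-byte window comes from the DESCRIPTION. It splits standard input into chunks and prints each chunk's offset and length.

```limbo
implement Chunks;

include "sys.m";
	sys: Sys;
include "draw.m";
include "bufio.m";
	bufio: Bufio;
	Iobuf: import bufio;
include "rabin.m";
	rabin: Rabin;
	Rcfg, Rfile: import rabin;

# Chunks is a hypothetical command name, used only for this example.
Chunks: module {
	init:	fn(nil: ref Draw->Context, nil: list of string);
};

init(nil: ref Draw->Context, nil: list of string)
{
	sys = load Sys Sys->PATH;
	bufio = load Bufio Bufio->PATH;
	if(bufio == nil)
		fail("cannot load bufio");
	rabin = load Rabin Rabin->PATH;
	if(rabin == nil)
		fail("cannot load rabin");
	rabin->init(bufio);

	# Assumed values: 269 is an arbitrary prime, 4096 an arbitrary modulus.
	# The 30-byte window is the width suggested in the DESCRIPTION.
	(rcfg, cerr) := Rcfg.mk(269, 30, 4096);
	if(cerr != nil)
		fail(cerr);

	b := bufio->fopen(sys->fildes(0), Bufio->OREAD);
	if(b == nil)
		fail("cannot buffer standard input");

	# Assumed limits: chunks of 1024 to 65536 bytes.
	(rf, oerr) := rabin->open(rcfg, b, 1024, 65536);
	if(oerr != nil)
		fail(oerr);

	for(;;){
		(d, off, rerr) := rf.read();
		if(rerr != nil)
			fail(rerr);
		if(len d == 0)
			break;	# a zero-length chunk signals end of file
		sys->print("offset %bd, %d bytes\n", off, len d);
	}
}

# fail is a local helper, not part of rabin.m.
fail(s: string)
{
	sys->fprint(sys->fildes(2), "chunks: %s\n", s);
	raise "fail:"+s;
}
```

Every call that returns an error string is checked before its other results are used, since the man page does not say what those results hold on failure.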