×

Non-sequential recursive pair substitution: some rigorous results. (English) Zbl 1456.94036

Summary: We present rigorous results on some open questions on NSRPS, the non-sequential recursive pairs substitution method. In particular, starting from the action of NSRPS on finite strings we define a corresponding natural action on measures and we prove that the iterated measure becomes asymptotically Markov. This certifies the effectiveness of NSRPS as a tool for data compression and entropy estimation.

MSC:

94A29 Source coding
94A17 Measures of information, entropy
94A40 Channel models (including quantum) in information and communication theory
PDFBibTeX XMLCite
Full Text: DOI arXiv

References:

[1] Abarbanel H D I 1996 Analysis of Observed Chaotic Data (New York: Springer) · Zbl 0890.93006 · doi:10.1007/978-1-4612-0763-4
[2] Adamek J 1991 Foundations of Coding (New York: Wiley) · Zbl 0747.94001 · doi:10.1002/9781118033265
[3] Argenti F, Benci V, Cerrai P, Cordelli A, Galatolo S and Menconi G 2002 Information and dynamical systems: a concrete measurement on sporadic dynamics. Classical and quantum complexity and non-extensive thermodynamics (Denton, TX, 2000) Chaos Solitons Fractals13 461 · Zbl 1001.37013 · doi:10.1016/S0960-0779(01)00028-5
[4] Azad R K, Bernaola-Galván P, Ramaswamy R and Rao J S 2002 Segmentation of genomic DNA through entropic divergence: Power laws and scaling Phys. Rev. E 65 051909 · doi:10.1103/PhysRevE.65.051909
[5] Baronchelli A, Caglioti E and Loreto V 2005 Artificial sequences and complexity measures J. Stat. Mech. P04002 · Zbl 1456.94030
[6] Ebeling W and Jiménez-Montaño M A 1980 On grammars, complexity, and information measures of biological macromolecules Math. Biosci.52 53 · Zbl 0451.92004 · doi:10.1016/0025-5564(80)90004-8
[7] Jiménez-Montaño M A 1984 On the syntactic structure of protein sequences and the concept of grammar complexity Bull. Math. Biosci.42 641 · Zbl 0552.92008 · doi:10.1007/BF02459508
[8] Jiménez-Montaño M A, Ebeling W and Pöschel T 2002 SYNTAX: A computer program to compress a sequence and to estimate its information content Preprint cond-mat/0204134
[9] Rapp P E, Zimmermann I D, Vining E P, Cohen N, Alabano A M and Jiménez-Montaño M A The algorithmic complexity of neural spike trains increases during focal seizures 1994 J. Neurosci.14 4731 · doi:10.1523/JNEUROSCI.14-08-04731.1994
[10] Rapp P E, Zimmermann I D, Vining E P, Cohen N, Alabano A M and Jiménez-Montaño M A 1994 Phys. Lett. A 192 27 · doi:10.1016/0375-9601(94)91010-3
[11] Fukuda K, Stanley H E and Nunes Amaral L A 2004 Heuristic segmentation of a nonstationary time series Phys. Rev. E 69 021108 · doi:10.1103/PhysRevE.69.021108
[12] Kantz H and Schreiber T 1997 Nonlinear Time Series Analysis(Cambridge Nonlinear Science Series) (Cambridge: Cambridge University Press) · Zbl 0873.62085
[13] Grassberger P 2002 Data compression and entropy estimates by non-sequential recursive pair substitution Preprint physics/0207023
[14] Grosse I, Bernaola-Galván P, Carpena P, Román-Roldán R, Oliver J and Stanley H E 2002 Analysis of symbolic sequences using the Jensen-Shannon divergence Phys. Rev. E 65 041905 · Zbl 1245.94057 · doi:10.1103/PhysRevE.65.041905
[15] Mantegna R N, Buldyrev S V, Goldberger A L, Havlin S, Peng C K, Simons M and Stanley H E 1994 Linguistic features of noncoding DNA sequences Phys. Rev. Lett.73 3169 · doi:10.1103/PhysRevLett.73.3169
[16] Mertens S and Bauke H 2004 Entropy of pseudo-random-number generators Phys. Rev. E 69 055702 · doi:10.1103/PhysRevE.69.055702
[17] Puglisi A, Benedetto D, Caglioti E, Loreto V and Vulpiani A 2003 Data compression and learning in time sequences analysis Physica D 180 92 · Zbl 1094.68567 · doi:10.1016/S0167-2789(03)00047-2
[18] Schuermann T and Grassberger P 1996 Entropy estimation of symbol sequences Chaos6 414 · Zbl 1055.94508 · doi:10.1063/1.166191
[19] Shields P C 1996 The Ergodic Theory of Discrete Sample Paths(Graduate Studies in Mathematics vol 13) (Providence, RI: American Mathematical Society) · Zbl 0879.28031 · doi:10.1090/gsm/013
[20] Welch T A 1984 A technique for high performance data-compression Computer17 8 · doi:10.1109/MC.1984.1659158
[21] Ziv J and Lempel A 1977 A universal algorithm for sequential data compression IEEE Trans. Inf. Theory23 337 · Zbl 0379.94010 · doi:10.1109/TIT.1977.1055714
[22] Ziv J and Lempel A 1978 Compression of individual sequences via variable-rate coding IEEE Trans. Inf. Theory24 530 · Zbl 0392.94004 · doi:10.1109/TIT.1978.1055934
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.