Multi-scale N-Gram Expectation Model

by Amaury Hazan

Pompeu Fabra University


Multi-scale N-Gram Expectation Code

What is the Module?

This module performs symbolic sequence learning based on Prediction by Partial Match [Moffat1990]. As it observes a stream of symbols, the model computes the posterior distribution of the next symbol, using contexts of several different sizes.
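
To make the idea concrete, here is a minimal sketch of multi-scale context counting in plain Python. It is not the code of the ppm module: the class name ContextCounter and its methods are made up for illustration, and it uses a simple back-off to the longest context already seen instead of the escape-probability blending of Prediction by Partial Match.

```python
from collections import defaultdict


class ContextCounter(object):
    """Toy multi-order n-gram counter (illustration only, not the ppm module)."""

    def __init__(self, nsymbols, max_order):
        self.nsymbols = nsymbols
        self.max_order = max_order
        # counts[k][context][symbol]: how often `symbol` followed a context of length k
        self.counts = [defaultdict(lambda: [0] * nsymbols)
                       for _ in range(max_order + 1)]
        self.history = []

    def update(self, symbol):
        # record the new symbol under every context size available so far
        for k in range(min(self.max_order, len(self.history)) + 1):
            context = tuple(self.history[len(self.history) - k:])
            self.counts[k][context][symbol] += 1
        self.history.append(symbol)

    def posterior(self):
        # back off from the longest context to shorter ones until one has been seen
        for k in range(min(self.max_order, len(self.history)), -1, -1):
            context = tuple(self.history[len(self.history) - k:])
            row = self.counts[k].get(context)
            if row is not None and sum(row) > 0:
                total = float(sum(row))
                return [c / total for c in row]
        # nothing observed yet: fall back to a uniform distribution
        return [1.0 / self.nsymbols] * self.nsymbols


model = ContextCounter(nsymbols=4, max_order=2)
for s in [0, 1, 2, 3, 0, 1, 2, 3]:
    model.update(s)
print(model.posterior())
```

After observing [0,1,2,3,0,1,2,3] with max_order=2, the context (2, 3) has only ever been followed by 0, so this sketch puts all the probability mass on symbol 0.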

What are the inputs and outputs?

Inputs: A sequence of symbols encoded as integers, e.g. [0,1,2,3,0,1,2,3]

Outputs: The posterior distribution of the next symbol and/or the symbol which maximizes this posterior distribution
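
If the raw events are not already integers, they have to be mapped to integer codes before being fed to the model. The helpers below are a hypothetical sketch of that step and of reading the most likely symbol off a posterior; the names encode and most_likely and the drum labels are assumptions for illustration, not part of the module.

```python
def encode(labels):
    """Map arbitrary hashable labels to integer codes, in order of first appearance."""
    codebook = {}
    codes = []
    for label in labels:
        if label not in codebook:
            codebook[label] = len(codebook)
        codes.append(codebook[label])
    return codes, codebook


def most_likely(posterior):
    """Return the index of the symbol that maximizes the posterior."""
    return max(range(len(posterior)), key=lambda i: posterior[i])


# hypothetical event labels, e.g. from a drum transcription
codes, codebook = encode(["kick", "hat", "snare", "hat", "kick", "hat", "snare", "hat"])
best = most_likely([0.1, 0.6, 0.2, 0.1])
```

The number of distinct codes, len(codebook), is the value one would then use for the number of symbols.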

What are the parameters?

The parameters are set when creating the object in Python. They are the following:

  • nsymbols (integer): number of symbols to use
  • maxstepsize (integer): maximum context size to be considered
  • sparse (boolean, default False): whether to use a sparse matrix implementation (requires scipy)
  • updateexclusion (boolean, default True): whether to use an update exclusion scheme (see [Moffat1990]); this usually works better
  • dosample (boolean, default False): whether to predict the next symbol by sampling from the posterior distribution
  • berandom (boolean, default False): whether to produce a completely random prediction (for testing purposes)
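
The difference between the default prediction, dosample and berandom can be illustrated with a small stand-alone function. This is only a sketch of the three behaviours as described above: the function predict and its rng argument are assumptions, and how the module implements these options internally is not shown on this page.

```python
import random


def predict(posterior, dosample=False, berandom=False, rng=random):
    """Pick the next symbol from a posterior in one of three ways (illustration only)."""
    n = len(posterior)
    if berandom:
        # ignore the posterior entirely: uniform random symbol
        return rng.randrange(n)
    if dosample:
        # draw a symbol with probability proportional to its posterior mass
        r = rng.random() * sum(posterior)
        acc = 0.0
        for i, p in enumerate(posterior):
            acc += p
            if r < acc:
                return i
        return n - 1  # guard against floating-point round-off
    # default: the symbol that maximizes the posterior
    return max(range(n), key=lambda i: posterior[i])


choice_max = predict([0.1, 0.7, 0.2])
choice_sampled = predict([0.0, 1.0, 0.0], dosample=True)
choice_random = predict([0.1, 0.7, 0.2], berandom=True)
```

Sampling gives varied predictions that follow the learned statistics, while the fully random mode is useful as a baseline when evaluating a model.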

What are the system requirements?

  • Python 2.4 or higher:

Follow the instructions at:

  • scipy 0.6.0 or higher:

Follow the instructions at:

  • numpy 1.0.4 or higher:

Follow the instructions at:

How to Cite

 Hazan, A., Marxer, R., Brossier, P., Purwins, H., Herrera, P., Serra, X. (2009).  
 What/when causal expectation modelling applied to audio signals.  Connection Science. 


 @article {hazan2009,
 	title = {What/when causal expectation modelling applied to audio signals},
 	journal = {Connection Science},
 	year = {2009},
 	URL = {},
 	author = {Hazan, A. and Marxer, R. and Brossier, P. and Purwins, H. and Herrera, P. and Serra, X.}
 }

How to Use, example file

 # import the ppm module
 import ppm
 # create the expectation object
 # create the sequence to be attended
 seq = [0, 1, 2, 3, 0, 1, 2, 3]
 # iterating over the sequence
 for symbol in seq:
   # updating expectation object with new symbol
   # computing posterior (optional)
   print "posterior probability:", myExp.get_cpt_now()
   # predicting next symbol which maximizes cpt
   print "expected next symbol", myExp.expect()
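
A loop like this can also be used to score a model causally, by comparing each prediction with the symbol that actually arrives next. The sketch below assumes only that the model offers some way to update it and a way to ask for the expected symbol, passed in as two callables. The function causal_accuracy and the RepeatLast stand-in predictor are hypothetical and independent of the ppm module.

```python
def causal_accuracy(seq, expect, update):
    """Fraction of symbols that were correctly predicted before being observed."""
    hits = 0
    trials = 0
    for i, symbol in enumerate(seq):
        if i > 0:  # no prediction is possible before the first observation
            trials += 1
            if expect() == symbol:
                hits += 1
        update(symbol)
    return hits / float(trials) if trials else 0.0


class RepeatLast(object):
    """Stand-in predictor (hypothetical): always expects the last symbol again."""

    def __init__(self):
        self.last = None

    def update(self, symbol):
        self.last = symbol

    def expect(self):
        return self.last


stub = RepeatLast()
score = causal_accuracy([0, 0, 1, 1], stub.expect, stub.update)
```

With the stand-in predictor, two of the three predictions on [0, 0, 1, 1] are correct, so the score is 2/3.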