Arithmetic coding is used in most media compression methods. Context modeling is usually done through frequency counting and look-up tables (LUTs). For long-memory signals, probability modeling with large context sizes is often infeasible. Recently, neural networks have been used to model probabilities of contexts in order to drive arithmetic coders. These networks are trained offline. We introduce an <italic xmlns:mml="http://www.w...