Title: Efficient Language Model Inference
Who? Kenneth Heafield
When? Tuesday, December 21 @ Noon
Where? GHC 4405
In GHC 4405 at noon on Tuesday Dec 21, I will give a speaking
requirement talk on Efficient Language Model Inference. As this is also
an MT Lunch, free lunch will be provided.
If you're using SRILM, come to my talk to reduce your memory consumption
by 86% while reducing CPU time by 16%. Users of IRSTLM should come for
the same reason; the code uses 42% less memory and 19% less CPU time.
Language models are an important feature in speech, translation,
generation, IR, and other technologies. More training data and less
pruning generally lead to higher quality, but RAM is a limiting factor.
Further, systems consult language models so frequently that lookups
dominate CPU time.
This talk presents language modeling code with several optimizations to
improve time and space performance. Storing backoff information in
feature state reduces redundant lookups. Constructing known
distributions and biasing binary search accordingly speeds lookups and reduces page
faults. Memory mapping reduces load time. Bit-level packing increases
locality. Stronger filtering removes n-grams that cannot be assembled
during decoding due to phrase and sentence constraints. The code is
currently integrated into Moses and being integrated into cdec and
Joshua. I will cover how my code works and how to use it in other