Language Models
-
class
pynlpl.lm.lm.ARPALanguageModel(filename, encoding='utf-8', encoder=None, base_e=True, dounknown=True, debug=False, mode='simple')
Full back-off language model, loaded from a file in ARPA format.
This class does not build the model; it loads a pre-computed one. To build the model itself, use a tool such as ngram-count from SRILM.
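To make the ARPA file layout concrete, here is a minimal parsing sketch. It is not PyNLPl's loader: `parse_arpa` is a hypothetical helper, and it assumes that `base_e=True` means converting the file's base-10 log values to natural logarithms. ARPA entries are tab-separated as log probability, n-gram, and an optional back-off weight.

```python
import math

def parse_arpa(lines, base_e=True):
    """Parse ARPA-format lines into {ngram_tuple: (logprob, backoff)}.

    Hypothetical helper for illustration; not part of PyNLPl.
    """
    table = {}
    order = None  # current n-gram order, set by a "\N-grams:" header
    for raw in lines:
        line = raw.strip()
        if not line or line == "\\data\\" or line.startswith("ngram "):
            continue  # skip blanks and the count header
        if line == "\\end\\":
            break
        if line.startswith("\\") and line.endswith("-grams:"):
            order = int(line[1:line.index("-")])
            continue
        if order is None:
            continue  # ignore anything before the first section header
        fields = line.split("\t")
        logprob = float(fields[0])
        ngram = tuple(fields[1].split(" "))
        # The back-off weight is optional; a missing weight means log(1) = 0.
        backoff = float(fields[2]) if len(fields) > 2 else 0.0
        if base_e:
            # Assumption: base_e converts log10 values to natural logs.
            logprob *= math.log(10)
            backoff *= math.log(10)
        table[ngram] = (logprob, backoff)
    return table

sample = [
    "\\data\\", "ngram 1=2", "ngram 2=1", "",
    "\\1-grams:", "-1.0\tthe\t-0.5", "-2.0\tcat", "",
    "\\2-grams:", "-0.5\tthe cat", "",
    "\\end\\",
]
table = parse_arpa(sample, base_e=False)
```

With `base_e=False` the values stay in base 10, exactly as written in the file.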
-
class
NgramsProbs(data, mode='simple', delim=' ')
Stores n-grams with their probabilities and back-off weights.
This class abstracts the physical storage layout, which allows memory/speed trade-offs.
-
backoff(ngram)
Return the back-off value of the given n-gram tuple.
-
prob(ngram)
Return the probability of the given n-gram tuple.
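The sketch below shows how a probability lookup and a back-off lookup are typically combined in a back-off model. It is not PyNLPl's code: `backoff_logprob` and the `table` layout (n-gram tuple mapped to a log probability and a back-off weight) are assumptions made for illustration.

```python
def backoff_logprob(table, word, history=()):
    """Log probability of `word` given `history`, backing off when needed.

    `table` maps n-gram tuples to (logprob, backoff) pairs.
    Hypothetical helper; not part of PyNLPl.
    """
    history = tuple(history)
    ngram = history + (word,)
    if ngram in table:
        return table[ngram][0]
    if not history:
        # Unseen unigram: real models usually map this to an <unk> entry.
        raise KeyError(word)
    # Add the back-off weight of the history (0.0 if the history is unseen),
    # then retry with the history shortened by its oldest word.
    weight = table.get(history, (0.0, 0.0))[1]
    return weight + backoff_logprob(table, word, history[1:])

table = {
    ("the",): (-1.0, -0.5),
    ("cat",): (-2.0, 0.0),
    ("the", "cat"): (-0.5, 0.0),
}
seen = backoff_logprob(table, "cat", ("the",))    # bigram is stored directly
backed = backoff_logprob(table, "the", ("the",))  # falls back to the unigram
```

Because the values are logarithms, the back-off weight is added rather than multiplied.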
-
-
score(data, history=None)
-
scoreword(word, history=None)
-
class
pynlpl.lm.lm.SimpleLanguageModel(n=2, casesensitive=True, beginmarker='<begin>', endmarker='<end>')
This is a simple unsmoothed language model. This class can both hold and compute the model.
-
append(sentence)
-
load(filename)
-
save(filename)
-
scoresentence(sentence)
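As an illustration of what "unsmoothed" means here, the following sketch counts n-grams and scores a sentence by maximum-likelihood estimates. `TinyNgramModel` is a hypothetical stand-in that only borrows the constructor parameters and method names listed above; its return values and internals are assumptions, not PyNLPl's behaviour.

```python
from collections import Counter

class TinyNgramModel:
    """Unsmoothed n-gram model sketch (hypothetical; not PyNLPl's class)."""

    def __init__(self, n=2, casesensitive=True,
                 beginmarker="<begin>", endmarker="<end>"):
        self.n = n
        self.casesensitive = casesensitive
        self.beginmarker = beginmarker
        self.endmarker = endmarker
        self.ngrams = Counter()    # counts of full n-grams
        self.contexts = Counter()  # counts of the (n-1)-word prefixes

    def _windows(self, sentence):
        words = sentence.split() if isinstance(sentence, str) else list(sentence)
        if not self.casesensitive:
            words = [w.lower() for w in words]
        padded = [self.beginmarker] * (self.n - 1) + words + [self.endmarker]
        for i in range(len(padded) - self.n + 1):
            yield tuple(padded[i:i + self.n])

    def append(self, sentence):
        """Add one sentence's n-gram counts to the model."""
        for ngram in self._windows(sentence):
            self.ngrams[ngram] += 1
            self.contexts[ngram[:-1]] += 1

    def scoresentence(self, sentence):
        """Product of count(ngram) / count(context); 0.0 for anything unseen."""
        prob = 1.0
        for ngram in self._windows(sentence):
            seen = self.contexts[ngram[:-1]]
            if seen == 0:
                return 0.0
            prob *= self.ngrams[ngram] / seen
        return prob

lm = TinyNgramModel(n=2)
lm.append("the cat sat")
lm.append("the dog sat")
known = lm.scoresentence("the cat sat")
unknown = lm.scoresentence("the bird sat")
```

Without smoothing, a single unseen n-gram drives the whole sentence probability to zero, which is why back-off models such as the ARPA one are used in practice.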
-
-
class
pynlpl.lm.srilm.SRILM(filename, n)
-
logscore(ngram)
-
scoresentence(sentence, unknownwordprob=-12)
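One plausible reading of the `unknownwordprob` parameter is a fallback log score for n-grams the model cannot score. The sketch below illustrates that idea only: `score_sentence`, the `<s>`/`</s>` padding, the use of `KeyError` to signal an unknown n-gram, and returning the summed log score are all assumptions, not the documented behaviour of this class.

```python
def score_sentence(words, logscore, n, unknownwordprob=-12):
    """Sum per-n-gram log scores, using a fallback for unknown n-grams.

    `logscore` is any callable that takes an n-gram tuple and raises
    KeyError when it cannot score it. Hypothetical helper; not PyNLPl.
    """
    padded = ["<s>"] * (n - 1) + list(words) + ["</s>"]
    total = 0.0
    for i in range(len(padded) - n + 1):
        ngram = tuple(padded[i:i + n])
        try:
            total += logscore(ngram)
        except KeyError:
            total += unknownwordprob  # fallback log score
    return total

scores = {("<s>", "a"): -1.0, ("a", "b"): -2.0}
total = score_sentence(["a", "b"], scores.__getitem__, n=2)
```

Here the final bigram `("b", "</s>")` is missing from `scores`, so it contributes the fallback value.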
-
-
exception
pynlpl.lm.srilm.SRILMException
Base exception for SRILM.