Language Models
-
class pynlpl.lm.lm.ARPALanguageModel(filename, encoding='utf-8', encoder=None, base_e=True, dounknown=True, debug=False, mode='simple')
Full back-off language model, loaded from file in ARPA format.
This class does not build the model; it only lets you use a pre-computed one. To build the model itself, use an external tool such as ngram-count from SRILM.
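For orientation, here is a minimal sketch of how the text layout of an ARPA file can be read into a lookup table. It is not the parser used by ARPALanguageModel. The function name `parse_arpa` and the sample data are made up for illustration, and the sketch assumes the common layout of one `\N-grams:` section per order, with each entry holding a log10 probability, the words, and an optional back-off weight.

```python
def parse_arpa(text):
    """Parse ARPA-format text into {ngram_tuple: (log10_prob, log10_backoff)}.

    Illustrative only: assumes well-formed input and treats a missing
    back-off weight as 0.0 (that is, log10 of 1).
    """
    table = {}
    order = 0  # 0 means we are not inside an n-gram section
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("\\"):
            # Section headers look like "\2-grams:"; "\data\" and "\end\" reset.
            header = line[1:]
            if header.endswith("-grams:"):
                order = int(header[:-len("-grams:")])
            else:
                order = 0
            continue
        if order == 0:
            continue  # e.g. the "ngram 1=3" count lines under \data\
        fields = line.split()
        logprob = float(fields[0])
        ngram = tuple(fields[1:1 + order])
        backoff = float(fields[1 + order]) if len(fields) > 1 + order else 0.0
        table[ngram] = (logprob, backoff)
    return table


# Hypothetical toy model, written inline so the example is self-contained.
SAMPLE = "\n".join([
    "\\data\\",
    "ngram 1=3",
    "ngram 2=2",
    "",
    "\\1-grams:",
    "-1.0\t<s>\t-0.5",
    "-0.7\thello\t-0.3",
    "-0.9\t</s>",
    "",
    "\\2-grams:",
    "-0.2\t<s> hello",
    "-0.4\thello </s>",
    "",
    "\\end\\",
])

table = parse_arpa(SAMPLE)
```

In a real workflow the file would come from a tool like ngram-count and be passed to ARPALanguageModel by filename, as the signature above indicates.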
-
class NgramsProbs(data, mode='simple', delim=' ')
Store Ngrams with their probabilities and backoffs.
This class is used in order to abstract the physical storage layout, and enable memory/speed tradeoffs.
-
backoff(ngram)
Return backoff value of a given ngram tuple.
-
prob(ngram)
Return probability of given ngram tuple.
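The two accessors above are what a back-off lookup needs. The following is a sketch of the standard back-off recursion, under the assumption that probabilities and back-off weights are stored as log values. `TinyNgramStore` and `backoff_score` are hypothetical stand-ins written for this example; they are not pynlpl classes, and the library's own scoring may differ in detail.

```python
class TinyNgramStore:
    """Hypothetical stand-in exposing prob() and backoff() over plain dicts."""

    def __init__(self, probs, backoffs):
        self._probs = probs        # {ngram tuple: log probability}
        self._backoffs = backoffs  # {ngram tuple: log back-off weight}

    def prob(self, ngram):
        return self._probs[tuple(ngram)]  # raises KeyError if unseen

    def backoff(self, ngram):
        return self._backoffs[tuple(ngram)]  # raises KeyError if absent


def backoff_score(store, ngram):
    """Log probability of the last word of `ngram` given the preceding words.

    Use the stored n-gram if present; otherwise add the back-off weight of
    the history (0.0 when none is stored) and retry with a shorter context.
    """
    ngram = tuple(ngram)
    try:
        return store.prob(ngram)
    except KeyError:
        if len(ngram) == 1:
            raise  # unknown word: nothing shorter to back off to
    try:
        penalty = store.backoff(ngram[:-1])
    except KeyError:
        penalty = 0.0
    return penalty + backoff_score(store, ngram[1:])


# Toy values, invented for illustration.
store = TinyNgramStore(
    probs={("a",): -1.0, ("b",): -0.7, ("a", "b"): -0.2},
    backoffs={("a",): -0.5, ("b",): -0.3},
)
seen = backoff_score(store, ("a", "b"))     # stored bigram, used directly
unseen = backoff_score(store, ("b", "a"))   # backoff("b") + prob("a")
```

Keeping the storage behind `prob` and `backoff` means the recursion does not care whether the data sits in dictionaries, a trie, or something more compact, which is the kind of memory/speed tradeoff the class description mentions.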
-
-
score(data, history=None)
-
scoreword(word, history=None)
-
class pynlpl.lm.lm.SimpleLanguageModel(n=2, casesensitive=True, beginmarker='<begin>', endmarker='<end>')
This is a simple unsmoothed language model. This class can both hold and compute the model.
-
append(sentence)
-
load(filename)
-
save(filename)
-
scoresentence(sentence)
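To make "unsmoothed" concrete, here is a minimal sketch of a maximum-likelihood n-gram model with begin and end markers. `TinyUnsmoothedLM` is a hypothetical class written for this example. It borrows the method names `append` and `scoresentence` and the marker defaults from the signature above, but it is not the SimpleLanguageModel implementation, and the real class may return scores on a different scale (for instance as logarithms).

```python
from collections import Counter


class TinyUnsmoothedLM:
    """Illustrative unsmoothed n-gram model (maximum-likelihood estimates)."""

    def __init__(self, n=2, beginmarker="<begin>", endmarker="<end>"):
        self.n = n
        self.begin = beginmarker
        self.end = endmarker
        self.ngrams = Counter()     # counts of full n-grams
        self.histories = Counter()  # counts of the (n-1)-word contexts

    def _pad(self, sentence):
        # n-1 begin markers so the first word has a full context.
        return [self.begin] * (self.n - 1) + list(sentence) + [self.end]

    def append(self, sentence):
        """Add one tokenised sentence to the counts."""
        tokens = self._pad(sentence)
        for i in range(len(tokens) - self.n + 1):
            ngram = tuple(tokens[i:i + self.n])
            self.ngrams[ngram] += 1
            self.histories[ngram[:-1]] += 1

    def scoresentence(self, sentence):
        """Product of count(ngram) / count(history) over the sentence."""
        tokens = self._pad(sentence)
        p = 1.0
        for i in range(len(tokens) - self.n + 1):
            ngram = tuple(tokens[i:i + self.n])
            seen = self.histories[ngram[:-1]]
            if seen == 0:
                return 0.0  # unseen context: no smoothing to fall back on
            p *= self.ngrams[ngram] / seen
        return p


lm = TinyUnsmoothedLM(n=2)
lm.append("the cat sat".split())
lm.append("the dog sat".split())
known = lm.scoresentence("the cat sat".split())
unknown = lm.scoresentence("the cat ran".split())
```

Because nothing is smoothed, any sentence containing an n-gram that was never appended gets probability zero, which is the main practical limitation of this kind of model.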
-
-
class pynlpl.lm.srilm.SRILM(filename, n)
-
logscore(ngram)
-
scoresentence(sentence, unknownwordprob=-12)
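The page does not document what `unknownwordprob` does. One plausible reading, and it is only an assumption, is that it is a fallback log score used when an n-gram cannot be scored. The sketch below illustrates that idea with a hypothetical `score_sentence` helper and a plain dictionary in place of an SRILM model. It does not call SRILM, it omits any sentence-boundary padding, and it should not be taken as the behaviour of the class above.

```python
def score_sentence(tokens, logscore, n, unknownwordprob=-12.0):
    """Sum log scores over all n-gram windows of `tokens`.

    `logscore` is any callable that returns a log score for an n-gram tuple
    and raises KeyError when it cannot score it (an assumption made for this
    sketch). Unscorable n-grams contribute `unknownwordprob` instead.
    """
    total = 0.0
    for i in range(len(tokens) - n + 1):
        ngram = tuple(tokens[i:i + n])
        try:
            total += logscore(ngram)
        except KeyError:
            total += unknownwordprob
    return total


# Toy bigram log scores, invented for illustration.
toy_scores = {("a", "b"): -0.5, ("b", "c"): -1.0}

all_known = score_sentence(["a", "b", "c"], toy_scores.__getitem__, n=2)
one_unknown = score_sentence(["a", "b", "x"], toy_scores.__getitem__, n=2)
```

A large negative floor like this keeps a single unseen word from making the whole sentence unscorable while still penalising it heavily.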
-
-
exception pynlpl.lm.srilm.SRILMException
Base Exception for SRILM.