Package nltk_lite :: Package tag :: Module ngram :: Class Ngram

Class Ngram

     object --+            
              |            
yaml.YAMLObject --+        
                  |        
               TagI --+    
                      |    
      SequentialBackoff --+
                          |
                         Ngram
Known Subclasses:
contrib.marshal.MarshalNgram, Bigram, Trigram

An n-gram stochastic tagger. Before a tagger.Ngram can be used, it should be trained on a tagged corpus. Using this training data, it constructs a frequency distribution describing how often each word is tagged with each tag in different contexts. The context consists of the word to be tagged and the tags of the n-1 previous words. Once the tagger has been trained, it uses this frequency distribution to tag words by assigning each word the tag with the maximum frequency given its context. If the tagger.Ngram encounters a word in a context for which it has no data, it assigns that word the tag None.
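
The behaviour described above can be illustrated with a small self-contained sketch. It is not the nltk_lite implementation: the class name TinyNgramTagger and its methods are hypothetical, and it leaves out the cutoff and backoff features. It only shows the idea of counting tags per context (previous n-1 tags plus the current word) and picking the most frequent one.

```python
from collections import Counter, defaultdict


class TinyNgramTagger:
    """Hypothetical illustration only; not the nltk_lite Ngram class."""

    def __init__(self, n):
        self.n = n
        # Maps a context to a Counter of the tags seen in that context.
        self.counts = defaultdict(Counter)

    def _context(self, word, prev_tags):
        # Context = tags of the n-1 previous words, plus the word itself.
        k = self.n - 1
        history = tuple(prev_tags[-k:]) if k > 0 else ()
        return (history, word)

    def train(self, tagged_corpus):
        # tagged_corpus: iterable of sentences, each a list of (text, tag).
        for sentence in tagged_corpus:
            prev_tags = []
            for word, tag in sentence:
                self.counts[self._context(word, prev_tags)][tag] += 1
                prev_tags.append(tag)

    def tag(self, words):
        prev_tags, result = [], []
        for word in words:
            seen = self.counts.get(self._context(word, prev_tags))
            # Unseen context: assign None, as the description above states.
            tag = seen.most_common(1)[0][0] if seen else None
            result.append((word, tag))
            prev_tags.append(tag)
        return result
```

In this sketch a None tag becomes part of the history for the following words, so one unseen context tends to produce further None tags. That is one reason a backoff tagger is useful in practice.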

Nested Classes

Inherited from yaml.YAMLObject: __metaclass__, yaml_dumper, yaml_loader

Instance Methods
 
__init__(self, n, cutoff=1, backoff=None)
Construct an n-gram stochastic tagger.
 
train(self, tagged_corpus, verbose=True)
Train this tagger.Ngram using the given training data.
 
tag_one(self, token, history=None)
 
size(self)
 
__repr__(self)
repr(x)

Inherited from SequentialBackoff: tag, tag_sents

Inherited from SequentialBackoff (private): _backoff_tag_one

Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __setattr__, __str__

Class Methods

Inherited from yaml.YAMLObject: from_yaml, to_yaml

Class Variables

Inherited from yaml.YAMLObject: yaml_flow_style, yaml_tag

Properties

Inherited from object: __class__

Method Details

__init__(self, n, cutoff=1, backoff=None)
(Constructor)

Construct an n-gram stochastic tagger. The tagger must be trained using the train() method before being used to tag data.

Parameters:
  • n (int) - The order of the new tagger.Ngram.
  • cutoff (int) - A count-cutoff for the tagger's frequency distribution. If the tagger saw fewer than cutoff examples of a given context in training, then it will return a tag of None for that context.
Overrides: object.__init__
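
The effect of cutoff can be sketched as follows. This is one reading of the parameter description above, not the library's code: the helper name choose_tag is hypothetical, and the real implementation may count differently (for example, per best tag rather than per context, or with a strict inequality).

```python
from collections import Counter


def choose_tag(tag_counts, cutoff=1):
    """Return the most frequent tag for a context, or None below the cutoff.

    tag_counts is a Counter of the tags observed for a single context.
    Hypothetical helper for illustration only.
    """
    total = sum(tag_counts.values())
    # Fewer than `cutoff` examples of this context: no reliable answer.
    if total == 0 or total < cutoff:
        return None
    return tag_counts.most_common(1)[0][0]
```

Under this reading, a higher cutoff makes the tagger return None more often, which leaves more decisions to a backoff tagger if one was supplied.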

train(self, tagged_corpus, verbose=True)

Train this tagger.Ngram using the given training data.

Parameters:
  • tagged_corpus (list or iter(list)) - A tagged corpus. Each item should be a list of tagged tokens, where each tagged token consists of text and a tag.
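
As an illustration of the expected shape of tagged_corpus, the sketch below builds sentences lazily from lines of "word/TAG" text. The helper name iter_tagged_sentences and the "word/TAG" input format are assumptions made for this example; they are not part of nltk_lite. Each tagged token is represented here as a (text, tag) tuple.

```python
def iter_tagged_sentences(lines):
    """Yield one sentence per input line as a list of (text, tag) tuples.

    Hypothetical helper: assumes whitespace-separated "word/TAG" tokens.
    """
    for line in lines:
        sentence = []
        for token in line.split():
            # Split on the last slash so words containing "/" stay intact.
            text, tag = token.rsplit("/", 1)
            sentence.append((text, tag))
        if sentence:
            yield sentence


# A corpus may be a plain list of sentences...
corpus_as_list = [[("the", "DT"), ("dog", "NN")]]
# ...or any iterator that yields such lists.
corpus_as_iter = iter_tagged_sentences(["the/DT dog/NN", "it/PRP barks/VBZ"])
```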

__repr__(self)
(Representation operator)

repr(x)

Overrides: object.__repr__
(inherited documentation)