pynlpl.formats.folia.Reader

class pynlpl.formats.folia.Reader(filename, target, *args, **kwargs)

Bases: object

Streaming FoLiA reader.

The reader lets you read a FoLiA document without holding the whole tree structure in memory. The document is read incrementally, and the elements you seek are returned as they are found. If you are querying a corpus of large FoLiA documents for a specific structure, it is strongly recommended to use the Reader rather than the standard Document.
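The streaming idea described above can be illustrated with the standard library alone. The following is a minimal sketch of the general technique, not the Reader's actual implementation: the helper `stream_elements` and the toy element names are hypothetical, and the sample is not real FoLiA markup.

```python
import io
import xml.etree.ElementTree as ET


def stream_elements(source, tag):
    """Yield each element named `tag` as soon as it has been fully parsed."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            yield elem
            # Discard the element's contents once the caller is done with it,
            # so memory use does not grow with the size of the document.
            elem.clear()


# Toy document; the element names are illustrative, not real FoLiA markup.
SAMPLE = b"<doc><p><s>one</s><s>two</s></p><p><s>three</s></p></doc>"

texts = [s.text for s in stream_elements(io.BytesIO(SAMPLE), "s")]
print(texts)
```

Each matching element is handed to the caller while the rest of the document is still unparsed, which is what makes this approach attractive for large files.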

Method Summary

__init__(filename, target, *args, **kwargs) Read a FoLiA document in a streaming fashion.
findwords(*args, **kwargs)
initdoc()

Method Details

__init__(filename, target, *args, **kwargs)

Read a FoLiA document in a streaming fashion. You select a target element class, and every occurrence of that element is returned together with all of its contents (all elements nested within it).

Parameters:
  • filename (str) – The filename of the document to read.
  • target (class, tuple of classes, or None) – The FoLiA element(s) you want to read, with everything contained in their scope. Pass a single element class, for example folia.Sentence, or a tuple of element classes. Can also be set to None to return all elements, but that loads the full tree structure into memory.
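A minimal usage sketch follows. It assumes that a Reader instance can be iterated over and yields each matching element as it is found; the method summary on this page does not list an iteration method, so treat that as an assumption. It also assumes elements expose a `text()` method. The helper `iter_texts`, the function `main`, and the filename `example.folia.xml` are hypothetical.

```python
def iter_texts(reader):
    """Yield the text of each element produced by `reader`, one at a time."""
    # `reader` is assumed to be iterable and to yield elements with text().
    for element in reader:
        yield element.text()


def main(filename="example.folia.xml"):  # hypothetical filename
    # Imported here so iter_texts can be used without pynlpl installed.
    from pynlpl.formats import folia

    # Stream only Sentence elements instead of loading the whole document.
    reader = folia.Reader(filename, folia.Sentence)
    for text in iter_texts(reader):
        print(text)
```

Calling `main()` with the path to a FoLiA document would print the text of each sentence in turn. Passing a tuple such as `(folia.Sentence, folia.Paragraph)` as the target should select several element types at once, as described in the parameter list.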
findwords(*args, **kwargs)
initdoc()