The Szeged Treebank
Dóra Csendes, János Csirik, Tibor Gyimóthy,
András Kocsor
The major aim of the Szeged Treebank project was to
create a high-quality database of syntactic structures for Hungarian
that can serve as a gold standard for further research in linguistics
and computational language processing. The treebank currently contains
full syntactic parses of about 82,000 sentences, all of which result
from accurate manual annotation. This paper describes the linguistic
theory as well as the actual method used in the annotation process.
In addition, the use of the treebank for training automated
syntactic parsers is presented.