Cesax homepage

 

“Cesax” is a computer program for working with several types of syntactically parsed files. It supports semi-automatic coreference resolution on files in its native psdx format. Cesax can handle the following file types:

 

1. Psd  - Text files that are part of the parsed series of English corpora (YCOE, PPCME2, etc.)
2. Psdx - XML equivalents of psd files
3. Tig  - Negra (Tiger) format (stand-off XML)
4. Xml  - Alpino-produced XML format

 

Conversion options

- Treebank psd files to psdx:
    - Individual psd files: use “Cesax” (File/Import)
    - Batch conversion program “TreebankToXml” (see below)
    - Use the “Cesax” program (Tools/BatchConvertToPsdx)
- Treebank psdx files to psd:
    - Use “Cesax” (Tools/ProducePsd or Tools/ProduceSimplePsd)
- Different language text files to psd:
    - English: use “Cesax” (Tools/EnglishTxtToPsd)
    - German: use “Cesax” (Tools/GermanTxtToPsd)
    - Dependency: use “Cesax” (Tools/ConllXtoPsdx)

Documentation

- Paper on Cesax (submitted to the International Journal of Corpus Linguistics)
- Cesax manual (htm, pdf)
- Helsinki corpus festival (handout)

Download

- Install Cesax

Descriptions of formats used

- Psdx format (xsd)
- Referential chain dictionary (xsd)


Comments: erwin.komen@ru.nl

 

Version information:

Date         Version   Comments
Jan/2012     -         Version information is now available under “Help” in Cesax
15/Nov/2011  1.4.3.2   Facilitated automatic download of the period definitions file from the CorpusStudio website
3/Nov/2011   1.4.3.0   Added “Tree” tab
15/Sep/2011  1.4.2.16  Resolved error with wildcards in chain dictionary
9/Sep/2011   1.4.2.14  Improved calculation of SubjecSwitch using coreferential chains
17/Aug/2011  1.4.2.9   Added feature Reference/List_Coreferential_Chains (see manual)
13/Jul/2011  1.4.2.0   Repaired bug in ChainDictHas; added feature View/Text