EHL – CorpusStudio
Erwin R. Komen
October 2010
Today’s task:
How does the balance between SVO and SOV word order change in subclauses?
Our approach:
Step 1: Make a corpus research project
Fill in the “General” tab.
Step 2: Define queries in the “Query” tab.
Name | Task
subS+O+V | subclauses with S, O, V in any order
subS-O-V | subclauses with order S…O…V
subS-V-O | subclauses with order S…V…O
Step 3: Put the queries in order using the “Construction” tab.
Step 4: Verify the order using the “Hierarchy” tab.
Step 5: Run the queries using Tools/ExecuteConstructor (F10)
Step 6: Look at the results in the “Results” tab.
Step 7: Copy the results to Excel, and make a graph.
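The three queries of Step 2 can be pictured with a small Python sketch. It is not how CorpusStudio runs queries; it only mimics the counting on lists of child labels. The names `classify` and `tally`, the simplified label sets, and the choice to let "V" mean a finite verb are assumptions made for illustration.

```python
# Illustrative only: each subclause is given as the ordered list of labels
# of its immediate children. Label sets are simplified assumptions.
SUBJECT = ("NP-SBJ", "NP-NOM")
OBJECT = ("NP-OB1", "NP-OB2")
VERB = ("BED", "BEP", "HVD", "HVP", "MDD", "MDP", "VBD", "VBP")


def first_index(labels, prefixes):
    """Position of the first label that starts with one of the prefixes."""
    for i, label in enumerate(labels):
        if label.startswith(prefixes):
            return i
    return None


def classify(children):
    """Return the query name a subclause falls under, 'other', or None."""
    s = first_index(children, SUBJECT)
    o = first_index(children, OBJECT)
    v = first_index(children, VERB)
    if s is None or o is None or v is None:
        return None  # lacks S, O or V, so subS+O+V does not count it
    if s < o < v:
        return "subS-O-V"
    if s < v < o:
        return "subS-V-O"
    return "other"


def tally(clauses):
    """Count subclauses per query, as shown in the Results tab."""
    counts = {"subS+O+V": 0, "subS-O-V": 0, "subS-V-O": 0}
    for children in clauses:
        kind = classify(children)
        if kind is None:
            continue
        counts["subS+O+V"] += 1
        if kind in counts:
            counts[kind] += 1
    return counts
```

For example, `classify(["NP-NOM", "NP-OB1", "VBD"])` falls under subS-O-V, while `classify(["NP-SBJ", "VBD", "NP-OB1"])` falls under subS-V-O. The share of each order is then its count divided by the subS+O+V count, which is the figure to plot in Step 7.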
Clause level labels
Constituent | Example | Explanation
IP | IP-MAT | Main clause
IP | IP-SUB | Subclause
IP | IP-INF | Infinitival clause
CP | CP-THT | That clause
CP | CP-REL | Relative clause
CP | CP-CLF | Cleft
CP | CP-ADV | Adverbial clause
Clause constituent labels
Constituent | Example | Explanation
NP | NP | Undetermined NP
NP | NP-SBJ | Subject
NP | NP-NOM | Nominative NP
NP | NP-GEN | Genitive case NP
NP | NP-OB1 | Direct object
NP | NP-OB2 | Indirect object
NP | NP-LFD | Left dislocated NP
NP | NP-MSR | Measure NP
PP | PP | Any PP
PP | PP-LFD | Left dislocated PP
PP | PP-RSP | Resumptive PP
ADVP | ADVP | Adverbial phrase
Verb labels
Verb type | Example | Explanation
to be | BED | Past tense (“was”, “were”)
to be | BEP | Present tense (“is”, “are”)
to be | BE | Infinitive
to have | HVD | Past tense
to have | HVP | Present tense
to have | HV | Infinitive
modals | MDD | Past tense
modals | MDP | Present tense
modals | MD | Infinitive
other verbs | VBD | Past tense
other verbs | VBP | Present tense
other verbs | VB | Infinitive
All label definitions can be found under “Syntactic Labels” at:
http://www-users.york.ac.uk/~lang22/YCOE/YcoeHome.htm
Type | Example | Explanation
Domination | x iDoms y | y is a child of x
Domination | x iDomsOnly y | y is the only child of x
Domination | x iDomsFirst y | y is the first child of x
Domination | x Dominates y | y is a descendant of x
Order | x iPrecedes y | y is the immediately following sibling of x
Order | x iFollows y | y is the immediately preceding sibling of x
Order | x Precedes y | y is a sibling following x (but others may intervene)
Further query commands can be found at:
http://corpussearch.sourceforge.net/CS-manual/SearchFunctions.html
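The domination and order relations listed above can be made concrete with a small Python sketch on a toy tree. It follows the sibling-based wording of this handout; the actual CorpusSearch implementation may define some relations more broadly. The `Node` class and the function names are hypothetical.

```python
class Node:
    """A labelled tree node; children are kept in surface order."""

    def __init__(self, label, *children):
        self.label = label
        self.children = list(children)


def i_doms(x, y):
    """y is a child of x."""
    return any(c is y for c in x.children)


def i_doms_only(x, y):
    """y is the only child of x."""
    return len(x.children) == 1 and x.children[0] is y


def i_doms_first(x, y):
    """y is the first child of x."""
    return bool(x.children) and x.children[0] is y


def dominates(x, y):
    """y is a descendant of x, at any depth."""
    return any(c is y or dominates(c, y) for c in x.children)


def _positions(parent, x, y):
    """Indices of x and y among the children of parent, or None."""
    index = {id(c): i for i, c in enumerate(parent.children)}
    if id(x) in index and id(y) in index:
        return index[id(x)], index[id(y)]
    return None


def i_precedes(parent, x, y):
    """y is the immediately following sibling of x."""
    pos = _positions(parent, x, y)
    return pos is not None and pos[1] == pos[0] + 1


def i_follows(parent, x, y):
    """y is the immediately preceding sibling of x."""
    pos = _positions(parent, x, y)
    return pos is not None and pos[1] == pos[0] - 1


def precedes(parent, x, y):
    """y is a later sibling of x; other siblings may intervene."""
    pos = _positions(parent, x, y)
    return pos is not None and pos[0] < pos[1]


# Toy subclause: (IP-SUB (NP-NOM (PRO^N)) (NP-OB1) (VBD))
pro = Node("PRO^N")
subj = Node("NP-NOM", pro)
obj = Node("NP-OB1")
verb = Node("VBD")
clause = Node("IP-SUB", subj, obj, verb)
```

On this toy tree, `i_doms(clause, subj)` holds but `i_doms(clause, pro)` does not, whereas `dominates(clause, pro)` does. Likewise `i_precedes(clause, subj, verb)` fails because the object intervenes, while `precedes(clause, subj, verb)` holds.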
// --------------------------------------------------------------------
// Name: OE+MEU.def
// Goal: Combined definitions for OE and ME processing
// History:
// 10-07-2009 ERK Combined from Ans van Kemenade
// 18-12-2009 ERK Added definition of "contrast"
// --------------------------------------------------------------------
// --------------------------------------------------------------------------------
// Definitions of different IP categories
// --------------------------------------------------------------------------------
finiteIP: IP-MAT*|IP-SUB*
matrixIP: IP-MAT*
subIP: IP-SUB*
anyCP: CP|CP-*
anyXP: *P-*|*P
negation: NEG-*|NEG
// --------------------------------------------------------------------------------
// Definitions of verbal categories
// --------------------------------------------------------------------------------
nonfiniteverb: *BE|*BAG*|*BEN*|*HV|*HVG*|*HVN*|*AX|*AXG*|*AXN*|*VB|*VAG*|*VAN*|VBN*|VBG*|HAN*|HAG*
finiteverb: BEI|BEP*|BED*|UTP|*HVI|*HVP*|*HVD*|*AXI|*AXP*|*AXD*|*MD|VBI|*VBP*|*VBD*|*DOI|*DOP*|*DOD*|NEG+BEI|NEG+BEP*|NEG+BED*|NEG+AXI|NEG+*AXP*|NEG+*AXD*|NEG+*MD|NEG+VBI|NEG+*VBP*|NEG+*VBD
unaccfiniteverb: BEI|BEP*|BED*|NEG+BEI|NEG+BEP*|NEG+BED*
finite_BE: BEP*|BED*|NEG+BEP*|NEG+BED*
progressive: *ing*|*yng*
// --------------------------------------------------------------------------------
// Definitions of different AP categories
// --------------------------------------------------------------------------------
anypp: PP|PP-*
someap: ADVP-*|ADJP*
timeap: ADVP-TMP*
then_word: then|+ten|+ta|+tonne|than
// --------------------------------------------------------------------------------
// Definitions of different NP categories
// --------------------------------------------------------------------------------
subjectoe: NP-NOM|NP-NOM-#|NP-NOM-RSP
subject: $subjectoe|NP-SBJ*
badsubject: EX
timenp: NP*TMP
anynp: NP|NP-*
leftdisnp: NP-*LFD*
resumpnp: NP-*RSP*
posspro: PRO$|PRO$^*
// --------------------------------------------------------------------------------
// The following definition of an object NP excludes temporal NPs such as
// NP-DAT-TMP and NP-GEN-TMP: the class [A-SU-Z] skips suffixes starting with T
// --------------------------------------------------------------------------------
object:
NP-OB*|NP-DAT|NP-DAT-[A-SU-Z]*|NP-GEN|NP-GEN-[A-SU-Z]*|NP-ACC|
NP-ACC-[A-SU-Z]*
argument: $object|$subjectoe
objectorpp: $object|PP-*
// --------------------------------------------------------------------------------
// Definitions of contents of NPs
// --------------------------------------------------------------------------------
noun: N-*|NR*|FW|*Q*|D*
dem: D^*|D-*|D
pronoun: PRO^N|PRO^A|PRO^G|PRO^D|PRO|DPRO^N|DPRO^A|DPRO^G|DPRO^D|PRO-*
nonpronominal: D*|ADJ*|N*|*Q*|NUM*|FP|FW|CP*|PTP*|V*|RP+V*|CONJ*
// -------------------------------------------------------------------
// Default values for ignore_nodes and ignore_words
// -------------------------------------------------------------------
// ignore_nodes: COMMENT|CODE|ID|LB|'|\"|,|E_S|.|/|RMV:*
// ignore_words: COMMENT|CODE|ID|LB|'|\"|,|E_S|.|/|RMV:*|0|\**
// -------------------------------------------------------------------
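How the definitions above combine can be approximated in Python. This is a sketch only: it uses the standard `fnmatch` module for the `*` and `[...]` wildcards, treats `#` as a single digit, and treats an alternative starting with `$` as a reference to another definition. All three are assumptions about CorpusSearch matching, not its documented behaviour. The names `DEFS`, `expand` and `matches` are hypothetical.

```python
from fnmatch import fnmatchcase

# A few definitions copied from the file above.
DEFS = {
    "subIP": "IP-SUB*",
    "subjectoe": "NP-NOM|NP-NOM-#|NP-NOM-RSP",
    "subject": "$subjectoe|NP-SBJ*",
    "object": "NP-OB*|NP-DAT|NP-DAT-[A-SU-Z]*|NP-GEN|NP-GEN-[A-SU-Z]*"
              "|NP-ACC|NP-ACC-[A-SU-Z]*",
    "argument": "$object|$subjectoe",
}


def expand(name, defs=DEFS):
    """Flatten a definition into a list of patterns, resolving $references."""
    patterns = []
    for alt in defs[name].split("|"):
        alt = alt.strip()
        if alt.startswith("$"):
            patterns.extend(expand(alt[1:], defs))
        else:
            # Assumption: '#' stands for one digit.
            patterns.append(alt.replace("#", "[0-9]"))
    return patterns


def matches(label, name, defs=DEFS):
    """True if the node label matches any alternative of the definition."""
    return any(fnmatchcase(label, p) for p in expand(name, defs))
```

With these assumptions, `matches("NP-DAT-RSP", "object")` holds but `matches("NP-DAT-TMP", "object")` does not, which is the exclusion of temporal NPs described in the comment on the object definition.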