S-rule
From UNLwiki
				
				
Revision as of 07:56, 26 March 2010
S-rule (syntactic rule) is the formalism used for describing syntactic structures and syntactic operations in the UNLarium framework.
When to use S-rules
S-rules are used for:
- composition, i.e., creating compounds out of the base forms (such as "take">"take into account");
- periphrasis, i.e., generating analytic grammatical structures (such as "love">"will love");
- subcategorization, i.e., defining the number and the type of arguments of a given base form;
- case marking, i.e., defining the grammatical cases of the arguments of a given base form;
- agreement, i.e., concord between different parts of a phrase;
- distribution, i.e., defining the precedence of word forms;
- adjacency, i.e., defining the distance between syntactic branches; and
- projection, i.e., projecting syntactic structures out of the constituents.
When not to use S-rules
S-rules are not used for affixation (prefixation, infixation, suffixation) or spelling changes, which must be addressed by A-rules and Ph-rules, respectively.
Types of S-rules
There are four types of S-rules:
- Change
<CONDITION> := <RELATION>;
- Change the attributes of the constituents of the relation. The relation itself is not affected. Features are added through "+" and deleted through "-".
- VA(%head;%adjt):=VA(%head,+C;%adjt,-D); (add the feature C to the head and remove the feature D from the adjunct)
 
- Create
<CONDITION> := +<RELATION>;
- Create a new relation. Nodes to be created must be defined as strings (between quotes) or lemmas (between brackets), if not co-indexed to an existing node.
- VA(%head;%adjt):=+VC(%head;"c"); (add the relation VC between the head and "c", which is created.)
 
- Delete
<CONDITION> := -<RELATION>;
- Delete a relation between the head and the argument. The head and the argument are not deleted.
- VA(%head;%adjt):=-VA(%head;%adjt); (delete the relation VA between the head and its arguments. The nodes are not deleted)
 
- Replace
<RELATION> := <RELATION>;
- Replace the relation in the left side by the one in the right side
- VA(%head;%any):=VC(%head;%any); (replace the relation VA by VC)
 
- Two special cases of replacement are
- Merge
- <RELATION><RELATION> := <RELATION>;
- Replace the relations on the left side by the one on the right side.
- VA(%head;%adjt)VC(%head;%comp):=VB(VB(%head;%adjt);%comp); (VA and VC are deleted, and VB is created)
 
- Divide
- <RELATION> := <RELATION><RELATION>;
- Replace the relation on the left side by those on the right side.
- VA(%head;%adjt):=VC(%head;%x)VC(%head;%y); (VA is deleted, and the two VCs are created)
 
 
Where:
- <CONDITION> (to be repeated 0 or more times) may be a tag or a <RELATION> that defines when the rule is applied. It may be empty in general cases (i.e., if the rule is always applied).
- <RELATION> (to be repeated 1 or more times) is a syntactic relation containing the <HEAD>, in case of head-only relations (VH, NH, JH, PH, IH, CH, AH, DH), or the <HEAD> and <ARGUMENT> (i.e., complement, adjunct or specifier), in case of binary relations (VA, VC, VS, VB, NA, NC, NS, etc).
- <HEAD> and <ARGUMENT> may be expressed as
- a "string" (strings come between quotes);
- a [lemma] (lemmas come between square brackets);
- a feature or a set of features, separated by commas, and extracted from the UNDLF Tagset;
- an index;
- an action, to be performed by adding features (through "+"), deleting features (through "-"), or through the right side of an A-rule (i.e., prefixation, suffixation, infixation); or
- a <RELATION> itself (i.e., rules may be recursive).
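The rule types above can be told apart mechanically from the shape of a rule. The following Python sketch is only an illustration and not part of the UNLarium tools: the function names (`split_relations`, `relation_name`, `classify`) are hypothetical, and the heuristic assumes that a leading "+" or "-" on the right side marks creation or deletion, that identical relation names on both sides mean a change, and that a differing number of relations means a merge or a divide.

```python
def split_relations(side):
    """Split one side of a rule into its top-level relations."""
    relations, depth, in_quotes, start = [], 0, False, 0
    for i, ch in enumerate(side):
        if ch == '"':
            in_quotes = not in_quotes
        elif in_quotes:
            continue  # parentheses inside strings do not count
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                relations.append(side[start:i + 1])
                start = i + 1
    return relations


def relation_name(relation):
    """Return the relation name without any "+", "-" or "^" prefix."""
    return relation.split("(", 1)[0].lstrip("+-^")


def classify(rule):
    """Guess the type of an S-rule (heuristic, see the assumptions above)."""
    body = rule.strip().rstrip(";")
    # Unconditional rules have no ":=" and therefore an empty condition.
    left, right = body.split(":=", 1) if ":=" in body else ("", body)
    right = right.strip()
    if right.startswith("+"):
        return "create"
    if right.startswith("-"):
        return "delete"
    left_names = [relation_name(r) for r in split_relations(left)]
    right_names = [relation_name(r) for r in split_relations(right)]
    if not left_names or left_names == right_names:
        return "change"
    if len(left_names) > len(right_names):
        return "merge"
    if len(left_names) < len(right_names):
        return "divide"
    return "replace"


print(classify('VA(%head;%adjt):=+VC(%head;"c");'))  # create
print(classify("VA(%head;%any):=VC(%head;%any);"))   # replace
```

A real processor would need the full grammar. For instance, a change rule whose condition names other relations would be misclassified by this heuristic.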
 
Observations
- The <CONDITION> field may be empty in change, create and delete rules, in case of unconditional change, creation or deletion. It is obligatory in replace rules.
- VA(+C); (add the feature C to all adjuncts to the head in the verbal phrase)
- +VA("a"); (add an adjunct "a" to the head of the verbal phrase, whatever the case)
- -VA(C); (delete all adjuncts to the head of the verbal phrase that have the feature C)
 
- The <HEAD> and the <ARGUMENT> may be empty in case of no change. Empty heads are automatically extended
- Binary relations (?A, ?S, ?C)
- VA; (neither head nor argument: the relation is automatically extended to "VA(;);" )
- VA(); (same as above)
- VA(;); (same as above)
- VA("a"); (argument only: the relation is automatically extended to "VA(;"a");" )
- VA("a";); (head only)
- VA("a";"b"); (head and argument)
 
- Unary relations (?H)
- VH; (no head: the relation is automatically extended to "VH();" )
- VH(); (same as above)
- VH("a"); (head)
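The automatic extension of abbreviated relations listed above can be sketched in Python as follows. This is a hypothetical helper, not the actual UNLarium behaviour: it assumes that unary relations are exactly those whose name ends in "H" (the "?H" pattern above) and that every other relation is binary.

```python
def has_top_level_semicolon(text):
    """True if text contains a ";" outside quotes and nested parentheses."""
    depth, in_quotes = 0, False
    for ch in text:
        if ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            elif ch == ";" and depth == 0:
                return True
    return False


def extend(relation):
    """Expand an abbreviated relation to its full form."""
    body = relation.strip().rstrip(";")
    name, paren, inner = body.partition("(")
    inner = inner[:-1] if paren else ""  # drop the closing ")"
    unary = name.endswith("H")           # assumption: "?H" relations are head-only
    if unary or has_top_level_semicolon(inner):
        return f"{name}({inner});"
    # Binary relation given without a head: the single node is the argument.
    return f"{name}(;{inner});"


print(extend("VA;"))       # VA(;);
print(extend('VA("a");'))  # VA(;"a");
print(extend("VH;"))       # VH();
```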
 
- Relations are always juxtaposed (they must not be separated by ",")
- VS("b")VC("c")VA("d"); (correct)
- VS("b"),VC("c"),VA("d"); (incorrect)
- Order is not important between relations, but essential between constituents of the same relation
- VS("b")VC("c")VA("d") = VC("c")VA("d")VS("b") = VA("d")VC("c")VS("b")
- VA("a";"b"); ≠ VA("b";"a");
- Arguments of relations may be expressed by A-rules, but only in the right side of rules
- VA(0>"a"); (the verbal adjuncts, if any, receive an "a" as suffix)
- Rules are conservative. Features will be preserved unless explicitly deleted (through "-")
- VC(%comp,ACC):=VC(%comp,NOM); (is the same as "VC(%comp,ACC):=VC(%comp,+NOM);" i.e., add the feature "NOM" to the complements of verb that have the feature "ACC"; the feature "ACC" will be preserved and not replaced by "NOM")
- VC(%comp,ACC):=VC(%comp,NOM,-ACC); (add the feature "NOM" and delete the feature "ACC" from the complements of the verb that have the feature "ACC")
- A node may have as many features as necessary, but one single string or lemma
- VC(%comp,"a"):=VC(%comp,"b"); ("a" is replaced by "b")
- Strings are represented between quotes if invariable, or between brackets if variable (lemmas)
- VA("into account"); (the string "into account" is an adjunct to the verb)
- IH([be]); (the lemma "be" is the head of inflectional phrase)
- Negation
- "^" is used for negation, and may be applied over features, strings or relations:
- VA(^NOU); (if the adjunct does not have the feature "NOU")
- VA(^"a"); (if the adjunct is not the string "a")
- ^VA("a"); (if there is no VA relation between the head and "a")
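Two of the observations above, conservative rules and feature negation, can be illustrated with a small Python sketch. It models a node as a plain set of feature names, which is an assumption made only for this example; `matches` and `apply_actions` are hypothetical names and do not belong to any UNL tool.

```python
def matches(features, condition):
    """True if the node satisfies every condition item; "^F" requires F to be absent."""
    for item in condition:
        if item.startswith("^"):
            if item[1:] in features:
                return False
        elif item not in features:
            return False
    return True


def apply_actions(features, actions):
    """Return a new feature set: "-F" deletes F, "F" or "+F" adds F.

    Features that are not mentioned are preserved (rules are conservative).
    """
    result = set(features)
    for action in actions:
        if action.startswith("-"):
            result.discard(action[1:])
        else:
            result.add(action.lstrip("+"))
    return result


complement = {"ACC"}
print(matches(complement, ["^NOU"]))                       # True
print(sorted(apply_actions(complement, ["NOM"])))          # ['ACC', 'NOM']
print(sorted(apply_actions(complement, ["NOM", "-ACC"])))  # ['NOM']
```

The second call mirrors VC(%comp,ACC):=VC(%comp,NOM); where "ACC" survives. The third mirrors the variant with "-ACC".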
 
- S-rules always end in ";"
- VA("a"); (correct)
- VA("a") (incorrect: the final ";" is missing)
 
Indexes
- Nodes are always indexed in S-rules
- Indexes (%) are used for indexing nodes, attributes and values inside and between the left (condition) and the right side of rules.
- X(%a;%b)Y(%a;%c); (the head of X is also the head of Y)
 
- Indexes as variables
- Indexes are features and may be used as variables
- X(%a;%b)Y(%a;%c):=Z(%b;%c); (if the head of the relation X is the head of the relation Y, delete X and Y and create Z between the arguments of X and Y)
- X(%a,A;%b,B):=X(%a;%b,+C,-B); (add the feature C to the argument of X and remove the feature B from it if the head of X has the feature A)
 
- If omitted, indexes are assigned by default, according to the position
- 
- X(A;B)Y(C;D)Z(E;F); is the same as X(A,%01;B,%02)Y(C,%03;D,%04)Z(E,%05;F,%06);
- X(A;B):=X(;+C,-B); is the same as X(A,%01;B,%02):=X(%01;+C,-B,%02);
- X(A;B):=X(+C,-B); is the same as X(A,%01;B,%02):=X(%01;+C,-B,%02); (same as above: the relation is automatically extended if the head is empty)
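The default positional indexing described above can be reproduced with a short Python sketch. It is a simplification under stated assumptions: it only handles flat relations (no nested relations and no ";" inside quoted strings), and `add_default_indexes` is a hypothetical name.

```python
import re


def add_default_indexes(side):
    """Append %01, %02, ... to the nodes of flat relations, in order of appearance."""
    counter = 0

    def index_relation(match):
        nonlocal counter
        nodes = []
        for node in match.group(2).split(";"):
            counter += 1
            index = f"%{counter:02d}"
            # An empty node is represented by its index alone.
            nodes.append(f"{node},{index}" if node else index)
        return f"{match.group(1)}({';'.join(nodes)})"

    return re.sub(r"(\w+)\(([^()]*)\)", index_relation, side)


print(add_default_indexes("X(A;B)Y(C;D)Z(E;F);"))
# X(A,%01;B,%02)Y(C,%03;D,%04)Z(E,%05;F,%06);
```

The sketch covers only one side of a rule. Co-indexing the right side with the left side would need an additional step.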
 
- However
- X(A;B)Y(A;C):=Z(B;C); is different from X(%a;%b)Y(%a;%c):=Z(%b;%c);
- X(A;B)Y(A;C):=Z(B;C); is the same as X(A,%01;B,%02)Y(A,%03;C,%04):=Z(B,%01;C,%02); while
- X(A,%a;B,%b)Y(A,%a;C,%c):=Z(B,%b;C,%c); is the same as X(A,%01;B,%02)Y(A,%01;C,%04):=Z(B,%02;C,%04);
 
- In the first case, the feature B is added to the head of X and the feature C is added to its argument; the relation Y is deleted. In the second case, the feature C is added to the argument of Y, and Z is created between the arguments of X and Y.
 
- If omitted, right side indexes are automatically co-indexed with the left side ones
- 
- X(;):=Y(;); is the same as X(%01;%02):=Y(%01;%02);
 
- Right side indexes must be explicitly defined if the order is to be altered
- X(;):=Y(%02;%01);
 
- Indexes can be replaced by user-defined labels made of any sequence of alphabetic characters and underscores
- X(A,%a;B,%b)Y(C,%c;D,%d)Z(E,%e;F,%f)
- %01 = A, %02 = B, %03 = C, %04 = D, %05 = E, %06 = F and
- %a = A, %b = B, %c = C, %d = D, %e = E, %f = F
 
- Numeric characters cannot be used as user-defined indexes
- X(A,%03;B,%05)
- %01 = A, %02 = B (there is no %03 nor %05)
 
- To avoid ambiguities, users are strongly recommended to replace default values by customized labels
- X(A,%a;B,%b)
- instead of simply X(A;B) or X(A,%01;B,%02)
 
- In case of sub-nodes, the parent node must be indicated using the syntax <PARENT NODE><CHILD NODE>, where <PARENT NODE> may itself be a sub-node
- X(Y(A;B);C)
- %01 = Y(A;B), %02 = C, %01%01 = A, %01%02 = B
 
- X(Y(Z(A;B);C);D)
- %01 = Y(Z(A;B);C), %02 = D, %01%01 = Z(A;B), %01%02 = C, %01%01%01 = A, %01%01%02 = B
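The sub-node indexing scheme shown in the two examples above can be computed recursively. The Python sketch below is illustrative only: `split_top_level` and `index_nodes` are hypothetical names, the relation is passed without its final ";", and any node containing "(" is assumed to be a nested relation.

```python
def split_top_level(text, separator=";"):
    """Split text on separators that are outside quotes and nested parentheses."""
    parts, depth, in_quotes, current = [], 0, False, ""
    for ch in text:
        if ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            elif ch == separator and depth == 0:
                parts.append(current)
                current = ""
                continue
        current += ch
    parts.append(current)
    return parts


def index_nodes(relation, prefix=""):
    """Map each index path (e.g. "%01%02") to the node it designates."""
    inner = relation[relation.index("(") + 1:-1]
    table = {}
    for position, node in enumerate(split_top_level(inner), start=1):
        path = f"{prefix}%{position:02d}"
        table[path] = node
        if "(" in node:  # the node is itself a relation: index its children
            table.update(index_nodes(node, path))
    return table


for path, node in index_nodes("X(Y(A;B);C)").items():
    print(path, "=", node)
# %01 = Y(A;B)
# %01%01 = A
# %01%02 = B
# %02 = C
```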
 
- Indexation is not affected by repetition
- X(A;B)Y(A;C)Z(A;D)
- %01 = A, %02 = B, %03 = A, %04 = C, %05 = A, %06 = D (and %01 = %03 = %05)
 
- Empty nodes are also indexed
- X(;)
- %01 = first node of X, %02 = second node of X
 
- Indexes may be used both in the left and in the right side of rules
- X(%a;%b):=Y(%b;%a); (the first node of the X relation becomes the second node of the Y relation)
- X(%a;)Y(%a;):=Z(%a); (if the first node of the X relation is the first node of the Y relation then make it the single node of a Z relation)
- Indexes may also be used to transfer attribute values expressed in the format ATTRIBUTE=VALUE
- X(A,%a,ATT1=VAL1;B,%b):=X(%a;%b,ATT1=%a); (the value "VAL1" of "ATT1" of %a is copied to the node %b)
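How indexes act as variables across the two sides of a rule can be sketched in Python as below. The sketch deliberately covers only the simplest case, a rule with one relation per side whose nodes are bare indexes, and represents matched nodes as plain strings. The function name `rewrite` and this data model are assumptions made for the illustration.

```python
import re

RELATION = re.compile(r"(\w+)\((.*)\)")


def rewrite(rule, name, nodes):
    """Apply a one-relation rule with bare-index nodes to a matched relation.

    `nodes` lists the nodes of the matched relation in order. Returns
    (new_name, new_nodes), or None when the left side does not fit.
    """
    left, right = rule.strip().rstrip(";").split(":=", 1)
    source = RELATION.fullmatch(left)
    target = RELATION.fullmatch(right)
    if source is None or target is None or source.group(1) != name:
        return None
    labels = source.group(2).split(";")
    if len(labels) != len(nodes):
        return None
    binding = dict(zip(labels, nodes))  # e.g. {"%a": "n1", "%b": "n2"}
    # Every index on the right side is assumed to be bound on the left side.
    return target.group(1), [binding[label] for label in target.group(2).split(";")]


print(rewrite("X(%a;%b):=Y(%b;%a);", "X", ["n1", "n2"]))  # ('Y', ['n2', 'n1'])
```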
Examples
Examples of S-rules:
- composition
- VA("into account"); (add the string "into account" as the adjunct of the verb)
 
- subcategorization
- VC(PH("in")); (the complement of the verb is a prepositional phrase headed by the preposition "in")
 
- agreement
- VS(ANUM,APER); (the specifier of the verb assigns number (ANUM) and person (APER) to its head)
 
- case marking
- VS(NOM); (the specifier of the verb receives the nominative case (NOM))
 
- distribution
- VA(>>); (the adjunct of the verb comes at the right side of the verb after a blank space)
 
- adjacency
- VA(AJ2); (the adjunct of the verb integrates the second projection of the head)
 
- projection
- VS(%head;%spec)VB(%head;%comp):=VP(VB(%head;%comp);%spec); (integrate the two relations on the left side into a single relation)
 
Formal Syntax
S-rules comply with the following formal syntax:
<S-RULE>                ::= <CONDITION> ":=" (<SYNTACTIC RELATION>)+";"
<CONDITION>             ::= <TAG>(","<TAG>)* | (<SYNTACTIC RELATION>)*
<SYNTACTIC RELATION>    ::= <HEAD-DRIVEN RELATION> "(" (<NODE>";")? <NODE> ")"
<HEAD-DRIVEN RELATION>  ::= {one of the head-driven syntactic relations defined in the UNDLF Tagset} 
<NODE>                  ::= <FEATURE>(","<FEATURE>)* 
<FEATURE>               ::= <ID>|<TAG>|"""<STRING>"""|"["<STRING>"]"|<DIRECTION>|<SYNTACTIC RELATION>|<ACTION>
<ID>                    ::= "%"[a-zA-Z_0-9]+
<TAG>                   ::= {one of the tags defined in the UNDLF Tagset}
<STRING>                ::= [a..Z]+
<DIRECTION>             ::= ">"|">>"|"<"|"<<"
<ACTION>                ::= <PREFIXATION> | <SUFFIXATION> | <INFIXATION> | <REPLACEMENT> (cf. A-rule)
where
<a> = a is a non-terminal symbol
"a" = a is a constant
a | b = a or b
(a)? = a can occur 0 or 1 times
(a)* = a can be repeated 0 or more times
(a)+ = a can be repeated 1 or more times
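As a small illustration, two terminals of the grammar above, <ID> and <DIRECTION>, translate directly into regular expressions. The Python sketch below is not a parser for the whole grammar. The helper names are hypothetical, and `is_default_index` encodes the assumption, taken from the Indexes section, that purely numeric indexes are the positional defaults.

```python
import re

ID = re.compile(r"%[a-zA-Z_0-9]+")    # <ID> ::= "%"[a-zA-Z_0-9]+
DIRECTION = re.compile(r">>|<<|>|<")  # <DIRECTION> ::= ">"|">>"|"<"|"<<"
DEFAULT_INDEX = re.compile(r"%[0-9]+")


def is_id(token):
    """True if the whole token is an index such as %a, %head or %01."""
    return ID.fullmatch(token) is not None


def is_direction(token):
    """True if the whole token is one of the four direction markers."""
    return DIRECTION.fullmatch(token) is not None


def is_default_index(token):
    """True for purely numeric indexes, which are assigned by position."""
    return DEFAULT_INDEX.fullmatch(token) is not None


print(is_id("%head"), is_default_index("%head"))  # True False
print(is_id("%01"), is_default_index("%01"))      # True True
print(is_direction(">>"), is_direction(">>>"))    # True False
```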