S-rule: Difference between revisions

Revision as of 07:56, 26 March 2010

S-rule (syntactic rule) is the formalism used for describing syntactic structures and syntactic operations in the UNL^arium framework.

When to use S-rules

S-rules are used for:

composition, i.e., creating compounds out of the base forms (such as "take">"take into account");
periphrasis, i.e., generating analytic grammatical structures (such as "love">"will love");
subcategorization, i.e., defining the number and the type of arguments of a given base form;
case marking, i.e., defining the grammatical cases of the arguments of a given base form;
agreement, i.e., concord between different parts of a phrase;
distribution, i.e., defining the precedence of word forms;
adjacency, i.e., defining the distance between syntactic branches; and
projection, i.e., projecting syntactic structures out of the constituents.

When not to use S-rules

S-rules are not used for affixation (prefixation, infixation, suffixation) or spelling changes, which must be addressed by A-rules and Ph-rules, respectively.

Types of S-rules

There are four types of S-rules:

Change

<CONDITION> := <RELATION>;

Change the attributes of the constituents of the relation. The relation itself is not affected. Features are added through "+" and deleted through "-".

VA(%head;%adjt):=VA(%head,+C;%adjt,-D); (add the feature C to the head and remove the feature D from the adjunct)

Create

<CONDITION> := +<RELATION>;

Create a new relation. Nodes to be created must be defined as strings (between quotes) or lemmas (between brackets), if not co-indexed to an existing node.

VA(%head;%adjt):=+VC(%head;"c"); (add the relation VC between the head and "c", which is created.)

Delete

<CONDITION> := -<RELATION>;

Delete a relation between the head and the argument. The head and the argument are not deleted.

VA(%head;%adjt):=-VA(%head;%adjt); (delete the relation VA between the head and its arguments. The nodes are not deleted)

Replace

<RELATION> := <RELATION>;

Replace the relation in the left side by the one in the right side.

VA(%head;%any):=VC(%head;%any); (replace the relation VA by VC)

Two special cases of replacement are

Merge

<RELATION><RELATION> := <RELATION>;

Replace the relations in the left side by the ones in the right side.

VA(%head;%adjt)VC(%head;%comp):=VB(VB(%head;%adjt);%comp); (VA and VC are deleted, and VB is created)

Divide

<RELATION> := <RELATION><RELATION>;

Replace the relation in the left side by those in the right side.

VA(%head;%adjt):=VC(%head;%x)VC(%head;%y); (VA is deleted, and the two VCs are created)

Where:

<CONDITION> (to be repeated 0 or more times) may be a tag or a <RELATION> that defines when the rule is applied. It may be empty in general cases (i.e., if the rule is always applied).
<RELATION> (to be repeated 1 or more times) is a syntactic relation containing the <HEAD>, in case of head-only relations (VH, NH, JH, PH, IH, CH, AH, DH), or the <HEAD> and <ARGUMENT> (i.e., complement, adjunct or specifier), in case of binary relations (VA, VC, VS, VB, NA, NC, NS, etc).
<HEAD> and <ARGUMENT> may be expressed as
- a "string" (strings come between quotes);
- a [lemma] (lemmas come between square brackets);
- a feature or a set of features, separated by comma, and extracted from the UNDLF Tagset;
- an index;
- an action, to be performed by adding features (through "+"), deleting features (through "-"), or through the right side of an A-rule (i.e., prefixation, suffixation, infixation); or
- a <RELATION> itself (i.e., rules may be recursive).
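
As a rough illustration of how the rule types above differ on the surface, the following Python sketch classifies a rule string. The function names (top_level_names, classify) are hypothetical and not part of the UNL^arium framework. The sketch assumes that conditions are relations rather than tags, that strings contain no parentheses, and that a rule whose two sides name the same relation is a change rule rather than a replacement.

```python
def top_level_names(side):
    """Return the names of the juxtaposed (outermost) relations in one rule side."""
    names, depth, current = [], 0, ""
    for ch in side:
        if ch == "(":
            if depth == 0:
                names.append(current.strip())
            depth += 1
            current = ""
        elif ch == ")":
            depth -= 1
            current = ""
        elif depth == 0:
            current += ch
    if current.strip():              # a bare relation such as "VA" with no parentheses
        names.append(current.strip())
    return names


def classify(rule):
    """Guess the S-rule type from its surface form (heuristic, see assumptions)."""
    body = rule.strip().rstrip(";")
    left, sep, right = body.partition(":=")
    if not sep:                      # unconditional rule: there is only a right side
        left, right = "", left
    right = right.strip()
    if right.startswith("+"):
        return "create"
    if right.startswith("-"):
        return "delete"
    before, after = top_level_names(left), top_level_names(right)
    if len(before) > 1 and len(after) == 1:
        return "merge"
    if len(before) == 1 and len(after) > 1:
        return "divide"
    if before and before != after:
        return "replace"
    return "change"                  # assumption: same relation name on both sides


for example in [
    "VA(%head;%adjt):=VA(%head,+C;%adjt,-D);",
    "VA(%head;%adjt):=+VC(%head;\"c\");",
    "VA(%head;%any):=VC(%head;%any);",
]:
    print(classify(example), example)
```

The heuristic only inspects relation names and the "+"/"-" prefix, so it says nothing about whether a rule is well-formed.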

Observations

The <CONDITION> field may be empty in change, create and delete rules, in case of unconditional change, creation or deletion. It is obligatory in replace rules.

VA(+C); (add the feature C to all adjuncts to the head in the verbal phrase)
+VA("a"); (add an adjunct "a" to the head of the verbal phrase, whatever the case)
-VA(C); (delete all adjuncts to the head of the verbal phrase that have the feature C)

The <HEAD> and the <ARGUMENT> may be empty in case of no change. Empty heads are automatically extended.

Binary relations (?A, ?S, ?C)

VA; (neither head nor argument: the relation is automatically extended to "VA(;);" )
VA(); (same as above)
VA(;); (same as above)
VA("a"); (argument only: the relation is automatically extended to "VA(;"a");" )
VA("a";); (head only)
VA("a";"b"); (head and argument)

Unary relations (?H)

VH; (no head: the relation is automatically extended to "VH();" )
VH(); (same as above)
VH("a"); (head)
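
The automatic extension shown in the two lists above can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, and the sketch assumes that head-only relations are exactly those whose name ends in "H" (VH, NH, PH, ...) and that every other relation is binary.

```python
import re

UNARY_SUFFIX = "H"  # assumption: head-only relations are named ?H


def has_top_level_semicolon(body):
    """True if ';' occurs outside quoted strings and outside nested relations."""
    depth, in_string = 0, False
    for ch in body:
        if ch == '"':
            in_string = not in_string
        elif not in_string:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            elif ch == ";" and depth == 0:
                return True
    return False


def expand(relation):
    """Expand an abbreviated relation such as 'VA("a");' to its full form."""
    match = re.fullmatch(r'([A-Z]+)(?:\((.*)\))?;', relation.strip())
    if match is None:
        raise ValueError(f"not a relation: {relation!r}")
    name, body = match.group(1), match.group(2) or ""
    if name.endswith(UNARY_SUFFIX):
        return f"{name}({body});"          # unary: a single (possibly empty) head
    if not has_top_level_semicolon(body):
        body = ";" + body                  # a lone node is the argument; the head is empty
    return f"{name}({body});"


for short in ['VA;', 'VA();', 'VA("a");', 'VA("a";);', 'VH;']:
    print(short, "->", expand(short))
```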

Relations are always juxtaposed (they must not be separated by ",")

VS("b")VC("c")VA("d");

~~VS("b"),VC("c"),VA("d");~~

Order is not important between relations, but essential between constituents of the same relation

VS("b")VC("c")VA("d") = VC("c")VA("d")VS("b") = VA("d")VC("c")VS("b")

VA("a";"b"); ≠ VA("b";"a");
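
One way to picture this observation is to treat a rule side as an unordered collection of relations, each of which keeps its nodes in order. The Python sketch below does that for flat relations only (no nested relations, and no ";" or parentheses inside strings); the function names are hypothetical.

```python
import re


def split_relations(text):
    """Split juxtaposed flat relations into (name, (node, ...)) pairs."""
    pairs = re.findall(r'([A-Z]+)\(([^()]*)\)', text)
    return [(name, tuple(body.split(";"))) for name, body in pairs]


def same_rule_side(left, right):
    """Relations may come in any order; nodes inside a relation may not."""
    return sorted(split_relations(left)) == sorted(split_relations(right))


print(same_rule_side('VS("b")VC("c")VA("d")', 'VA("d")VC("c")VS("b")'))
print(same_rule_side('VA("a";"b")', 'VA("b";"a")'))
```

Sorting the pairs makes the comparison independent of the order of relations, while the tuples preserve the head-before-argument order within each relation.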

Arguments of relations may be expressed by A-rules, but only in the right side of rules

VA(0>"a"); (the verbal adjuncts, if any, receive an "a" as suffix)

Rules are conservative. Features will be preserved unless explicitly deleted (through "-")

VC(%comp,ACC):=VC(%comp,NOM); (is the same as "VC(%comp,ACC):=VC(%comp,+NOM);" i.e., add the feature "NOM" to the complements of verb that have the feature "ACC"; the feature "ACC" will be preserved and not replaced by "NOM")

VC(%comp,ACC):=VC(%comp,NOM,-ACC); (add the feature "NOM" and delete the feature "ACC" from the complements of the verb that have the feature "ACC")
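
The conservative behaviour described above can be sketched in Python by modelling a node as a set of features. The function name apply_actions is hypothetical; the sketch only assumes what the two examples state, namely that a bare feature or "+F" adds F, "-F" deletes F, and everything else is preserved.

```python
def apply_actions(features, actions):
    """Return a new feature set after applying S-rule style actions."""
    result = set(features)
    for action in actions:
        if action.startswith("-"):
            result.discard(action[1:])   # explicit deletion
        elif action.startswith("+"):
            result.add(action[1:])       # explicit addition
        else:
            result.add(action)           # a bare feature is treated as an addition
    return result


complement = {"ACC", "NOU"}
print(sorted(apply_actions(complement, ["NOM"])))          # ACC is preserved
print(sorted(apply_actions(complement, ["NOM", "-ACC"])))  # ACC is explicitly deleted
```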

A node may have as many features as necessary, but one single string or lemma

VC(%comp,"a"):=VC(%comp,"b"); ("a" is replaced by "b")

Strings are represented between quotes if invariable, or between brackets if variable (lemmas)

VA("into account"); (the string "into account" is an adjunct to the verb)

IH([be]); (the lemma "be" is the head of inflectional phrase)

Negation

"^" is used for negation, and may be applied over features, strings or relations:

VA(^NOU); (if the adjunct does not have the feature "NOU")
VA(^"a"); (if the adjunct is not the string "a")
^VA("a"); (if there is no VA relation between the head and "a")
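
The first two cases (negation over features and over strings) can be illustrated with a small Python matcher. This is a sketch under simplifying assumptions: a node is modelled as a set of features plus at most one string, the condition is a comma-separated list, and negation over whole relations is not covered. The function name node_matches is hypothetical.

```python
def node_matches(condition, features, string=None):
    """Check a node against a comma-separated condition; '^' negates a test."""
    for test in condition.split(","):
        test = test.strip()
        if not test:
            continue                          # an empty condition matches any node
        negated = test.startswith("^")
        item = test[1:] if negated else test
        if item.startswith('"'):
            holds = string == item.strip('"')  # string test
        else:
            holds = item in features           # feature test
        if holds == negated:
            return False
    return True


print(node_matches("^NOU", {"VER"}))            # node without the feature NOU
print(node_matches('^"a"', set(), string="a"))  # node that is the string "a"
```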

S-rules always end in ";"

VA("a");
~~VA("a")~~

Indexes

Nodes are always indexed in S-rules

Indexes (%) are used for indexing nodes, attributes and values inside and between the left (condition) and the right side of rules.

X(%a;%b)Y(%a;%c); (the head of X is also the head of Y)

Indexes as variables

Indexes are features and may be used as variables

X(%a;%b)Y(%a;%c):=Z(%b;%c); (if the head of the relation X is the head of the relation Y, delete X and Y and create Z between the arguments of X and Y)
X(%a,A;%b,B):=X(%a;%b,+C,-B); (add the feature C to the argument of X and remove the feature B from it if the head of X has the feature A)

If omitted, indexes are assigned by default, according to the position

X(A;B)Y(C;D)Z(E;F); is the same as X(A,%01;B,%02)Y(C,%03;D,%04)Z(E,%05;F,%06);
X(A;B):=X(;+C,-B); is the same as X(A,%01;B,%02):=X(%01;+C,-B,%02);
X(A;B):=X(+C,-B); is the same as X(A,%01;B,%02):=X(%01;+C,-B,%02); (same as above: the relation is automatically extended if the head is empty)

However

X(A;B)Y(A;C):=Z(B;C); is different from X(%a;%b)Y(%a;%c):=Z(%b;%c);

X(A;B)Y(A;C):=Z(B;C); is the same as X(A,%01;B,%02)Y(A,%03;C,%04):=Z(B,%01;C,%02); while
X(A,%a;B,%b)Y(A,%a;C,%c):=Z(B,%b;C,%c); is the same as X(A,%01;B,%02)Y(A,%01;C,%04):=Z(B,%02;C,%04);

In the first case, the feature B is added to the head of X and the feature C is added to its argument; the relation Y is deleted. In the second case, the feature C is added to the argument of Y, and Z is made between the arguments of X and Y.

If omitted, right side indexes are automatically co-indexed with the left side ones

X(;):=Y(;); is the same as X(%01;%02):=Y(%01;%02);

Right side indexes must be explicitly defined if order is to be altered

X(;):=Y(%02;%01);

Indexes can be replaced by user-defined labels made of any sequence of alphabetic characters and underscore

X(A,%a;B,%b)Y(C,%c;D,%d)Z(E,%e;F,%f)

%01 = A, %02 = B, %03 = C, %04 = D, %05 = E, %06 = F and

%a = A, %b = B, %c = C, %d = D, %e = E, %f = F

Numeric characters cannot be used as user-defined indexes

X(A,%03;B,%05)

%01 = A, %02 = B (there is no %03 nor %05)

To avoid ambiguities, users are strongly recommended to replace default values by customized labels

X(A,%a;B,%b)

instead of simply X(A;B) or X(A,%01;B,%02)

In case of sub-nodes, the parent node must be indicated using the syntax <PARENT NODE><CHILD NODE>, where <PARENT NODE> may itself be a sub-node

X(Y(A;B);C)

%01 = Y(A;B), %02 = C, %01%01 = A, %01%02 = B

X(Y(Z(A;B);C);D)

%01 = Y(Z(A;B);C), %02 = D, %01%01 = Z(A;B), %01%02 = C, %01%01%01 = A, %01%01%02 = B

Indexation is not affected by repetition

X(A;B)Y(A;C)Z(A;D)

%01 = A, %02 = B, %03 = A, %04 = C, %05 = A, %06 = D (and %01 = %03 = %05)

Empty nodes are also indexed

X(;)

%01 = first node of X, %02 = second node of X
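
The default positional indexing described above (flat relations, sub-nodes, repeated features and empty nodes) can be reproduced with a short Python sketch. The function names are hypothetical, and the sketch assumes balanced parentheses, no parentheses or ";" inside strings, and at most one nested relation per node.

```python
import re


def top_level_relations(text):
    """Yield (name, body) for each juxtaposed relation, honouring nesting."""
    text = text.strip().rstrip(";")
    i = 0
    while i < len(text):
        start = text.index("(", i)
        name = text[i:start]
        depth, j = 1, start + 1
        while depth:                      # scan to the matching ")"
            depth += {"(": 1, ")": -1}.get(text[j], 0)
            j += 1
        yield name, text[start + 1:j - 1]
        i = j


def parse_nodes(body):
    """Split a relation body on ';' that is not inside a nested relation."""
    nodes, depth, current = [], 0, ""
    for ch in body:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == ";" and depth == 0:
            nodes.append(current)
            current = ""
        else:
            current += ch
    nodes.append(current)
    return nodes


def default_indexes(text, prefix=""):
    """Map default indexes (%01, %02, ...) to nodes, descending into sub-nodes."""
    table, position = {}, 0
    for _name, body in top_level_relations(text):
        for node in parse_nodes(body):
            position += 1
            index = f"{prefix}%{position:02d}"
            table[index] = node
            if re.fullmatch(r"[A-Z]+\(.*\)", node):   # the node is itself a relation
                table.update(default_indexes(node, prefix=index))
    return table


print(default_indexes("X(Y(A;B);C)"))
print(default_indexes("X(;)"))
```

Sub-node indexes are built by prefixing the parent index, which matches the <PARENT NODE><CHILD NODE> notation used in the examples.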

Indexes may be used both in the left and in the right side of rules

X(%a;%b):=Y(%b;%a); (the first node of the X relation becomes the second node of the Y relation)
X(%a;)Y(%a;):=Z(%a); (if the first node of the X relation is the first node of the Y relation, then make it the single node of a Z relation)

Indexes may also be used to transfer attribute values expressed in the format ATTRIBUTE=VALUE

X(A,%a,ATT1=VAL1;B,%b):=X(%a;%b,ATT1=%a); (the value "VAL1" of "ATT1" of %a is copied to the node %b)

Examples

Examples of S-rules:

composition
- VA("into account"); (add the string "into account" as the adjunct of the verb)
subcategorization
- VC(PH("in")); (the complement of the verb is a prepositional phrase headed by the preposition "in")
agreement
- VS(ANUM,APER); (the specifier of the verb assigns number (ANUM) and person (APER) to its head)
case marking
- VS(NOM); (the specifier of the verb receives the nominative case (NOM))
distribution
- VA(>>); (the adjunct of the verb comes at the right side of the verb after a blank space)
adjacency
- VA(AJ2); (the adjunct of the verb integrates the second projection of the head)
projection
- VS(%head;%spec)VB(%head;%comp):=VP(VB(%head;%comp);%spec); (integrate the two relations on the left side into a single relation)

Formal Syntax

S-rules comply with the following formal syntax:

<S-RULE>                ::= <CONDITION> ":=" (<SYNTACTIC RELATION>)+";"
<CONDITION>             ::= <TAG>(","<TAG>)* | (<SYNTACTIC RELATION>)*
<SYNTACTIC RELATION>    ::= <HEAD-DRIVEN RELATION> "(" (<NODE>";")? <NODE> ")"
<HEAD-DRIVEN RELATION>  ::= {one of the head-driven syntactic relations defined in the UNDLF Tagset} 
<NODE>                  ::= <FEATURE>(","<FEATURE>)* 
<FEATURE>               ::= <ID>|<TAG>|"""<STRING>"""|"["<STRING>"]"|<DIRECTION>|<SYNTACTIC RELATION>|<ACTION>
<ID>                    ::= "%"[a-zA-Z_0-9]+
<TAG>                   ::= {one of the tags defined in the UNDLF Tagset}
<STRING>                ::= [a..Z]+
<DIRECTION>             ::= ">"|">>"|"<"|"<<"
<ACTION>                ::= <PREFIXATION> | <SUFFIXATION> | <INFIXATION> | <REPLACEMENT> (cf. A-rule)

where
<a> = a is a non-terminal symbol
"a" = a is a constant
a | b = a or b
(a)? = a can be repeated 0 or 1 time
(a)* = a can be repeated 0 or more times
(a)+ = a can be repeated 1 or more times
