Normalization: Difference between revisions

Latest revision as of 15:02, 16 July 2014

Redirect to:

@@ Line 1: / Line 1: @@
-Normalization is the process of rewriting the input document into a more regular form so that it can be processed more easily. It is carried out by [[N-rules]] and includes:
+#REDIRECT [[N-rule]]
-*replacing abbreviations by their corresponding extended forms
-*replacing short forms by their corresponding long forms
-*replacing periphrases by direct forms
-*replacing contractions by their components
-*defining processing units
-== Replacement ==
-Replacement is carried out by [[N-rules]] written as follows:
- ({SHEAD|" "})("don’t")({STAIL|" "}):=()("do not")();
- ({SHEAD|" "})("art. ")({STAIL|" "}):=()("article")();
- ({SHEAD|" "})("aux")({STAIL|" "}):=()("à les")();
-Where:
-*SHEAD = beginning of the sentence
-*STAIL = end of the sentence
-*({SHEAD|" "}) indicates left context (i.e., either SHEAD or blank space)
-*({STAIL|" "}) indicates right context (i.e., either STAIL or blank space)
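
The context conditions in the removed text can be approximated with regular expressions. The following is only a Python sketch, not the way N-rules are actually executed: `REPLACEMENTS` and `normalize` are hypothetical names, and the sketch assumes that the empty `()` on the right-hand side of each rule leave the left and right context unchanged.

```python
import re

# Surface form -> replacement, copied verbatim from the three rules above.
REPLACEMENTS = {
    "don’t": "do not",
    "art. ": "article",
    "aux": "à les",
}


def normalize(sentence: str) -> str:
    """Replace each surface form when it is bounded by SHEAD/STAIL or a blank space."""
    for source, target in REPLACEMENTS.items():
        # (?<![^ ]) : left context is the start of the sentence (SHEAD) or a space
        # (?![^ ])  : right context is the end of the sentence (STAIL) or a space
        pattern = r"(?<![^ ])" + re.escape(source) + r"(?![^ ])"
        # A function replacement keeps the target text literal (no escape handling).
        sentence = re.sub(pattern, lambda _match, t=target: t, sentence)
    return sentence


if __name__ == "__main__":
    print(normalize("I don’t know"))
    print(normalize("je parle aux enfants"))
```

Because the contexts are expressed as lookarounds, they are checked but not consumed, so a form such as `aux` inside `faux` or `auxiliary` is left untouched.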