Normalization is the process of preparing the input document so that it can be processed more reliably. It is carried out by [[N-rules]] and includes:
*replacing abbreviations with their corresponding expanded forms
*replacing short forms with their corresponding long forms
*replacing periphrases with direct forms
*replacing contractions with their components
*defining processing units
== Replacement ==
Replacement is carried out by [[N-rules]] written as follows:
 ({SHEAD|" "})("don’t")({STAIL|" "}):=()("do not")();
 ({SHEAD|" "})("art. ")({STAIL|" "}):=()("article")();
 ({SHEAD|" "})("aux")({STAIL|" "}):=()("à les")();
Where:
*SHEAD = beginning of the sentence
*STAIL = end of the sentence
*({SHEAD|" "}) indicates the left context (i.e., either SHEAD or a blank space)
*({STAIL|" "}) indicates the right context (i.e., either STAIL or a blank space)
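The effect of these replacement rules can be approximated outside the N-rule formalism. The following Python sketch is only an illustration and not part of the UNL tooling: the names <code>RULES</code> and <code>apply_rules</code> are hypothetical. It assumes that the left and right contexts are checked but left unchanged, which is how the empty <code>()</code> on the right-hand side of each rule is read here.

```python
import re

# Hypothetical (target, replacement) pairs mirroring the three rules above.
RULES = [
    ("don’t", "do not"),
    ("art. ", "article"),
    ("aux", "à les"),
]

def apply_rules(sentence, rules=RULES):
    """Replace each target only when it is delimited like the N-rule contexts."""
    for target, replacement in rules:
        pattern = (
            r"(?<![^ ])"         # left context: sentence start (SHEAD) or a blank space
            + re.escape(target)  # the literal string to be replaced
            + r"(?![^ ])"        # right context: sentence end (STAIL) or a blank space
        )
        # Assumption: the contexts are matched as lookarounds, so they are not consumed.
        sentence = re.sub(pattern, replacement, sentence)
    return sentence

print(apply_rules("I don’t know"))          # I do not know
print(apply_rules("Il parle aux enfants"))  # Il parle à les enfants
print(apply_rules("faux pas"))              # faux pas (no blank space before "aux")
```

The lookarounds play the role of <code>({SHEAD|" "})</code> and <code>({STAIL|" "})</code>: a target is replaced only when it stands alone between blank spaces or sentence boundaries, so "aux" inside "faux" is left untouched.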