Normalization: Difference between revisions

From UNLwiki
imported>Martins
No edit summary
 
Normalization is the process of standardizing the input document so that it can be processed more reliably. It is carried out by [[N-rules]] and includes:
*replacing abbreviations by their corresponding extended forms
*replacing short forms by their corresponding long forms
*replacing periphrases by their corresponding direct forms
*replacing contractions by their components
*defining processing units
 
== Replacement ==
Replacement is carried out by [[N-rules]] written as follows:
({SHEAD|" "})("don’t")({STAIL|" "}):=()("do not")();
({SHEAD|" "})("art. ")({STAIL|" "}):=()("article")();
({SHEAD|" "})("aux")({STAIL|" "}):=()("à les")();
Where:
*SHEAD = beginning of the sentence
*STAIL = end of the sentence
*({SHEAD|" "}) indicates left context (i.e., either SHEAD or blank space)
*({STAIL|" "}) indicates right context (i.e., either STAIL or blank space)
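
The effect of the context conditions above can be approximated outside the UNL tools with ordinary regular expressions. The following Python sketch is only an illustration, not the N-rule engine itself: the names <code>REPLACEMENTS</code> and <code>normalize</code> are hypothetical, the trailing space in "art. " is dropped for simplicity, and the contexts are modelled as lookarounds, so that a match must be preceded by the start of the sentence or a blank space and followed by the end of the sentence or a blank space.

```python
import re

# Hypothetical replacement table mirroring the three example rules.
# Assumption: "art. " is stored without its trailing space.
REPLACEMENTS = [
    ("don’t", "do not"),
    ("art.", "article"),
    ("aux", "à les"),
]


def normalize(sentence: str) -> str:
    """Apply each replacement only when it stands between the given contexts."""
    for source, target in REPLACEMENTS:
        # (?<![^ ]) : no non-space character before the match,
        #             i.e. start of sentence (SHEAD) or a blank space
        # (?![^ ])  : no non-space character after the match,
        #             i.e. end of sentence (STAIL) or a blank space
        pattern = r"(?<![^ ])" + re.escape(source) + r"(?![^ ])"
        # A function is used as the replacement so the target is inserted literally.
        sentence = re.sub(pattern, lambda _match, t=target: t, sentence)
    return sentence
```

Under these assumptions, <code>normalize("aux armes")</code> rewrites the contraction, while a word such as "faux" is left untouched because "aux" is not preceded by a blank space or the beginning of the sentence.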

Latest revision as of 15:02, 16 July 2014

Redirect to: