ASCII normalization is a custom OurBigBook-defined normalization that converts many characters that look like Latin letters into plain ASCII Latin letters.
For now, we are using the deburr method of Lodash: lodash.com/docs/4.17.15#deburr, which only affects Latin-like characters.
One notable effect is that it converts variants of ASCII letters to ASCII letters, e.g. é to e, removing the accent.
This operation is roughly a superset of Unicode normalization restricted to Latin-like characters: Unicode normalization, combined with dropping combining marks, basically only removes things like diacritics.
OurBigBook normalization also does other natural transformations that Unicode does not do, e.g. æ to ae.
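The difference between the two approaches can be sketched in plain JavaScript. This is only an illustration, not OurBigBook's actual implementation and not Lodash's code: the function names stripDiacritics and asciiNormalizeSketch and the tiny EXTRA mapping table are hypothetical, and a real deburr covers many more letters.

```javascript
// Unicode-only approach: decompose to NFD, then drop combining marks
// (U+0300..U+036F). This removes accents but leaves letters such as
// ae-ligature untouched, because they have no Unicode decomposition.
function stripDiacritics(s) {
  return s.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
}

// Hypothetical, deliberately tiny table of extra letter mappings that
// Unicode normalization does not perform. Assumption for illustration only.
const EXTRA = {
  '\u00e6': 'ae', // æ
  '\u00c6': 'Ae', // Æ
  '\u00df': 'ss', // ß
};

// Sketch of an ASCII normalization: apply the extra mappings first,
// then strip diacritics from whatever remains.
function asciiNormalizeSketch(s) {
  const mapped = s.replace(/[\u00e6\u00c6\u00df]/g, (c) => EXTRA[c]);
  return stripDiacritics(mapped);
}

console.log(stripDiacritics('\u00e9'));      // é -> "e"
console.log(stripDiacritics('\u00e6'));      // æ stays "æ"
console.log(asciiNormalizeSketch('\u00e6')); // æ -> "ae"
```

The first two calls show what Unicode normalization alone achieves, while the last call shows the kind of extra transformation that a deburr-style step adds on top.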