Having had a quick look at this, I see that the concept of canonical equivalence is built into the string objects of Objective-C and Swift (the Mac OS and iOS programming languages) at a very low level, so string comparisons can easily take the full range of Unicode subtleties into account. In fact, Swift's basic string comparison operator "==" operates on canonically equivalent "extended grapheme clusters", and you have to use special methods to look "below" them. As far as I can judge, this amounts to Unicode conformance in this area, but the interface of a professional text editor should ideally also offer a way to look inside these abstract characters (and to apply the different normalizations to the underlying UTF-16 units).
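The behaviour Swift builds in can be illustrated by hand in Python, whose plain "==" compares code points rather than canonical equivalents. This is just a sketch of the concept using the standard unicodedata module, not Swift's actual implementation:

```python
import unicodedata

# Two canonically equivalent spellings of "café":
precomposed = "caf\u00e9"    # é as one code point (U+00E9)
decomposed = "cafe\u0301"    # e followed by combining acute accent (U+0301)

# Python's == compares code points, so these look different...
print(precomposed == decomposed)    # False

# ...but after canonical normalization (NFC here) they compare equal,
# which is the comparison Swift's == performs automatically.
print(precomposed == unicodedata.normalize("NFC", decomposed))    # True

# Looking "below" the abstract character at the underlying scalars:
print([hex(ord(c)) for c in precomposed])
print([hex(ord(c)) for c in unicodedata.normalize("NFD", precomposed)])
```

Swift exposes the same "look below" operation through a string's unicodeScalars and utf16 views, while its default Character type keeps the grapheme cluster whole.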
I can't imagine anyone having the time to read this, but I list here some of the pages I found helpful.
On 20 Feb 2015 at 17:47:20, Jens Østergaard Petersen ([log in to unmask]) wrote:
The obvious way of doing it is to apply canonical decomposition to
the search string and the searched text before doing the search;
you'd have to reconstruct the precise offsets in the original text
to get correct results. But that doesn't account for differences
removed only by compatibility normalization (decomposition of
ligatures into component glyphs). If you put this into Chocolat: "woﬄe"
and then search for "woffle", does it find it?
Yes. So do the other apps I mentioned.
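The decompose-then-search approach described above, including mapping the match back to offsets in the original text, can be sketched in Python with the standard unicodedata module. The function name normalized_find is my own; it is not what any of the editors mentioned actually uses. Using NFKD (compatibility decomposition) rather than NFD is what makes the "woffle"/ligature case work:

```python
import unicodedata

def normalized_find(haystack, needle, form="NFKD"):
    """Find `needle` in `haystack` under Unicode normalization and
    return (start, end) offsets into the ORIGINAL haystack, or None.

    Sketch only: normalizing character by character keeps the offset
    bookkeeping simple, but misses normalizations that span character
    boundaries (e.g. combining-mark reordering), which a production
    implementation would have to handle."""
    norm_needle = unicodedata.normalize(form, needle)
    pieces, offsets = [], []
    for i, ch in enumerate(haystack):
        n = unicodedata.normalize(form, ch)
        pieces.append(n)
        offsets.extend([i] * len(n))  # each normalized unit remembers its source index
    norm_hay = "".join(pieces)
    j = norm_hay.find(norm_needle)
    if j < 0:
        return None
    start = offsets[j]
    end = offsets[j + len(norm_needle) - 1] + 1
    return (start, end)

# NFKD splits the ﬄ ligature (U+FB04) into f-f-l, so searching for the
# plain spelling "woffle" finds the ligated text:
text = "a wo\ufb04e sandwich"
print(normalized_find(text, "woffle"))    # (2, 6); text[2:6] == "woﬄe"
```

Canonical cases fall out of the same code: searching for "cafe" plus a combining acute finds a precomposed "café", with the offsets pointing at the original four characters.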