Having had a quick look at this, I see that the concept of canonical equivalence is built into the string objects of Objective-C and Swift (the Mac OS and iOS programming languages) at a very low level, so string comparisons can easily take the full range of Unicode subtleties into account. In fact, in Swift the basic string comparison operator "==" compares canonically equivalent "extended grapheme clusters", and you have to use special methods to look "below" these. As far as I can judge, this amounts to Unicode conformance in this area, but the interface of a pro text editor should ideally also offer a way to look inside these abstract characters (and apply the different normalizations to the underlying UTF-16 units).
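Swift's "==" applies this equivalence implicitly; in most other environments you have to request it. A minimal sketch of the same idea in Python (chosen just for brevity, using the standard unicodedata module — nothing here is a Swift API):

```python
import unicodedata

# "café" written two canonically equivalent ways:
composed = "caf\u00e9"     # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "cafe\u0301"  # "e" followed by U+0301 COMBINING ACUTE ACCENT

# A plain code-point comparison sees them as different strings...
print(composed == decomposed)  # False

# ...but after canonical normalization (NFC here; NFD works equally well)
# they compare equal, which is roughly what Swift's "==" gives you for free.
nfc = lambda s: unicodedata.normalize("NFC", s)
print(nfc(composed) == nfc(decomposed))  # True
```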

I can't imagine anyone having the time to read this, but I list here some of the pages I found helpful.



On 20 Feb 2015 at 17:47:20, Jens Østergaard Petersen ([log in to unmask]) wrote:

On 20 Feb 2015 at 17:06:52, Martin Holmes ([log in to unmask]) wrote:
The obvious way of doing it is to apply canonical decomposition to both 
the search string and the searched text before doing the search; but 
you'd have to reconstruct the precise offsets in the original text to 
get correct results. But that doesn't account for differences introduced 
by compatibility normalization (e.g. decomposition of ligatures into 
their component letters). If you put this into Chocolat: 


and then search for "woffle", does it find it? 

Yes. So do the other apps I mentioned.
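The decompose-then-search approach Martin describes — normalize both the search string and the text, search, then map the match back to offsets in the original — can be sketched as follows. This is only an illustration, not code from Chocolat or any of the other apps: search_normalized is a hypothetical name, and normalizing one character at a time glosses over canonical reordering of combining marks across character boundaries. Passing form="NFKD" additionally folds compatibility characters, which is what lets a search for "woffle" find a text containing the ﬄ ligature.

```python
import unicodedata

def search_normalized(haystack, needle, form="NFD"):
    """Find `needle` in `haystack` under Unicode equivalence (a sketch).

    The haystack is normalized one character at a time so that every
    normalized code point remembers which original character it came from;
    the match position is then mapped back to offsets in the original text.
    form="NFD" gives canonical equivalence; form="NFKD" also folds
    compatibility characters such as ligatures.
    Returns (start, end) offsets into the original haystack, or None.
    """
    norm_chars = []  # normalized code points, flattened
    origin = []      # origin[i] = haystack index the i-th code point came from
    for i, ch in enumerate(haystack):
        for cp in unicodedata.normalize(form, ch):
            norm_chars.append(cp)
            origin.append(i)
    norm_needle = unicodedata.normalize(form, needle)
    pos = "".join(norm_chars).find(norm_needle)
    if pos == -1:
        return None
    # Map the normalized match span back to original-text offsets.
    return (origin[pos], origin[pos + len(norm_needle) - 1] + 1)
```

For example, search_normalized("caf\u00e9 latte", "cafe\u0301") returns (0, 4) — the span of "café" in the original text — and with form="NFKD" a search for "woffle" will locate "wo\ufb04e" (spelled with the U+FB04 ligature) even though the code points never match directly.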