On 15-02-20 06:47 AM, Jens Østergaard Petersen wrote:
>> > I notice that the standard Apple text editor TextEdit and the Mac word
>> > processor Nisus Writer Pro normalize when searching while leaving input
>> > unnormalized, not just with accents, but also with ligatures. This shows
>> > that it is possible to have an app which searches for canonically
>> > equivalent strings in unnormalized input (what I dreamt about for oXygen
>> > and what Syd wanted).
>> It's a noble goal. But as someone pointed out, given any substantial
>> quantity of text, the combinatorial explosion of possibilities would be
> Yes, if the script kiddie approach I suggested was used. How these apps
> manage to do this I have no clue: they must plug into some magic in the
> Mac text engine. Anyway, it is not just a noble goal, but actually works
> (no slowdown of any kind can be seen). I would love to hear of apps
> running on other operating systems that do the same.
> The feature also applies to Chocolat <https://chocolatapp.com/>.
The obvious way of doing it is to apply canonical decomposition to both
the search string and the searched text before doing the search, but
you'd have to map each match back to its precise offsets in the original
text to get correct results. Even then, that doesn't account for
differences that only compatibility normalization removes (decomposition
of ligatures into their component letters). If you put this into Chocolat:
and then search for "woffle", does it find it?
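
For what it's worth, here is a rough Python sketch of the decompose-and-map-back
idea, using only the standard unicodedata module. The function names
(fold_with_offsets, find_normalized) are made up for illustration, and I have no
idea whether the Mac text engine does anything like this. It normalizes one
character at a time so that offsets are trivial to track, which means it does
not reorder combining marks across characters the way full normalization would.
The ligature in the usage lines (U+FB04) is just a sample I picked.

```python
import unicodedata


def fold_with_offsets(text, form="NFKD"):
    """Normalize text one character at a time.

    Returns the folded string plus a list giving, for every folded
    character, the index of the original character it came from.
    """
    folded = []
    offsets = []
    for i, ch in enumerate(text):
        for out in unicodedata.normalize(form, ch):
            folded.append(out)
            offsets.append(i)
    return "".join(folded), offsets


def find_normalized(haystack, needle, form="NFKD"):
    """Return (start, end) slices of the ORIGINAL haystack that match needle
    once both have been folded with the given normalization form."""
    target = unicodedata.normalize(form, needle)
    if not target:
        return []
    folded, offsets = fold_with_offsets(haystack, form)
    hits = []
    start = folded.find(target)
    while start != -1:
        last = start + len(target) - 1
        # Map the folded match back to original character positions.
        hits.append((offsets[start], offsets[last] + 1))
        start = folded.find(target, start + 1)
    return hits


# Sample input: "wo" + U+FB04 (LATIN SMALL LIGATURE FFL) + "e"
sample = "a wo\uFB04e"
print(find_normalized(sample, "woffle"))              # compatibility folding
print(find_normalized(sample, "woffle", form="NFD"))  # canonical folding only
```

With form="NFD" the ligature is left alone, so only the compatibility form
lets "woffle" match; accents work with either form.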