Commit 7e0df5e

girishji authored and chrisbra committed
patch 9.1.1627: fuzzy matching can be improved
Problem:  fuzzy-matching can be improved
Solution: Implement a better fuzzy matching algorithm (Girish Palya)

Replace fuzzy matching algorithm with improved fzy-based implementation

The [current](https://www.forrestthewoods.com/blog/reverse_engineering_sublime_texts_fuzzy_match/) fuzzy matching algorithm has several accuracy issues:

* It struggles with CamelCase
* It fails to prioritize matches at the beginning of strings, often ranking middle matches higher.

After evaluating alternatives (see my comments [here](vim/vim#17531 (comment)) and [here](vim/vim#17531 (comment))), I chose to adopt the [fzy](https://github.com/jhawthorn/fzy) algorithm, which:

* Resolves the aforementioned issues.
* Performs better.

Implementation details

This version is based on the original fzy [algorithm](https://github.com/jhawthorn/fzy/blob/master/src/match.c), with one key enhancement: **multibyte character support**.

* The original implementation supports only ASCII.
* This patch replaces ascii lookup tables with function calls, making it compatible with multibyte character sets.
* Core logic (`match_row()` and `match_positions()`) remains faithful to the original, but now operates on codepoints rather than single-byte characters.

Performance

Tested against a dataset of **90,000 Linux kernel filenames**. Results (in milliseconds) show a **~2x performance improvement** over the current fuzzy matching algorithm.

```
Search String                   Current Algo    FZY Algo
--------------------------------------------------------
init                            131.759         66.916
main                            83.688          40.861
sig                             98.348          39.699
index                           109.222         30.738
ab                              72.222          44.357
cd                              83.036          54.739
a                               58.94           62.242
b                               43.612          43.442
c                               64.39           67.442
k                               40.585          36.371
z                               34.708          22.781
w                               38.033          30.109
cpa                             82.596          38.116
arz                             84.251          23.964
zzzz                            35.823          22.75
dimag                           110.686         29.646
xa                              43.188          29.199
nha                             73.953          31.001
nedax                           94.775          29.568
dbue                            79.846          25.902
fp                              46.826          31.641
tr                              90.951          55.883
kw                              38.875          23.194
rp                              101.575         55.775
kkkkkkkkkkkkkkkkkkkkkkkkkkkkk   48.519          30.921
```

```vim
vim9script

var haystack = readfile('/Users/gp/linux.files')

var needles = ['init', 'main', 'sig', 'index', 'ab', 'cd', 'a', 'b',
    'c', 'k', 'z', 'w', 'cpa', 'arz', 'zzzz', 'dimag', 'xa', 'nha',
    'nedax', 'dbue', 'fp', 'tr', 'kw', 'rp',
    'kkkkkkkkkkkkkkkkkkkkkkkkkkkkk']

for needle in needles
    var start = reltime()
    var tmp = matchfuzzy(haystack, needle)
    echom $'{needle}' (start->reltime()->reltimefloat() * 1000)
endfor
```

Additional changes

* Removed the "camelcase" option from both matchfuzzy() and matchfuzzypos(), as it's now obsolete with the improved algorithm.

related: neovim/neovim#34101
fixes #17531
closes: #17900

Signed-off-by: Girish Palya <[email protected]>
Signed-off-by: Christian Brabandt <[email protected]>
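To make the scoring idea in the message above concrete, here is a minimal sketch of an fzy-style dynamic program in C. It is not the code added in `src/fuzzy.c`: it handles ASCII only (the patch works on codepoints), the names `bonus_at()` and `fuzzy_score()` are hypothetical, and the constants are illustrative values modeled on fzy's defaults, so treat them as assumptions.

```c
#include <ctype.h>
#include <math.h>
#include <string.h>

/* Illustrative constants modeled on fzy's defaults (assumed, not Vim's). */
#define MAX_LEN            64
#define GAP_LEADING        (-0.005)
#define GAP_TRAILING       (-0.005)
#define GAP_INNER          (-0.01)
#define MATCH_CONSECUTIVE  1.0
#define BONUS_SLASH        0.9
#define BONUS_WORD         0.8
#define BONUS_CAPITAL      0.7
#define BONUS_DOT          0.6

static double max2(double a, double b) { return a > b ? a : b; }

/* Bonus for matching hay[j], decided by the character before it.
 * The start of the string is treated as if it followed a '/'. */
static double bonus_at(const char *hay, size_t j)
{
    unsigned char cur = (unsigned char)hay[j];
    unsigned char prev = j == 0 ? '/' : (unsigned char)hay[j - 1];

    if (!isalnum(cur))
        return 0.0;
    if (prev == '/')
        return BONUS_SLASH;
    if (prev == '-' || prev == '_' || prev == ' ')
        return BONUS_WORD;
    if (prev == '.')
        return BONUS_DOT;
    if (isupper(cur) && islower(prev))
        return BONUS_CAPITAL;   /* CamelCase boundary */
    return 0.0;
}

/* Hypothetical helper: score of the best alignment of needle in hay,
 * or -INFINITY when needle is not a subsequence of hay.
 * D[.][j] = best score with needle[i] matched exactly at hay[j];
 * M[.][j] = best score using hay[0..j]. Only two rows are kept. */
double fuzzy_score(const char *needle, const char *hay)
{
    size_t n = strlen(needle), m = strlen(hay);
    double D[2][MAX_LEN], M[2][MAX_LEN];

    if (n == 0 || n > m || m > MAX_LEN)
        return -INFINITY;

    for (size_t i = 0; i < n; i++) {
        double *cur_d = D[i & 1], *cur_m = M[i & 1];
        const double *last_d = D[(i & 1) ^ 1], *last_m = M[(i & 1) ^ 1];
        double gap = (i == n - 1) ? GAP_TRAILING : GAP_INNER;
        double prev = -INFINITY;

        for (size_t j = 0; j < m; j++) {
            double score = -INFINITY;
            if (tolower((unsigned char)needle[i]) ==
                tolower((unsigned char)hay[j])) {
                if (i == 0)
                    score = (double)j * GAP_LEADING + bonus_at(hay, j);
                else if (j > 0)
                    /* a consecutive match does not stack with the bonus */
                    score = max2(last_m[j - 1] + bonus_at(hay, j),
                                 last_d[j - 1] + MATCH_CONSECUTIVE);
            }
            cur_d[j] = score;
            cur_m[j] = prev = max2(score, prev + gap);
        }
    }
    return M[(n - 1) & 1][m - 1];
}
```

Under these assumed constants, `fuzzy_score("fb", "foo_bar")` and `fuzzy_score("fb", "FooBar")` both rank above `fuzzy_score("fb", "foobar")`, and `fuzzy_score("ab", "abxx")` ranks above `fuzzy_score("ab", "xxab")`, which is the CamelCase and start-of-string behaviour the message describes.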
1 parent 5ba6e41 commit 7e0df5e

25 files changed

Lines changed: 1289 additions & 1311 deletions

Filelist

Lines changed: 2 additions & 0 deletions
@@ -82,6 +82,7 @@ SRC_ALL = \
 src/findfile.c \
 src/float.c \
 src/fold.c \
+src/fuzzy.c \
 src/getchar.c \
 src/gc.c \
 src/globals.h \
@@ -291,6 +292,7 @@ SRC_ALL = \
 src/proto/findfile.pro \
 src/proto/float.pro \
 src/proto/fold.pro \
+src/proto/fuzzy.pro \
 src/proto/getchar.pro \
 src/proto/gc.pro \
 src/proto/gui.pro \

runtime/doc/builtin.txt

Lines changed: 1 addition & 4 deletions
@@ -1,4 +1,4 @@
-*builtin.txt* For Vim version 9.1. Last change: 2025 Aug 10
+*builtin.txt* For Vim version 9.1. Last change: 2025 Aug 12


 VIM REFERENCE MANUAL by Bram Moolenaar
@@ -7421,9 +7421,6 @@ matchfuzzy({list}, {str} [, {dict}]) *matchfuzzy()*
 given sequence.
 limit Maximum number of matches in {list} to be
 returned. Zero means no limit.
-camelcase Use enhanced camel case scoring making results
-better suited for completion related to
-programming languages. Defaults to v:true.

 If {list} is a list of dictionaries, then the optional {dict}
 argument supports the following additional items:

runtime/doc/pattern.txt

Lines changed: 4 additions & 1 deletion
@@ -1,4 +1,4 @@
-*pattern.txt* For Vim version 9.1. Last change: 2025 Aug 06
+*pattern.txt* For Vim version 9.1. Last change: 2025 Aug 12


 VIM REFERENCE MANUAL by Bram Moolenaar
@@ -1509,6 +1509,9 @@ characters in the search string. If the search string has multiple words, then
 each word is matched separately. So the words in the search string can be
 present in any order in a string.

+Vim uses the same improved algorithm as the fzy project:
+https://github.com/jhawthorn/fzy
+
 Fuzzy matching assigns a score for each matched string based on the following
 criteria:
 - The number of sequentially matching characters.
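
The help text in this hunk says that each word of a multi-word search string is matched separately, so the words may appear in any order. A minimal C sketch of that filtering step follows. It only tests whether every word is a case-insensitive subsequence of the candidate; how Vim combines the per-word scores is not shown in this diff, and the names `is_subsequence()` and `all_words_match()` are hypothetical.

```c
#include <ctype.h>
#include <string.h>

/* Return 1 if the first len characters of word appear in hay in order
 * (case-insensitive subsequence test), else 0. */
static int is_subsequence(const char *word, size_t len, const char *hay)
{
    size_t i = 0;

    for (; *hay != '\0' && i < len; hay++)
        if (tolower((unsigned char)*hay) == tolower((unsigned char)word[i]))
            i++;
    return i == len;
}

/* Hypothetical helper: return 1 if every space-separated word of query
 * fuzzy-matches hay, regardless of the order of the words. */
int all_words_match(const char *query, const char *hay)
{
    while (*query != '\0') {
        while (*query == ' ')           /* skip separators */
            query++;
        size_t len = strcspn(query, " ");
        if (len > 0 && !is_subsequence(query, len, hay))
            return 0;
        query += len;
    }
    return 1;
}
```

For example, `all_words_match("bar foo", "foo_bar.c")` succeeds even though the words are reversed, while `all_words_match("foo baz", "foo_bar.c")` fails because `baz` is not a subsequence of the candidate.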

runtime/doc/version9.txt

Lines changed: 2 additions & 0 deletions
@@ -41723,6 +41723,8 @@ Functions: ~
 - Add the optional {opts} |Dict| argument to |getchar()| to control: cursor
 behaviour, return type and whether or not to simplify the returned key
 - |chdir()| allows to optionally specify a scope argument
+- |matchfuzzy()| and |matchfuzzypos()| use an improved fuzzy matching
+algorithm (same as fzy).

 Others: ~
 - the regex engines match correctly case-insensitive multi-byte characters

src/Make_ami.mak

Lines changed: 1 addition & 0 deletions
@@ -113,6 +113,7 @@ SRC += \
 findfile.c \
 float.c \
 fold.c \
+fuzzy.c \
 getchar.c \
 gc.c \
 hardcopy.c \

src/Make_cyg_ming.mak

Lines changed: 1 addition & 0 deletions
@@ -823,6 +823,7 @@ OBJ = \
 $(OUTDIR)/findfile.o \
 $(OUTDIR)/float.o \
 $(OUTDIR)/fold.o \
+$(OUTDIR)/fuzzy.o \
 $(OUTDIR)/getchar.o \
 $(OUTDIR)/gc.o \
 $(OUTDIR)/gui_xim.o \

src/Make_mvc.mak

Lines changed: 4 additions & 0 deletions
@@ -732,6 +732,7 @@ OBJ = \
 $(OUTDIR)\findfile.obj \
 $(OUTDIR)\float.obj \
 $(OUTDIR)\fold.obj \
+$(OUTDIR)\fuzzy.obj \
 $(OUTDIR)\getchar.obj \
 $(OUTDIR)\gc.obj \
 $(OUTDIR)\gui_xim.obj \
@@ -1616,6 +1617,8 @@ $(OUTDIR)/float.obj: $(OUTDIR) float.c $(INCL)

 $(OUTDIR)/fold.obj: $(OUTDIR) fold.c $(INCL)

+$(OUTDIR)/fuzzy.obj: $(OUTDIR) fuzzy.c $(INCL)
+
 $(OUTDIR)/getchar.obj: $(OUTDIR) getchar.c $(INCL)

 $(OUTDIR)/gc.obj: $(OUTDIR) gc.c $(INCL)
@@ -1961,6 +1964,7 @@ proto.h: \
 proto/filepath.pro \
 proto/findfile.pro \
 proto/float.pro \
+proto/fuzzy.pro \
 proto/getchar.pro \
 proto/gc.pro \
 proto/gui_xim.pro \

src/Make_vms.mms

Lines changed: 6 additions & 0 deletions
@@ -529,6 +529,7 @@ SRC = \
 findfile.c \
 float.c \
 fold.c \
+fuzzy.c \
 getchar.c \
 gc.c \
 gui_xim.c \
@@ -665,6 +666,7 @@ OBJ = \
 [.$(DEST)]findfile.obj \
 [.$(DEST)]float.obj \
 [.$(DEST)]fold.obj \
+[.$(DEST)]fuzzy.obj \
 [.$(DEST)]getchar.obj \
 [.$(DEST)]gc.obj \
 [.$(DEST)]gui_xim.obj \
@@ -1141,6 +1143,10 @@ lua_env :
 [.$(DEST)]fold.obj : fold.c vim.h [.$(DEST)]config.h feature.h os_unix.h \
 ascii.h keymap.h termdefs.h macros.h structs.h regexp.h gui.h beval.h \
 [.proto]gui_beval.pro option.h ex_cmds.h proto.h errors.h globals.h
+[.$(DEST)]fuzzy.obj : fuzzy.c vim.h [.$(DEST)]config.h feature.h os_unix.h \
+ascii.h keymap.h termdefs.h macros.h structs.h regexp.h \
+gui.h beval.h [.proto]gui_beval.pro option.h ex_cmds.h proto.h \
+errors.h globals.h
 [.$(DEST)]getchar.obj : getchar.c vim.h [.$(DEST)]config.h feature.h os_unix.h \
 ascii.h keymap.h termdefs.h macros.h structs.h regexp.h \
 gui.h beval.h [.proto]gui_beval.pro option.h ex_cmds.h proto.h \

src/Makefile

Lines changed: 11 additions & 0 deletions
@@ -1523,6 +1523,7 @@ BASIC_SRC = \
 findfile.c \
 float.c \
 fold.c \
+fuzzy.c \
 getchar.c \
 gc.c \
 gui_xim.c \
@@ -1701,6 +1702,7 @@ OBJ_COMMON = \
 objects/findfile.o \
 objects/float.o \
 objects/fold.o \
+objects/fuzzy.o \
 objects/getchar.o \
 objects/gc.o \
 objects/gui_xim.o \
@@ -1886,6 +1888,7 @@ PRO_AUTO = \
 findfile.pro \
 float.pro \
 fold.pro \
+fuzzy.pro \
 getchar.pro \
 gc.pro \
 gui_xim.pro \
@@ -3309,6 +3312,9 @@ objects/float.o: float.c
 objects/fold.o: fold.c
 $(CCC) -o $@ fold.c

+objects/fuzzy.o: fuzzy.c
+$(CCC) -o $@ fuzzy.c
+
 objects/getchar.o: getchar.c
 $(CCC) -o $@ getchar.c

@@ -3988,6 +3994,11 @@ objects/fold.o: fold.c vim.h protodef.h auto/config.h feature.h os_unix.h \
 proto/gui_beval.pro structs.h regexp.h gui.h libvterm/include/vterm.h \
 libvterm/include/vterm_keycodes.h alloc.h ex_cmds.h spell.h proto.h \
 globals.h errors.h
+objects/fuzzy.o: fuzzy.c vim.h protodef.h auto/config.h feature.h os_unix.h \
+auto/osdef.h ascii.h keymap.h termdefs.h macros.h option.h beval.h \
+proto/gui_beval.pro structs.h regexp.h gui.h libvterm/include/vterm.h \
+libvterm/include/vterm_keycodes.h alloc.h ex_cmds.h spell.h proto.h \
+globals.h errors.h
 objects/getchar.o: getchar.c vim.h protodef.h auto/config.h feature.h os_unix.h \
 auto/osdef.h ascii.h keymap.h termdefs.h macros.h option.h beval.h \
 proto/gui_beval.pro structs.h regexp.h gui.h libvterm/include/vterm.h \

src/README.md

Lines changed: 1 addition & 0 deletions
@@ -48,6 +48,7 @@ fileio.c | reading and writing files
 filepath.c | dealing with file names and paths
 findfile.c | search for files in 'path'
 fold.c | folding
+fuzzy.c | fuzzy matching
 getchar.c | getting characters and key mapping
 gc.c | garbage collection
 help.c | vim help related functions
