Commit

replacing submodule
jamesturk committed Sep 7, 2024
1 parent 3370375 commit 1dc5bd5
Showing 16 changed files with 23,708 additions and 1 deletion.
1 change: 1 addition & 0 deletions csvs/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Test data for jellyfish string comparison and phonetic encoding algorithms.
9 changes: 9 additions & 0 deletions csvs/damerau_levenshtein.csv
@@ -0,0 +1,9 @@
,,0
abc,,3
bc,abc,1
fuor,four,1
abcd,acb,2
cape sand recycling ,edith ann graham,17
jellyifhs,jellyfish,2
ifhs,fish,2
"Hello, world!","Hello, world!",2
8 changes: 8 additions & 0 deletions csvs/hamming.csv
@@ -0,0 +1,8 @@
,,0
,abc,3
abc,abc,0
acc,abc,1
abcd,abc,1
abc,abcd,1
testing,this is a test,13
Saturday,Sunday,7
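The rows above imply a Hamming distance that also accepts strings of unequal length, with each surplus character in the longer string counted as one difference (for example `abcd,abc,1`). A minimal sketch of a checker under that reading follows. The names `hamming_distance`, `HAMMING_ROWS`, and `failing_rows` are hypothetical helpers written for illustration; this is not the library's own implementation.

```python
import csv
import io


def hamming_distance(s1: str, s2: str) -> int:
    """Count differing positions; each surplus character in the longer string adds one."""
    shorter, longer = sorted((s1, s2), key=len)
    distance = len(longer) - len(shorter)
    # zip stops at the end of the shorter string, so only the overlap is compared here.
    distance += sum(a != b for a, b in zip(shorter, longer))
    return distance


# Rows copied from csvs/hamming.csv: first string, second string, expected distance.
HAMMING_ROWS = """\
,,0
,abc,3
abc,abc,0
acc,abc,1
abcd,abc,1
abc,abcd,1
testing,this is a test,13
Saturday,Sunday,7
"""


def failing_rows(text: str) -> list:
    """Return the rows whose expected distance disagrees with the sketch above."""
    failures = []
    for s1, s2, expected in csv.reader(io.StringIO(text)):
        if hamming_distance(s1, s2) != int(expected):
            failures.append((s1, s2, expected))
    return failures


if __name__ == "__main__":
    print(failing_rows(HAMMING_ROWS))
```

An empty list from `failing_rows` means every fixture row agrees with this interpretation of the distance.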
16 changes: 16 additions & 0 deletions csvs/jaccard.csv
@@ -0,0 +1,16 @@
abc,xyz,0.0,
abc,abc,1.0,
abc,abcd,0.0,
abcd,abce,0.0,
abcd,abcde,0.0,
french,quebec,0.0,
france,quebec,0.0,
france,france,1.0,
The quick brown fox jumps over the lazy dog,The quick brown fox jumps over the lazy cat,0.8,
The quick brown fox jumps over the lazy dog,The slow green turtle crawls under the lazy cat,0.2,
John Smith,Smith; John,0.33333,
John Smith,Smith John,1.0,
John Smith,John Jacob Smith,0.666667,
night,nacht,0.0,
night,nacht,0.2,2
night,nacht,0.33333,3
5 changes: 5 additions & 0 deletions csvs/jaro_distance.csv
@@ -0,0 +1,5 @@
dixon,dicksonx,0.767
martha,marhta,0.944
dwayne,duane,0.822
0ð00,0ð00,1
"Sint-Pietersplein 6, 9000 Gent","Test 10, 1010 Brussel",0.518
17 changes: 17 additions & 0 deletions csvs/jaro_winkler.csv
@@ -0,0 +1,17 @@
dixon,dicksonx,0.813
martha,marhta,0.961
dwayne,duane,0.84
William,Williams,0.975
,foo,0
a,a,1
abc,xyz,0
aaaa,aaaaa,0.96
orangutan-kumquat,orangutan kumquat,0.976
jaz,jal,0.822
@,@@,0.85
0,0@,0.85
a,ab,0.85
012345,0123456,0.971
012abc,012abcd,0.971
012abc,013abcd,0.879
a1bc,a1be,0.883
10 changes: 10 additions & 0 deletions csvs/jaro_winkler_longtol.csv
@@ -0,0 +1,10 @@
dixon,dicksonx,0.830
martha,marhta,0.971
dwayne,duane,0.869
William,Williams,0.980
,foo,0
a,a,1
abc,xyz,0
aaaa,aaaaa,0.96
orangutan-kumquat,orangutan kumquat,0.986
1abcdefg,1abcdefh,0.96
6 changes: 6 additions & 0 deletions csvs/levenshtein.csv
@@ -0,0 +1,6 @@
,,0
abc,,3
,abc,3
bc,abc,1
kitten,sitting,3
Saturday,Sunday,3
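The six rows above match the classic edit distance with unit-cost insertions, deletions, and substitutions. As a rough illustration of how such a fixture could be consumed, the sketch below uses a standard two-row dynamic-programming implementation. The names `levenshtein_distance`, `LEVENSHTEIN_ROWS`, and `failing_rows` are hypothetical and assumed for this example only; the library under test may compute the distance differently.

```python
import csv
import io


def levenshtein_distance(s1: str, s2: str) -> int:
    """Edit distance with unit-cost insert, delete, and substitute."""
    # previous[j] holds the distance between the prefix of s1 seen so far and s2[:j].
    previous = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, start=1):
        current = [i]
        for j, c2 in enumerate(s2, start=1):
            current.append(
                min(
                    previous[j] + 1,               # delete c1
                    current[j - 1] + 1,            # insert c2
                    previous[j - 1] + (c1 != c2),  # substitute, or match at no cost
                )
            )
        previous = current
    return previous[-1]


# Rows copied from csvs/levenshtein.csv: first string, second string, expected distance.
LEVENSHTEIN_ROWS = """\
,,0
abc,,3
,abc,3
bc,abc,1
kitten,sitting,3
Saturday,Sunday,3
"""


def failing_rows(text: str) -> list:
    """Return the rows whose expected distance disagrees with the sketch above."""
    failures = []
    for s1, s2, expected in csv.reader(io.StringIO(text)):
        if levenshtein_distance(s1, s2) != int(expected):
            failures.append((s1, s2, expected))
    return failures


if __name__ == "__main__":
    print(failing_rows(LEVENSHTEIN_ROWS))
```

Reading the rows through `csv.reader` rather than splitting on commas keeps the same loader usable for quoted fields, such as the quoted rows in `damerau_levenshtein.csv` and `jaro_distance.csv`.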
12 changes: 12 additions & 0 deletions csvs/match_rating_codex.csv
@@ -0,0 +1,12 @@
Byrne,BYRN
Boern,BRN
Smith,SMTH
Smyth,SMYTH
Catherine,CTHRN
Kathryn,KTHRYN
Kathrynoglin,KTHGLN
Ad,AD
Ed,ED
William,WLM
ä,Ä
Frédéric,FRÉÉRC
7 changes: 7 additions & 0 deletions csvs/match_rating_comparison.csv
@@ -0,0 +1,7 @@
Bryne,Boern,True
Smith,Smyth,True
Catherine,Kathryn,True
Michael,Mike,False
Tim,Timothy,None
Ed,Ad,True
Marie Helene,Maria Rio,True
33 changes: 33 additions & 0 deletions csvs/metaphone.csv
@@ -0,0 +1,33 @@
DGIB,JB
metaphone,MTFN
wHErE,WR
shell,XL
this is a difficult string,0S IS A TFKLT STRNK
aeromancy,ERMNS
Antidisestablishmentarianism,ANTTSSTBLXMNTRNSM
sunlight labs,SNLT LBS
sonlite laabz,SNLT LBS
Çáŕẗéř,KRTR
kentucky,KNTK
KENTUCKY,KNTK
NXNXNX,NKSNKSNKS
Aapti,PT
Aarti,RT
CIAB,XB
NQ,NK
sian,XN
gek,JK
Hb,HB
Bho,BH
Tiavyi,XFY
Xhot,XHT
Xnot,SNT
g,K
8 queens,KNS
Utah,UT
WH,W
walt,WLT
ANDREW,ANTR
why,W
whynot,WNT
acceptingness,AKSPTNKNS
36 changes: 36 additions & 0 deletions csvs/nysiis.csv
@@ -0,0 +1,36 @@
Worthy,WARTY
Ogata,OGAT
montgomery,MANTGANARY
Costales,CASTAL
Tu,T
martincevic,MARTANCAFAC
Catherine,CATARAN
Katherine,CATARAN
Katerina,CATARAN
Johnathan,JANATAN
Jonathan,JANATAN
John,JAN
Teresa,TARAS
Theresa,TARAS
Jessica,JASAC
Joshua,JAS
Bosch,BAS
Lapher,LAFAR
wiyh,WY
MacArthur,MCARTAR
Pheenard,FANAD
Schmittie,SNATY
Knaqze,NAGS
Knokno,NAN
Knoko,NAC
Macaw,MC
,
T,T
S,S
P,P
K,C
M,M
E,E
PFEISTER,FASTAR
SARAH,SAR
ç,Ç