-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathregex101.html
131 lines (131 loc) · 4.98 KB
/
regex101.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
<h1>tiny regex intro and receipts</h1>
<p><code>Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.</code></p>
<h2>However ...</h2>
<ul>
<li><code>.</code> (a dot) stands for any character</li>
<li><code>*</code> matches the previous item zero or more times</li>
<li><code>+</code> matches the previous item one or more times</li>
<li><code>?</code> matches the previous item zero or one time</li>
<li><code>\</code> escapes special character. <code>\.</code> to match an actual period, <code>\\</code> for an actual backslash</li>
<li><code>\d</code> any digit [1,2,3,4,5,6,7,8,9,0]</li>
<li><code>\D</code> any character that is NOT a digit</li>
<li><code>\s</code> any whitespace character, <code>\S</code> anything not a whitespace character</li>
<li><code>\w</code> any word character (letter, word, underscore), <code>\W</code> anything not a word character</li>
<li><code>[...]</code> anything in the brackets, eg [ab3] matches anything that contains a or b or 3
<ul>
<li><code>[0-9]</code> matches any digit, just like \d, but <code>[1-3]</code> matches only 1 or 2 or 3</li>
<li><code>[A-Z]</code> matches any capital letter</li>
<li><code>[a-Z]</code> matches any letter</li>
<li><code>[a-z]</code> lowercase letters only</li>
</ul>
</li>
<li><code>^</code> beginning of a string (in which to search)</li>
<li><code>$</code> end of the string</li>
<li><code>|</code> logical OR</li>
<li><code>()</code> the full string in brackets, used for grouping, we'll combine that later</li>
</ul>
<p>and so forth ... can google for more, it's powerful.</p>
<h2>some receipts</h2>
<h3>match a 0</h3>
<ul>
<li>we want to match the item_id "0": <code>^0$</code>
<ul>
<li><code>^</code> beginning of the string to match</li>
<li><code>0</code> the item to look for</li>
<li><code>$</code> end of string</li>
<li>just searching for "0" would give any id that contains a zero</li>
<li><code>^0$|^64$|^128$</code> matches the item_id "0" or "64" or "128"</li>
</ul>
</li>
</ul>
<h3>match Gen1 or Gen2, E2 or E3, Voice or Mouth</h3>
<ul>
<li>
<p>assume we want to search for Gen1 and Gen2 items in the collection column.
we can do: <code>^Gen[12]</code></p>
<ul>
<li><code>^</code> marks the beginning of the row (in our case not that neccessary)</li>
<li><code>Gen</code> for obvious reasons</li>
<li><code>[12]</code> either one or two</li>
<li>could also just match for <code>[12]</code> or <code>1|2</code></li>
</ul>
</li>
<li>
<p>we want to search for E2 and E3 items in rarity: <code>E[23]</code></p>
</li>
<li>
<p>we want to search for "Voice" or "Mouth" in the type column: <code>Voice|Mouth</code></p>
</li>
</ul>
<h3>Skin type or Skin color</h3>
<ul>
<li>we want to search for "Skin type" and "Skin color": <code>^Skin (type|color)</code>
<ul>
<li><code>^</code> beginning of the string, not so important in our case</li>
<li><code>Skin </code> the whitespace is important</li>
<li><code>(type|color)</code> either "type" or "color"</li>
<li><code>Skin type|Skin color</code> does the same</li>
<li>and because there aren't any other possibilities with Skin, <code>type|color</code> would do too</li>
</ul>
</li>
</ul>
<h3>Labels containing Green AND Orange</h3>
<ul>
<li><code>(Green.*Orange)|(Orange.*Green)</code>
<ul>
<li><code>(Green.*Orange)</code> finds entries where Green is before Orange with any amount of characters (including comma and whitespace) inbetween</li>
<li><code>|</code> logical OR</li>
<li><code>(Orange.*Green)</code> same as above, but the other way</li>
<li>of course there are more "better" ways to do it ;-)</li>
</ul>
</li>
</ul>
<h3>Red - Rabbit - Lunar New Year</h3>
<ul>
<li>we can use <code>?=</code>, a positive lookahead, to achieve the same,
<ul>
<li><code>(?=.*rabbit)(?=.*red)(?=.*lunar new year)</code> matches red rabbits in the lunar
new year</li>
</ul>
</li>
</ul>
<h3>filter for numbers from 10 to 100</h3>
<ul>
<li>i haven't figured how to have the filter do sensible number filtering, but by regex we could do:
<ul>
<li><code>[1-9][0-9]0?</code> filters for any number from 10 to 100
<ul>
<li><code>[1-9]</code> first digit is 1 to 9</li>
<li><code>[0-9]</code> the second digit, anything from 0 to 9
<ul>
<li>by now we see numbers from ten to 99</li>
</ul>
</li>
<li><code>0?</code> third digit is a zero, matched zero or one time</li>
</ul>
</li>
</ul>
</li>
</ul>
<h3>greater or equal 24</h3>
<ul>
<li><code>([2][4-9])|([3-9][0-9])|(\d{3})</code>
<ul>
<li>we divide into three logical group
<ul>
<li><code>([2][4-9])</code> 24-29</li>
<li><code>([3-9][0-9])</code> 30-99</li>
<li><code>(\d{3})</code> any number of 3 digits length</li>
</ul>
</li>
</ul>
</li>
</ul>
<h3>match different writing of the same thing, headband and head band</h3>
<ul>
<li><code>head ?band</code> or <code>(head)( ?)(band)</code> for easyer readability
<ul>
<li>whitespace followed by a <code>?</code> matches the whitespace zero or one time</li>
</ul>
</li>
</ul>