Write intent

I'd like to propose another regex, since Nik’s will match almost anything (m_m, 4>3, and 12+A all match):

^(\+|\-)[a-zA-Z]{3}\d+$

Nik’s matches anything made up of +, letters, digits, and every ASCII character between + and a, all optional and in any order (basically, the only characters that break the pattern are ASCII #32 to #42).

Mine matches tokens that start with + or -, followed by exactly 3 uppercase or lowercase letters, and end with at least one digit.
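If you want to sanity-check this before putting it in your training data, here is a minimal sketch using Python's `re` module. The sample tokens and the `matches` helper are just made up for illustration, and I'm assuming the pattern behaves the same way once it runs inside your pipeline.

```python
import re

# The pattern from this post: a sign, exactly three letters, then one or more digits.
PATTERN = re.compile(r"^(\+|\-)[a-zA-Z]{3}\d+$")


def matches(token: str) -> bool:
    """Return True if the token fits the pattern (hypothetical helper)."""
    return PATTERN.match(token) is not None


# Hypothetical sample tokens, chosen only for illustration.
for token in ["+abc123", "-XyZ7", "m_m", "4>3", "12+A", "+ab12", "+abcd1", "+abc"]:
    print(f"{token!r}: {matches(token)}")
```

Only the first two tokens should come back as matches; the rest fail because of a missing sign, the wrong number of letters, or no trailing digit.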

In my regex above, any number of digits (at least one) is accepted at the end. If you want, for example, a minimum of 1 digit and a maximum of 4, you can use the following:

^(\+|\-)[a-zA-Z]{3}\d{1,4}$
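The same kind of quick check works for the bounded version. Again, this is only a sketch with Python's `re` module; the `matches_bounded` name and the sample tokens are my own assumptions, not anything from your project.

```python
import re

# Bounded variant: a sign, exactly three letters, then between 1 and 4 digits.
BOUNDED = re.compile(r"^(\+|\-)[a-zA-Z]{3}\d{1,4}$")


def matches_bounded(token: str) -> bool:
    """Return True if the token fits the bounded pattern (hypothetical helper)."""
    return BOUNDED.match(token) is not None


# Hypothetical sample tokens, chosen only for illustration.
for token in ["+abc1", "-abc1234", "+abc12345", "+abc"]:
    print(f"{token!r}: {matches_bounded(token)}")
```

Here the five-digit token and the token with no digits should be rejected, while the ones with 1 and 4 digits should match.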

And, as Nik said, you need to keep at least 2 examples and add RegexFeaturizer and RegexEntityExtractor to your pipeline.
