Before 2021-04-20, LanguageTool only knew PRP
(like ‘I’, ‘me’, ‘myself’) and PRP$
(like ‘my’ and ‘mine’). Now, you can distinguish more precisely and synthesize one form from another (e.g. ‘I’ from ‘myself’).
This change has one immediate impact: If you want to negate the postag PRP
, you’ll now have to negate PRP(_.*)?
. If you want to negate the postag PRP\$
, you’ll now have to negate PRP\$(_.*)?
.
Example:
Before:
<token postag="PRP"><exception postag="PRP" negate_pos="yes"/></token>
Now:
<token postag="PRP"><exception postag="PRP(_.*)?" postag_regexp="yes" negate_pos="yes"/>
Also: If you want to write an exception
for PRP
(or PRP\$
), you might have to use PRP(_.*)?
(or PRP\$(_.*)?
) for the exception to work.
If you use the new postags to convert, e.g. an object pronoun to a possessive pronoun, there is a conflict with the interpretation of the $
no matter if you escape it or not.
These two ways of writing it will make the test fail:
<match no="3" postag="PRP_O(.*)" postag_replace="PRP$_P$1" postag_regexp="yes"/>
<match no="3" postag="PRP_O(.*)" postag_replace="PRP\$_P$1" postag_regexp="yes"/>
This is how you make it work:
<match no="3" postag="PRP_O(.*)" postag_replace="PRP._P$1" postag_regexp="yes"/>
→ use a .
instead of the contentious $
sign. It’s unambiguous in that position anyway.
Rule of thumb: Inexplicable build fails can go away if you replace \$
or $
with .
.
The last 2 or 3 characters in these tags refer to person [123], number [SP], and, if applicable, gender [MFN].
E.g., ‘her’ is an object pronoun (PRP_O.*
) and the last 3 characters are 3SF
, so the complete tag is PRP_O3SF
.
I I PRP_S1S
you you PRP_S2S
you you PRP_S2P
he he PRP_S3SM
she she PRP_S3SF
it it PRP_S3SN
we we PRP_S1P
they they PRP_S3P
they they PRP_S3S
me I PRP_O1S
you you PRP_O2S
him he PRP_O3SM
her she PRP_O3SF
it it PRP_O3SN
them they PRP_O3S
us we PRP_O1P
you you PRP_O2P
them they PRP_O3P
myself I PRP_R1S
yourself you PRP_R2S
himself he PRP_R3SM
herself she PRP_R3SF
itself it PRP_R3SN
themself they PRP_R3S
ourselves we PRP_R1P
yourselves you PRP_R2P
themselves they PRP_R3P
There is a new level of distinction for PRP$
as well. We distinguish “adjective-like” possessives (e.g. ‘my’) and proper “possessive pronouns” (e.g. ‘mine’).
my I PRP$_A1S
your you PRP$_A2S
his he PRP$_A3SM
her she PRP$_A3SF
its it PRP$_A3SN
their they PRP$_A3S
our we PRP$_A1P
your you PRP$_A2P
their they PRP$_A3P
mine I PRP$_P1S
yours you PRP$_P2S
his he PRP$_P3SM
hers she PRP$_P3SF
theirs they PRP$_P3S
ours we PRP$_P1P
yours you PRP$_P2P
theirs they PRP$_P3P