This page describes how rule developers can make use of the n-gram data to detect errors. As an introduction and for set-up, please see Finding errors using n-gram data first.
To add word pairs, add them to the 
resource/<languageCode>/confusion_sets.txt file of your language. If 
such a file doesn’t exist for your language, ask the 
developers to add one. Note, however, that this approach 
needs massive amounts of n-gram data, which is readily available only for 
English, German, French, Spanish, Italian, Russian, Chinese, and 
Hebrew. Here’s an example line from that file:
word1; word2; 10     # comment
In this example, word1 and word2 are two similar words that can 
easily be confused. The order of these words doesn’t matter to LT, but 
we recommend sorting them alphabetically. 10 is a factor that 
determines how strongly the word used in the text is preferred over the 
alternative. If you use 1 here, the other word will be suggested as a 
correction even when it’s only slightly more probable than the word 
from the text. In many cases, this would lead to false alarms. To avoid 
those, use a larger factor, often something from 10 to 10,000,000. To 
decide what’s a good compromise between finding many errors and not 
creating too many false alarms, use ConfusionRuleEvaluator.java.
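To make the line format and the role of the factor concrete, here is a minimal Python sketch. It is not LT's implementation: the function names are made up for illustration, and the comparison `p_other > factor * p_text` is only an assumption derived from the description above (with a factor of 1, the other word wins as soon as it is slightly more probable).

```python
from typing import NamedTuple


class ConfusionPair(NamedTuple):
    word1: str
    word2: str
    factor: int


def parse_confusion_line(line: str) -> ConfusionPair:
    """Parse a line such as 'word1; word2; 10     # comment'."""
    # Everything after '#' is treated as a comment and dropped.
    content = line.split("#", 1)[0]
    parts = [part.strip() for part in content.split(";")]
    if len(parts) != 3:
        raise ValueError(f"expected 'word1; word2; factor', got: {line!r}")
    return ConfusionPair(parts[0], parts[1], int(parts[2]))


def should_suggest_other(p_text: float, p_other: float, factor: int) -> bool:
    """Assumed decision rule (hypothetical): suggest the other word only if
    it is more than `factor` times as probable as the word in the text."""
    return p_other > factor * p_text


pair = parse_confusion_line("word1; word2; 10     # comment")

# Hypothetical probabilities: the other word is 5 times as probable.
print(should_suggest_other(0.001, 0.005, factor=1))            # True
print(should_suggest_other(0.001, 0.005, factor=pair.factor))  # False
```

With a factor of 1 the alternative is suggested, while the factor of 10 from the example line suppresses the suggestion. That is the trade-off between finding more errors and raising fewer false alarms.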
Run ConfusionRuleEvaluator.java from your IDE. It will print the 
options you need to provide: the token pair, the language code, the 
n-gram directory, and an evaluation file with example sentences that 
contain the words in their correct context. It will show output like 
this:
Factor:   10 - 5 false positives, 11 false negatives
Summary:  p=0.997, r=0.994, 2000, 3grams, 2015-09-16
Factor:   100 - 2 false positives, 22 false negatives
Summary:  p=0.999, r=0.989, 2000, 3grams, 2015-09-16
Factor:   1000 - 2 false positives, 42 false negatives
Summary:  p=0.999, r=0.979, 2000, 3grams, 2015-09-16
p= is the precision value, i.e. the probability that a match reported 
by the rule is a real error. This value needs to be close to 1, as 
the difference between 1 and the value indicates the share of false 
alarms among the rule’s matches. r= is the recall, i.e. the probability 
that an incorrect use of a word is detected as incorrect. The closer 
this is to 1, the better, but a high precision should be preferred. 
You can now choose the factor value that offers a good compromise between 
precision and recall and add it as the third value for this pair of 
words in confusion_sets.txt.
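The relation between the counts and the p= and r= values in the sample output can be reproduced with the standard definitions of precision and recall. The sketch below assumes that the 2000 in the summary line is the number of evaluated sentences containing an error. That reading is a guess, not something the output states, and the function name is hypothetical.

```python
from typing import Tuple


def precision_recall(false_positives: int, false_negatives: int,
                     error_sentences: int) -> Tuple[float, float]:
    """Compute precision and recall from the counts in the evaluator output.

    Assumption: `error_sentences` is the number of evaluated sentences
    that contain a wrongly used word.
    """
    true_positives = error_sentences - false_negatives
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall


# (factor, false positives, false negatives) taken from the sample output.
for factor, fp, fn in [(10, 5, 11), (100, 2, 22), (1000, 2, 42)]:
    p, r = precision_recall(fp, fn, error_sentences=2000)
    print(f"factor={factor}: p={p:.4f}, r={r:.4f}")
```

Under that assumption the computed values agree with the sample output to about three decimal places, and they show the typical pattern: a larger factor trades recall for precision.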
It’s also possible that even a very high factor doesn’t yield 
good precision. As a rule of thumb, the precision should be at least 
0.995 for very common words and 0.99 for other words. If that’s not 
possible with any factor, or if the recall is very low (< 0.5), this 
pair of words might simply not be a good fit for n-gram-based error 
detection.
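The rule of thumb can be written down as a small selection helper. This is only a sketch: `choose_factor` is a hypothetical name, and picking the smallest factor that meets the thresholds is one possible policy, chosen here because recall tends to drop as the factor grows.

```python
from typing import Dict, Optional, Tuple


def choose_factor(results: Dict[int, Tuple[float, float]],
                  min_precision: float,
                  min_recall: float = 0.5) -> Optional[int]:
    """Return the smallest factor whose (precision, recall) meets both
    thresholds, or None if the pair is not a good fit."""
    for factor in sorted(results):
        precision, recall = results[factor]
        if precision >= min_precision and recall >= min_recall:
            return factor
    return None


# factor -> (precision, recall), copied from the sample evaluator output.
results = {10: (0.997, 0.994), 100: (0.999, 0.989), 1000: (0.999, 0.979)}

print(choose_factor(results, min_precision=0.995))   # 10
print(choose_factor(results, min_precision=0.9995))  # None
```

Use 0.995 as `min_precision` for very common words and 0.99 for other words. A result of `None` corresponds to the case where the word pair is probably not suited for this kind of detection.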
Of course, the result of the evaluation depends on what you use as input: 
the input itself may contain errors. That’s why 
ConfusionRuleEvaluator.java will also print the cases that are 
probably false alarms. If they turn out to be real errors in the input 
rather than false alarms, you should clean up the input. For English, 
we use a combination of Wikipedia and Tatoeba as 
input. The less often the affected words appear in Wikipedia and Tatoeba, 
the less meaningful the evaluation output will become. For English, we 
try to use at least 1000 example sentences for each word.
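Before running an evaluation, it can help to check how many example sentences the input actually contains for each word. The following sketch counts them with a simple whole-word match. The function name and the sample sentences are made up for illustration, and the matching is much cruder than LT's own tokenization.

```python
import re
from typing import Dict, Iterable


def count_example_sentences(sentences: Iterable[str],
                            words: Iterable[str]) -> Dict[str, int]:
    """Count the sentences in which each word occurs as a whole word."""
    patterns = {
        word: re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
        for word in words
    }
    counts = {word: 0 for word in patterns}
    for sentence in sentences:
        for word, pattern in patterns.items():
            if pattern.search(sentence):
                counts[word] += 1
    return counts


# Tiny made-up input; a real evaluation file would be far larger.
sentences = ["Their house is old.", "There is a house.", "Is their dog there?"]
counts = count_example_sentences(sentences, ["their", "there"])

MIN_SENTENCES = 1000  # the target mentioned above for English
too_few = [word for word, n in counts.items() if n < MIN_SENTENCES]
print(counts, too_few)
```

Words that end up in `too_few` have less evidence behind their precision and recall values, so treat those values with caution.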
Some pairs cannot be detected at all. For example, confusion of your 
and you're cannot be detected, as you're is internally split into more 
than one token.