
I'm going to go out on a limb and say that LexisNexis and Thomson Reuters didn't do nearly enough (if any) taxonomical engineering of the corpus before deciding to just do some sliding-window chunking. Without that, the whole "think/plan" part of a natural-language query pipeline is all but useless.
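
To be concrete about what I mean: here's a toy sketch of the difference (every name, field, and the filtering step here is made up for illustration, not anything from their actual pipelines).

    from dataclasses import dataclass, field

    @dataclass
    class Chunk:
        text: str
        # Taxonomy metadata the retriever can filter or route on.
        # Without it, the "plan" step has nothing to plan against.
        metadata: dict = field(default_factory=dict)

    def sliding_window_chunks(text: str, size: int = 500, overlap: int = 100) -> list[Chunk]:
        """Naive approach: fixed-size windows, no structure, no taxonomy."""
        step = size - overlap
        return [Chunk(text[i:i + size]) for i in range(0, len(text), step)]

    def taxonomized_chunks(doc_text: str, doc_attrs: dict) -> list[Chunk]:
        """Same windows, but each chunk carries corpus taxonomy labels
        (jurisdiction, practice area, doc type, ...) that were worked out
        *before* indexing -- the messy part."""
        chunks = sliding_window_chunks(doc_text)
        for c in chunks:
            c.metadata.update(doc_attrs)
        return chunks

    # A query planner can now turn "Is a non-compete enforceable in CA?"
    # into a filtered retrieval instead of a flat vector search:
    chunks = taxonomized_chunks(
        "Sample opinion text ...",
        {"jurisdiction": "US-CA", "practice_area": "employment", "doc_type": "case_law"},
    )
    relevant = [c for c in chunks if c.metadata.get("jurisdiction") == "US-CA"]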

I just want to bang a marching bass drum while walking through their office and continually shout, "you still have to do the messy part!"


