The abstract of a new paper on textual mining of 10-Ks:
We present a new approach in financial content analysis to determine the strength of various words in conveying positive or negative tone. We apply our approach to quantify the tone of 10-K filings and find a significant relation between document tone and market reaction for both negative and positive words. Previous research has not been successful using positive words to quantify tone. We find that our measure of positive and negative tone is significantly related to filing period returns after controlling for factors such as earnings announcement date return and accruals, while the earlier approaches in the literature are not. In addition, we find that the appropriate choice of term weighting in content analysis is at least as important, and perhaps more important, than a complete and accurate compilation of the word list. We find that the market underreacts to the tone of 10-K’s, and this underreaction is corrected over the next two weeks.
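To make the term-weighting point concrete, here is a minimal sketch of how the same word list can give different tone scores under different weights. It is not the paper's method: the abstract does not spell out the weighting scheme, so this swaps in plain inverse-document-frequency weights as a stand-in. The word lists, the sample sentences, and the function names (`tokenize`, `idf_weights`, `tone`) are all made up for illustration.

```python
import math
from collections import Counter

# Hypothetical toy word lists, NOT the lists used in the paper.
POSITIVE = {"improve", "gain", "strong"}
NEGATIVE = {"loss", "decline", "impairment"}


def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    return [w.strip(".,;:()").lower() for w in text.split()]


def idf_weights(docs):
    """Inverse document frequency: log(N / df) for every word seen in docs."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))  # count each word once per document
    return {w: math.log(n / count) for w, count in df.items()}


def tone(doc, weights=None):
    """Net tone: (weighted positive - weighted negative) / total words.

    With weights=None every listed word counts as 1 (equal weighting).
    """
    tokens = tokenize(doc)
    if not tokens:
        return 0.0
    score = 0.0
    for w in tokens:
        wt = 1.0 if weights is None else weights.get(w, 0.0)
        if w in POSITIVE:
            score += wt
        elif w in NEGATIVE:
            score -= wt
    return score / len(tokens)


# Made-up sentences standing in for filings (assumption, not real 10-K text).
docs = [
    "Revenue gain offset a loss in one segment.",
    "A loss and an impairment charge led to a decline.",
    "Strong demand helped margins improve despite a loss.",
]

weights = idf_weights(docs)
equal_tone = tone(docs[0])
idf_tone = tone(docs[0], weights)
print(equal_tone, idf_tone)
```

In this toy corpus "loss" appears in every document, so its idf weight is zero and it stops cancelling out "gain" in the first sentence. The word list is identical in both runs; only the weights changed, which is the kind of sensitivity the abstract is pointing at.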