The Malicious Forensic Texts corpus

Download the corpus

The Malicious Forensic Texts (MFT) corpus is a collection of authentic malicious forensic texts compiled to study their register variation. The results of this analysis are published in

Nini, A. (2017). Register variation in malicious forensic texts. International Journal of Speech, Language and the Law, 24(1), 67-98
[Read] [Pre-print]

where a malicious forensic text is defined as

a text that is a piece of written evidence in a forensic case that involves threat, abuse, defamation or a combination of the above

The open-access version of the MFT corpus that can be downloaded here differs slightly from the corpus presented in the paper above.

First, texts that are not publicly available have been left out of this version of the corpus for confidentiality reasons.

Second, the analysis reported in the paper excludes texts shorter than 100 word tokens, whereas this version of the corpus includes them.
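
If you want to approximate the length criterion used in the paper, the short texts can be filtered out before analysis. The following is a minimal sketch only. It assumes the corpus has been unpacked into a folder with one plain-text file per text and that a word token is any whitespace-delimited string; neither the file layout nor the tokenisation is confirmed here, and the paper may have counted tokens differently. The function names are hypothetical.

```python
from pathlib import Path

MIN_TOKENS = 100  # threshold mentioned for the analysis in the paper


def count_tokens(text: str) -> int:
    # Assumption: a word token is any whitespace-delimited string.
    return len(text.split())


def filter_by_length(texts: dict[str, str], min_tokens: int = MIN_TOKENS) -> dict[str, str]:
    # Keep only the texts that have at least `min_tokens` word tokens.
    return {name: text for name, text in texts.items()
            if count_tokens(text) >= min_tokens}


def load_texts(folder: str) -> dict[str, str]:
    # Assumption: one .txt file per text inside `folder` (hypothetical layout).
    return {path.name: path.read_text(encoding="utf-8", errors="replace")
            for path in sorted(Path(folder).glob("*.txt"))}
```

Calling filter_by_length(load_texts("path/to/corpus")) would then return only the longer texts. Even after this step the result is not identical to the corpus used in the paper, because the texts that are not publicly available are still missing.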

You are welcome to use this corpus for your own research, but please cite the paper above and state clearly that this version differs from the one used in the paper.

If you have any issues, questions, or doubts, email me at