New pre-print: “Authorship Impersonation via LLM Prompting does not Evade Authorship Verification Methods”

I’m pleased to announce the pre-print of a new article on LLM impersonation, with Baoyi Zeng as first author. The paper shows that current state-of-the-art authorship verification methods tend not to be fooled by an LLM attempting to impersonate someone through prompting alone. Several high-profile forensic linguistic cases have involved a perpetrator manually trying to impersonate someone, such as the victim. We show that if a perpetrator tried to use an LLM to do so, these methods would not be misled. The paper is on arXiv and can be found here: https://arxiv.org/abs/2603.29454.

Assessing the suitability of forensic authorship analysis methodologies for speech data

On Monday, Dr James Tompkinson (University of York) and I presented our talk “Assessing the suitability of forensic authorship analysis methodologies for speech data” at the International Association for Forensic Phonetics and Acoustics (IAFPA) 2025 conference at Leiden University (The Hague), in which we showed some preliminary results on applying authorship analysis techniques to transcribed speech. You can find the slides of the talk here: https://zenodo.org/records/16308151.

Examining an author’s individual grammar

On Monday I delivered a talk at the Comparative Literature Goes Digital Workshop at the Digital Humanities 2025 conference. As part of this talk, I also prepared a tutorial on using our new authorship verification method, LambdaG, to produce text heatmaps for studying the idiosyncratic language of an author. This GitHub repository contains the abstract, a link to the tutorial, and the slides of my talk: https://github.com/andreanini/lambdaG-case-study-DH2025.

Appearance on the Writing Wrongs podcast

A few months ago, I had the pleasure of being a guest on the ‘Writing Wrongs’ podcast. The episode covered the events of the Ayia Napa rape case and the evidence I presented at the trial. As in all the other episodes, the hosts do an amazing job of explaining everything in detail but in a really accessible way. I highly recommend this episode, as well as the whole podcast! You can find it here: https://www.aston.ac.uk/research/forensic-linguistics/writing-wrongs

New pre-print: “Linguistic Individuality in Lexicogrammatical Alternations”

My PhD student Michael Cameron has uploaded a pre-print of his latest work, “Linguistic Individuality in Lexicogrammatical Alternations”, which uses a pre-registered experiment to show that individuals consistently select the same lexicogrammatical variants over time and do so differently from other individuals. This suggests evidence for personalised entrenchment, which is an important factor in linguistic individuality (with obvious implications for forensic linguistics). You can find the pre-print here: https://doi.org/10.31234/osf.io/uvtrb.

idiolect: a new R package for Forensic Authorship Analysis

I’m pleased to announce the release of version 1 of idiolect, my new package for carrying out Forensic Authorship Analysis using R. The website of the package, https://andreanini.github.io/idiolect, contains a Get Started page with a brief tutorial. The package offers several well-known authorship analysis methods, including our new method LambdaG (https://arxiv.org/abs/2403.08462v1), as well as functions to calibrate Likelihood Ratios so as to express the strength of the evidence within the Likelihood Ratio Framework for forensic science.

The package contains functions that cover the typical workflow of a forensic authorship analysis:

  1. Input and preprocess data;
  2. Carry out an analysis (Delta, N-gram Tracing, the Impostors Method, LambdaG);
  3. Test the performance of the methods on ground truth data;
  4. Apply the method to the questioned text and calibrate a Likelihood Ratio;
  5. Explore the data using feature importance or other visualisations depending on the method, including using concordances.
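The workflow above might look something like the following sketch. This is illustrative only: the function names and arguments (e.g. create_corpus, contentmask, lambdaG, performance, calibrate_LLR, concordance, and the POSnoise option) are my assumptions about the package’s API, so please consult the Get Started page for the actual usage.

```r
# Illustrative sketch of the idiolect workflow; function names and
# arguments are assumptions, not the package's confirmed API.
library(idiolect)

# 1. Input and preprocess: load the texts and mask topic-related content
known      <- create_corpus("path/to/known_texts")
questioned <- create_corpus("path/to/questioned_texts")
reference  <- create_corpus("path/to/reference_texts")
known      <- contentmask(known, algorithm = "POSnoise")
questioned <- contentmask(questioned, algorithm = "POSnoise")
reference  <- contentmask(reference, algorithm = "POSnoise")

# 2-3. Run an analysis (here LambdaG) and test it on ground-truth data
results <- lambdaG(questioned, known, reference)
performance(results)

# 4. Calibrate a Likelihood Ratio for the questioned text
calibrate_LLR(results, results)

# 5. Explore the data, e.g. with concordances of salient features
concordance(questioned, known, reference, search = "because")
```

The key design point, as described above, is that the same pipeline serves both validation on ground-truth data and the analysis of the actual questioned text, so the reported Likelihood Ratio is grounded in measured performance.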

Let me know if you have any feedback or questions!