Out-of-distribution detection of machine-generated text

Master's Thesis
Author
Kampouridis, Prodromos
Date
2026-02
Supervisor
Stamatatos, Efstathios
Keywords
Out-of-distribution detection; Machine-generated text detection; Knowledge distillation; Triplet loss
Abstract
Detecting machine-generated text is increasingly important as Large Language Models (LLMs) evolve rapidly. In practice, detectors often fail to generalize Out-of-Distribution (OOD), degrading under domain shifts, topic changes, unseen generators, and paraphrasing attacks. This thesis studies whether compact machine-generated text detectors can retain stronger OOD robustness through teacher-student training. The student is optimized with supervised cross-entropy, optional logit-based knowledge distillation via temperature-scaled KL divergence, and teacher-guided representation alignment using triplet loss and supervised contrastive learning. Experiments on the MAGE benchmark follow its OOD testbed protocol across unseen-domain, unseen-model, combined-shift, and paraphrasing settings. To support deployment-oriented evaluation, performance is reported both before and after decision-threshold calibration. The results show that triplet-based teacher guidance is the strongest distillation strategy among the distilled variants, and the best final model combines cross-entropy, knowledge distillation, and teacher-guided triplet alignment. Overall, the proposed distilled detector is competitive with the MAGE Longformer baseline on standard OOD settings while achieving substantially lower inference latency, yielding a lightweight and practically efficient detector for robust machine-generated text detection beyond the training distribution.
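The loss components named in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation; it assumes a standard formulation of each term: cross-entropy on hard labels, temperature-scaled KL divergence between teacher and student logits (scaled by T² as in Hinton-style distillation), and a Euclidean margin-based triplet loss on embeddings. The function names, the `alpha` mixing weight, and the `margin` value are illustrative choices, not values taken from the thesis.

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; T > 1 softens the distribution."""
    z = np.asarray(logits, dtype=float) / T
    z = z - z.max()                      # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distill_loss(student_logits, teacher_logits, label, T=2.0, alpha=0.5):
    """Hard-label cross-entropy mixed with temperature-scaled KL distillation.

    alpha weights the supervised CE term; (1 - alpha) weights the KD term.
    The T**2 factor keeps gradient magnitudes comparable across temperatures.
    """
    ce = -np.log(softmax(student_logits)[label])          # supervised CE
    p_t = softmax(teacher_logits, T)                      # soft teacher targets
    log_p_s = np.log(softmax(student_logits, T))          # student log-probs at T
    kd = (T ** 2) * np.sum(p_t * (np.log(p_t) - log_p_s))  # KL(teacher || student)
    return alpha * ce + (1 - alpha) * kd

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Pull the anchor toward the positive, push it past the negative by a margin."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)
```

In teacher-guided alignment, the anchor/positive/negative embeddings would come from the teacher's representation space, so the student learns a geometry that separates human and machine-generated text even under distribution shift.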


