Out-of-distribution detection of machine-generated text

Master's Thesis
Author
Kampouridis, Prodromos
Date
2026-02
Supervisor
Stamatatos, Efstathios
Keywords
Out-of-distribution detection; Machine-generated text detection; Knowledge distillation; Triplet loss
Abstract
Detecting machine-generated text is increasingly important as Large Language Models (LLMs) evolve rapidly. In practice, detectors often fail to generalize Out-of-Distribution (OOD), degrading under domain shifts, topic changes, unseen generators, and paraphrasing attacks. This thesis studies whether compact machine-generated text detectors can retain stronger OOD robustness through teacher-student training. The student is optimized with supervised cross-entropy, optional logit-based knowledge distillation via temperature-scaled KL divergence, and teacher-guided representation alignment using triplet loss and supervised contrastive learning. Experiments on the MAGE benchmark follow its OOD testbed protocol across unseen-domain, unseen-model, combined-shift, and paraphrasing settings. To support deployment-oriented evaluation, performance is reported both before and after decision-threshold calibration. The results show that triplet-based teacher guidance is the strongest distillation strategy among the distilled variants, and the best final model combines cross-entropy, knowledge distillation, and teacher-guided triplet alignment. Overall, the proposed distilled detector is competitive with the MAGE Longformer baseline on standard OOD settings while achieving substantially lower inference latency, yielding a lightweight and practically efficient detector for robust machine-generated text detection beyond the training distribution.
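The loss components named in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation; it assumes a standard formulation of each term: cross-entropy on hard labels, temperature-scaled KL divergence between teacher and student logits (scaled by T² as in Hinton-style distillation), and a Euclidean margin-based triplet loss on embeddings. The function names, the `alpha` mixing weight, and the `margin` value are illustrative choices, not values taken from the thesis.

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; T > 1 softens the distribution."""
    z = np.asarray(logits, dtype=float) / T
    z = z - z.max()                      # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distill_loss(student_logits, teacher_logits, label, T=2.0, alpha=0.5):
    """Hard-label cross-entropy mixed with temperature-scaled KL distillation.

    alpha weights the supervised CE term; (1 - alpha) weights the KD term.
    The T**2 factor keeps gradient magnitudes comparable across temperatures.
    """
    ce = -np.log(softmax(student_logits)[label])          # supervised CE
    p_t = softmax(teacher_logits, T)                      # soft teacher targets
    log_p_s = np.log(softmax(student_logits, T))          # student log-probs at T
    kd = (T ** 2) * np.sum(p_t * (np.log(p_t) - log_p_s))  # KL(teacher || student)
    return alpha * ce + (1 - alpha) * kd

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Pull the anchor toward the positive, push it past the negative by a margin."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)
```

In teacher-guided alignment, the anchor/positive/negative embeddings would come from the teacher's representation space, so the student learns a geometry that separates human and machine-generated text even under distribution shift.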


