Matched via arXiv identifier search
- Stars: 0
- Last push: Apr 24, 2026 (1d ago)
Risk flags
- No tagged releases
- No Docker setup
- Dependency manifest missing
Hieu Man, Van-Cuong Pham, Nghia Trung Ngo, Franck Dernoncourt, Thien Huu Nguyen
Learning robust representations of authorial style is crucial for authorship attribution and AI-generated text detection. However, existing methods often struggle with content-style entanglement, where models learn spurious correlations between authors' writing styles and topics, leading to poor generalization across domains. To address this challenge, we propose Explainable Authorship Variational Autoencoder (EAVAE), a novel framework that explicitly disentangles style from content through architectural separation-by-design. EAVAE first pretrains style encoders using supervised contrastive learning on diverse authorship data, then finetunes with a Variational Autoencoder (VAE) architecture using separate encoders for style and content representations. Disentanglement is enforced through a novel discriminator that not only distinguishes whether pairs of style/content representations belong to the same or different authors/content sources, but also generates natural language explanations for its decisions, simultaneously mitigating confounding information and enhancing interpretability. Extensive experiments demonstrate the effectiveness of EAVAE. On authorship attribution, we achieve state-of-the-art performance on various datasets, including Amazon Reviews, PAN21, and HRS. For AI-generated text detection, EAVAE excels in few-shot learning over the M4 dataset. Code and data repositories are available online at https://github.com/hieum98/avae and https://huggingface.co/collections/Hieuman/document-level-authorship-datasets.
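The abstract states that the style encoders are pretrained with supervised contrastive learning, but it does not give the exact loss. As a rough illustration only, the sketch below implements the generic supervised contrastive (SupCon) objective over a batch of style embeddings, where texts by the same author count as positives. The function names, the `temperature` value, and the toy embeddings are assumptions made for this example and are not taken from the paper or its repository.

```python
import math
from typing import List, Sequence


def l2_normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector)) or 1.0
    return [x / norm for x in vector]


def supcon_loss(
    embeddings: Sequence[Sequence[float]],
    author_ids: Sequence[int],
    temperature: float = 0.1,  # assumed value, not from the paper
) -> float:
    """Generic supervised contrastive loss, averaged over anchors with positives."""
    z = [l2_normalize(e) for e in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        # Positives: other samples written by the same author as anchor i.
        positives = [p for p in range(n) if p != i and author_ids[p] == author_ids[i]]
        if not positives:
            continue
        # Temperature-scaled cosine similarity to every other sample.
        sims = {
            a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
            for a in range(n)
            if a != i
        }
        # Numerically stable log-sum-exp for the softmax denominator.
        max_sim = max(sims.values())
        log_denom = max_sim + math.log(sum(math.exp(s - max_sim) for s in sims.values()))
        total += -sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy batch: two hypothetical authors, two texts each.
toy_embeddings = [[1.0, 0.0], [1.0, 0.05], [0.0, 1.0], [0.05, 1.0]]
clustered = supcon_loss(toy_embeddings, [0, 0, 1, 1])
scattered = supcon_loss(toy_embeddings, [0, 1, 0, 1])
```

In this toy batch the loss is small when same-author texts lie close together (`clustered`) and large when the labels cut across the clusters (`scattered`), which is the pressure that pulls an encoder toward author-specific representations.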
Audit each benchmark finding before selecting an implementation path. Evidence refs map to the disclosure section below.
| Task | Dataset | Metric | Value | Source | Evidence refs |
|---|---|---|---|---|---|
| Generalizable Authorship Attribution Era Generative AI | HRS | MRR | 4 | paper-derived | No explicit refs |
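When auditing the MRR entry above, it may help to recall how mean reciprocal rank is usually computed: for each query, take the reciprocal of the rank at which the correct author first appears, then average over queries. Under this standard definition the score lies between 0 and 1, so a reported value of 4 presumably uses a different scale and is worth checking against the paper. The sketch below is a generic implementation with hypothetical author labels, not the paper's evaluation script.

```python
from typing import Hashable, Sequence


def mean_reciprocal_rank(
    rankings: Sequence[Sequence[Hashable]],
    gold: Sequence[Hashable],
) -> float:
    """Average of 1/rank of the gold item per query (0 if it is absent)."""
    if len(rankings) != len(gold):
        raise ValueError("rankings and gold must have the same length")
    total = 0.0
    for ranked, target in zip(rankings, gold):
        for rank, candidate in enumerate(ranked, start=1):
            if candidate == target:
                total += 1.0 / rank
                break  # only the first occurrence counts
    return total / len(gold) if gold else 0.0


# Three hypothetical queries whose true author is "a",
# ranked first, second, and third respectively.
example_rankings = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
example_mrr = mean_reciprocal_rank(example_rankings, ["a", "a", "a"])
```

Here `example_mrr` is the mean of 1, 1/2, and 1/3.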
No direct maintained repository implementation was found, but paper-linked Hugging Face artifacts are available.
Hardware Notes
Expect multi-day setup/compute for meaningful reproduction based on current guidance.
Evidence graph: 3 refs, 2 links.
Utility signals: depth 95/100, grounding 78/100, status high.
Compare maintenance quality, reproducibility coverage, and evidence confidence before choosing a reproduction baseline.
There is no verified maintained implementation yet. Use this baseline plan to decide whether to prototype now or defer.
Hardware requirements
No verified implementation available
No additional verified repositories beyond the primary recommendation.
No trustworthy dataset matches right now.
No trustworthy demo spaces right now.
Tasks
Generalizable Authorship Attribution Era Generative AI
Methods
None detected
Domains
None detected