Matched via arXiv identifier search
- Stars: 0
- Last push: May 25, 2026 (4d ago)
Risk flags
- No tagged releases
- No Docker setup
- Low confidence match
David Števaňák, Marek Šuppa
Core AI workload signals detected from paper context and implementation/artifact evidence.
Keyphrase extraction for morphologically rich, low-resource languages remains understudied, largely due to the scarcity of suitable evaluation datasets. We address this gap for Slovak by constructing a dataset of 227,432 scientific abstracts with author-assigned keyphrases, scraped and systematically cleaned from the Slovak Central Register of Theses, representing a 25-fold increase over the largest prior Slovak resource and approaching the scale of established English benchmarks such as KP20K. Using this dataset, we benchmark three unsupervised baselines (YAKE, TextRank, KeyBERT with SlovakBERT embeddings) and evaluate KeyLLM, an LLM-based extraction method using GPT-3.5-turbo. Unsupervised baselines achieve at most 11.6% exact-match F1@6, with a large gap to partial matching (up to 51.5%), reflecting the difficulty of matching inflected surface forms to author-assigned keyphrases. KeyLLM narrows this exact-partial gap, producing keyphrases closer to the canonical forms assigned by authors, while manual evaluation on 100 documents (κ = 0.61) confirms that KeyLLM captures relevant concepts that automated exact matching underestimates. Our analysis identifies morphological mismatch as the dominant failure mode for statistical methods, a finding relevant to other inflected languages. The dataset (https://huggingface.co/datasets/NaiveNeuron/SlovKE) and evaluation code (https://github.com/NaiveNeuron/SlovKE) are publicly available.
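The gap between exact and partial matching described in the abstract can be illustrated with a small sketch. This is not the authors' evaluation code: the function names, the lowercase/whitespace normalization, and especially the partial-match rule (two phrases match if any pair of their tokens shares a prefix of at least four characters) are assumptions chosen for illustration, and the example keyphrases are made up.

```python
from typing import Callable, List


def normalize(phrase: str) -> str:
    """Lowercase and collapse whitespace (assumed normalization)."""
    return " ".join(phrase.lower().split())


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading characters shared by a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def exact_match(pred: str, gold: str) -> bool:
    """Phrases match only if their normalized surface forms are identical."""
    return normalize(pred) == normalize(gold)


def partial_match(pred: str, gold: str, min_prefix: int = 4) -> bool:
    """Hypothetical partial-match rule: some token pair shares a long prefix.

    This crude stand-in for stemming lets inflected variants of a word match.
    """
    return any(
        common_prefix_len(p, g) >= min_prefix
        for p in normalize(pred).split()
        for g in normalize(gold).split()
    )


def f1_at_k(
    predicted: List[str],
    gold: List[str],
    k: int = 6,
    match: Callable[[str, str], bool] = exact_match,
) -> float:
    """F1 over the top-k predictions against the gold keyphrases."""
    top_k = predicted[:k]
    if not top_k or not gold:
        return 0.0
    # Precision: share of top-k predictions that match some gold phrase.
    precision = sum(any(match(p, g) for g in gold) for p in top_k) / len(top_k)
    # Recall: share of gold phrases matched by some top-k prediction.
    recall = sum(any(match(p, g) for p in top_k) for g in gold) / len(gold)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up example: gold phrases in canonical form, predictions partly inflected.
gold = ["neurónová sieť", "strojové učenie", "klasifikácia"]
predicted = [
    "neurónových sietí", "strojové učenie", "dáta",
    "modelu", "klasifikácie", "výsledky",
]

print(f"exact   F1@6 = {f1_at_k(predicted, gold, match=exact_match):.3f}")
print(f"partial F1@6 = {f1_at_k(predicted, gold, match=partial_match):.3f}")
```

In this toy case only one prediction matches a gold keyphrase exactly, while the inflected variants of two others count only under the looser rule, which is the kind of morphological mismatch the abstract identifies.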
Audit each benchmark finding before selecting an implementation path. Evidence refs map to the disclosure section below.
| Finding | Dataset | Metric | Value | Source | Evidence refs |
|---|---|---|---|---|---|
| Unsupervised baselines (YAKE, TextRank, KeyBERT) reach at most 11.6% exact-match F1@6, versus up to 51.5% with partial matching | SlovKE | Exact-match F1@6 (%) | 11.6 | llm-grounded | No explicit refs |
| KeyLLM achieves an exact-match F1@6 of approximately 15.2 | SlovKE | Exact-match F1@6 (%) | approx. 15.2 | llm-grounded | No explicit refs |
No direct maintained repository implementation was found, but paper-linked Hugging Face artifacts are available.
Hardware Notes
Expect multi-day setup/compute for meaningful reproduction based on current guidance.
LLM evidence refs: paper.abstract, evidencePack.paperSections[id=paper_caption_3], evidencePack.paperSections[id=paper_table_1], evidencePack.paperSections[id=paper_caption_5], evidencePack.paperSections[id=paper_table_2], evidencePack.paperSections[id=paper_19], evidencePack.paperSections[id=paper_16], evidencePack.paperSections[id=paper_table_3], researcherSummary.benchmarkSnapshot[0], researcherSummary.benchmarkSnapshot[1], paper.title, summary.hasReliableImplementation
Evidence graph: 2 refs, 1 link.
Utility signals: depth 95/100, grounding 68/100, status medium.
Compare maintenance quality, reproducibility coverage, and evidence confidence before choosing a reproduction baseline.
Risk flags
There is no verified maintained implementation yet. Use this baseline plan to decide whether to prototype now or defer.
Hardware requirements
No verified implementation available
No additional verified repositories beyond the primary recommendation.
These repositories had low-confidence matching signals and are hidden by default.
No trustworthy model matches right now.
No trustworthy demo spaces right now.
Tasks
Natural language processing
Methods
Transformer
Domains
Large Language Models