WisdomInterrogatory (LuWen): An Open-Source Legal Large Language Model Technical Report
Yiquan Wu, Yuhang Liu, Yifei Liu, Ang Li, Siying Zhou, Kun Kuang, Fei Wu
Core AI workload signals detected from paper context and implementation/artifact evidence.
Large language models have demonstrated remarkable capabilities across a wide range of natural language processing tasks, yet their application in the legal domain remains challenging due to the specialized terminology, complex reasoning requirements, and rapidly evolving legal knowledge involved. In this paper, we present WisdomInterrogatory (LuWen), an open-source Chinese legal language model built upon the Baichua ...
n foundation model through three key techniques: continual pre-training on a large-scale legal corpus, supervised fine-tuning with carefully curated legal instruction data, and retrieval-augmented generation integrated with a comprehensive legal knowledge base. We evaluate LuWen on five representative legal tasks spanning both prediction and generation settings, including legal judgment prediction, judicial examination, legal text summarization, law article question answering, and judicial decision reasoning. Experimental results show that LuWen outperforms several strong baselines, demonstrating the effectiveness of our approach in adapting general-purpose language models to the legal domain.
Results & Benchmarks
Benchmark evidence drill-down
Audit each benchmark finding before selecting an implementation path. Evidence refs map to the disclosure section below.
| Task | Dataset | Metric | Value | Source | Evidence refs |
|---|---|---|---|---|---|
| Summarization | GPT-3.5 | Law Article Question Answering | 38.0 | paper-derived | No explicit refs |
| Summarization | Baichuan+SFT | Law Article Question Answering | 44.0 | paper-derived | No explicit refs |
| Summarization | LuWen | Law Article Question Answering | 84.0 | paper-derived | No explicit refs |
Large language models have demonstrated remarkable capabilities across a wide range of natural language processing tasks, yet their application in the legal domain remains challenging due to the specialized terminology, complex reasoning requirements, and rapidly evolving legal knowledge involved.
Implementation Evidence Summary
zhihaiLLM/wisdomInterrogatory is the closest maintained adjacent implementation (Matched via arXiv identifier search). It is not paper-verified; validate algorithm and evaluation setup against the paper before trusting reported metrics. Community adoption signal: 545 GitHub stars.
Reproduction Risks
- Adjacent implementations are not paper-verified
- Recommended repository is adjacent and not paper-verified.
- Adjacent implementation match confidence is low.
Hardware Notes
Expect multi-day setup/compute for meaningful reproduction based on current guidance.
Evidence disclosure
Evidence graph: 3 refs, 3 links.
Utility signals: depth 100/100, grounding 85/100, status high.
Implementation Status
There is no verified maintained implementation yet. Use this baseline plan to decide whether to prototype now or defer.
- No maintained paper-verified implementation was found; start with the closest related repositories below.
- Compare repo methods against the paper equations/algorithm before trusting metrics.
- Create a minimal baseline implementation from the paper and use adjacent repos as references.
Reproduction readiness
Hardware requirements
- Expect multi-day setup/compute for meaningful reproduction based on current guidance.
No verified implementation available
- · No maintained repository has been identified for this paper. Check adjacent implementations or HF artifacts below.
Closest related implementations
These are not paper-verified. Use them as reference points when no direct implementation is available.
- zhihaiLLM/wisdomInterrogatoryAdjacentConfidence: LowStars: 545
Matched via arXiv identifier search
Hugging Face artifacts
No trustworthy direct or curated related Hugging Face artifacts were found yet.
Continue with targeted Hugging Face searches derived from the paper title and method context:
Tip: start with models, then check datasets/spaces if you need evaluation data or demos.
Direct artifact matches are currently sparse. Use targeted Hugging Face searches to quickly locate candidate models, datasets, and demos.
Research context
Tasks
Instruction tuning, Retrieval / indexing
Methods
Transformer, Retrieval-augmented generation
Domains
Natural Language Processing, Large Language Models, Information Retrieval
Evaluation & Human Feedback Data
Open this paper in HFEPX to review benchmark signals, evaluation modes, and human-feedback protocol context.
Open in HFEPXExplore Similar Papers
Jump to Paper2Code search queries derived from this paper's research context.
Need human evaluators for your AI research? Scale annotation with expert AI Trainers.