Hi! I wanted to share LangCheck, an open source toolkit to evaluate LLM applications (GitHub, Quickstart).
It already supports English and Japanese text, with more languages coming soon. Contributions welcome!
Core functionality:
langcheck.metrics – metrics to evaluate quality & structure of LLM-generated text
langcheck.plot – interactive visualizations of text quality
langcheck.augment – text augmentations to perturb prompts, references, etc. (coming soon)
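To give a rough idea of what a "structure" metric means here, below is a minimal plain-Python sketch. It is not LangCheck's actual API: the function name `json_object_scores` and the sample outputs are made up for illustration, and it only uses the standard library. It scores each generated output 1 if it parses as a JSON object and 0 otherwise.

```python
import json


def json_object_scores(generated_outputs):
    """Hypothetical helper: 1 if an output parses as a JSON object, else 0."""
    scores = []
    for text in generated_outputs:
        try:
            parsed = json.loads(text)
        except json.JSONDecodeError:
            # Not valid JSON at all
            scores.append(0)
            continue
        # Valid JSON only counts if the top-level value is an object (dict)
        scores.append(1 if isinstance(parsed, dict) else 0)
    return scores


# Made-up LLM outputs, for illustration only
outputs = ['{"answer": 42}', "Sure! The answer is 42.", "[1, 2, 3]"]
scores = json_object_scores(outputs)
pass_rate = sum(scores) / len(scores)
print(scores, pass_rate)
```

Per-output scores like these are the kind of thing you can then aggregate, threshold in a test, or plot to spot the outputs that fail.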
Super open to feedback & curious how other people think about evaluation for LLM apps.