What tools and methods do you use for observability and monitoring of your ML (LLM) performance and responses in production?

  • kennysongB
    11 months ago

    If you’re open to using an open-source library, you can use LangCheck to monitor and visualize text quality metrics in production.

    For example, you can compute and plot the toxicity of user prompts and LLM responses from your logs. (A very simple example here.)

    (Disclaimer: I’m one of the contributors of LangCheck)
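
To make the "score your logs" idea concrete, here is a minimal, library-free sketch of the workflow: score each logged prompt and response, then aggregate and flag outliers. It does not use LangCheck's API. `placeholder_toxicity`, `LogEntry`, `score_logs`, `summarize`, the word list, and the 0.2 threshold are all hypothetical names and values made up for illustration; in practice you would swap the placeholder scorer for a real model-based metric, such as the toxicity metric LangCheck provides.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical word list used only by the stand-in scorer below.
PLACEHOLDER_FLAGGED_TERMS = {"idiot", "stupid", "hate"}


def placeholder_toxicity(text: str) -> float:
    """Stand-in scorer: fraction of words found in the flagged list (0 to 1).

    This is NOT a real toxicity model; replace it with a proper metric.
    """
    words = [w.strip(".,!?").lower() for w in text.split()]
    if not words:
        return 0.0
    return sum(w in PLACEHOLDER_FLAGGED_TERMS for w in words) / len(words)


@dataclass
class LogEntry:
    """One logged interaction (assumed log shape for this sketch)."""
    prompt: str
    response: str


def score_logs(logs, scorer=placeholder_toxicity):
    """Score the prompt and the response of every log entry."""
    return [
        {"prompt": scorer(e.prompt), "response": scorer(e.response)}
        for e in logs
    ]


def summarize(scores, threshold=0.2):
    """Aggregate scores and list indices of entries at or above the threshold."""
    return {
        "mean_prompt": mean(s["prompt"] for s in scores),
        "mean_response": mean(s["response"] for s in scores),
        "flagged": [
            i for i, s in enumerate(scores) if max(s.values()) >= threshold
        ],
    }


logs = [
    LogEntry("How do I reset my password?", "Open Settings and choose Reset."),
    LogEntry("You are a stupid idiot", "I'm sorry you feel that way."),
]
scores = score_logs(logs)
summary = summarize(scores)
print(summary)
```

The per-entry scores are what you would plot over time or push to a dashboard; the flagged indices are candidates for manual review or alerting.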