
Eclipse PanEval

<p>Eclipse PanEval provides a unified, vendor-neutral framework to evaluate AI models for capability, safety, and cybersecurity in line with EU regulations.</p><p>In-scope:<br>- A three-dimensional evaluation framework based on "Capacity – Task – Metrics"<br>- Coverage of three major model categories: language, multimodal, and speech models<br>- Support for several evaluation tasks including task solving, coding, multi-turn QA, factuality, image-text QA, depth estimation, speech perception, and more<br>- Safety &amp; robustness evaluation as a cross-cutting dimension across all model types<br>- Alignment with EU AI Act and CRA compliance requirements<br>- AI-assisted subjective evaluation to improve efficiency and objectivity<br>- Open leaderboard and evaluation platform (https://flageval.baai.ac.cn)</p><p>Out-of-scope:<br>- Model training or fine-tuning<br>- Deployment infrastructure for production AI systems<br>- Legal compliance certification (Eclipse PanEval provides evaluation tooling, not legal advice)</p>
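The three-dimensional "Capacity – Task – Metrics" framing described above can be sketched as a minimal data model. This is an illustrative assumption only: none of the names, types, or functions below come from Eclipse PanEval's actual codebase or API.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical sketch of a three-dimensional evaluation record
# (capacity x task x metric), assumed for illustration only.

@dataclass(frozen=True)
class EvalResult:
    capacity: str   # e.g. "language", "multimodal", "speech"
    task: str       # e.g. "coding", "multi-turn QA", "depth estimation"
    metric: str     # e.g. "exact_match", "accuracy"
    score: float

def exact_match(predictions: List[str], references: List[str]) -> float:
    """Fraction of predictions that exactly match their reference."""
    hits = sum(p == r for p, r in zip(predictions, references))
    return hits / len(references)

def run_eval(capacity: str, task: str, metric_name: str,
             metric_fn: Callable[[List[str], List[str]], float],
             predictions: List[str], references: List[str]) -> EvalResult:
    """Score one (capacity, task) pair with one metric."""
    return EvalResult(capacity, task, metric_name,
                      metric_fn(predictions, references))

result = run_eval("language", "factuality", "exact_match",
                  exact_match,
                  predictions=["Paris", "Berlin"],
                  references=["Paris", "Rome"])
# result.score == 0.5
```

Keeping capacity, task, and metric as independent fields is what lets a cross-cutting dimension such as safety and robustness be evaluated uniformly across all model categories.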

Basics


Repositories

This project has no activity.

The Eclipse Management Organization (EMO) oversees the lifecycle of Eclipse projects, trademark and IP management, and provides a governance framework and recommendations on open source best practices.

See the project’s PMI page at https://projects.eclipse.org/projects/technology.paneval


Releases


Reviews


IP Lab requests

Security-related information is not yet available for this project. To gather this information automatically, Self Service of GitHub resources needs to be enabled for the project.
