
AI in Compliance Moves From Hype to Results - Revealing Clear Advances in Latest-generation Models
New benchmark by EQS Group and the BCM evaluates six leading AI models across 120 real-world compliance scenarios
MUNICH, DE / ACCESS Newswire / October 20, 2025 / Artificial intelligence is rapidly entering corporate workflows - but not all models deliver equally. To assess how well AI can handle the realities of compliance, the new 'EQS Benchmark Report: AI Performance in Compliance & Ethics' tested six leading AI models with 120 real-world compliance scenarios - from risk assessments and conflict-of-interest evaluations to third-party screening. The results: near-perfect precision on structured tasks such as classification and decision-making, with accuracy rates above 95%, but steep drops when complexity or ambiguity increases. Produced in collaboration with the German association Berufsverband der Compliance Manager e.V. (BCM), the benchmark also highlights the pace of progress, with 2025 models significantly outperforming those from 2024.
"For many compliance practitioners, AI is still unfamiliar territory," said Moritz Homann, Director of Product Innovation and AI at EQS Group. "Understanding how to apply it effectively and what it can be trusted with can be difficult - especially in a field as sensitive as compliance, where accuracy, accountability, and integrity are non-negotiable."
"AI can offer compliance new levels of insight, but our responsibility is to ensure its use stays within clear ethical and legal boundaries," said Dr. Gisa Ortwein, President of BCM. "Initiatives like this benchmark help us distinguish between what AI can genuinely deliver and where human judgment remains irreplaceable. That is how we safeguard integrity while embracing innovation - ensuring AI adoption enhances, rather than undermines, our profession."
The EQS benchmark is the first to assess AI performance in the compliance domain, using tasks that reflect day-to-day responsibilities of compliance and ethics professionals. It measures model accuracy, reliability, and practical usefulness across structured, semi-structured, and open-ended tasks.
Latest models significantly outperform those released only months earlier
The benchmark results highlight how quickly model capabilities are evolving. Google's Gemini 2.5 Pro achieved the highest overall score at 86.7%, demonstrating robust performance across all task types and compliance areas. With an overall score of 86.5%, OpenAI's GPT-5 (ChatGPT's default model since August 2025) matched Gemini in most categories, underscoring how quickly model capabilities are converging at the top. GPT-5 performed particularly well on open-ended content creation, while Gemini led in complex analytical and decision-making tasks.
OpenAI's o3 followed with a performance of 83.3%, illustrating both the progress of GPT-5 over its predecessor and the fast iteration cycle shaping the field. Anthropic's Claude Opus 4.1 reached a score of 81.5%, underperforming in structured evaluations and analytical reasoning, while GPT-4o (72.9%) and Mistral Large 2 (70.1%) ranked last. This reflects the significant generational leap between models released in 2024 and those launched in 2025.
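The generational gap can be made concrete with a little arithmetic on the overall scores quoted above. The following is a minimal sketch, not part of the report: the grouping of GPT-4o and Mistral Large 2 as the 2024 cohort and the other four as the 2025 cohort is an assumption based on the release's own wording, and the variable names are illustrative.

```python
# Overall scores as quoted in the press release (percent).
overall_scores = {
    "Gemini 2.5 Pro": 86.7,
    "GPT-5": 86.5,
    "o3": 83.3,
    "Claude Opus 4.1": 81.5,
    "GPT-4o": 72.9,
    "Mistral Large 2": 70.1,
}

# Assumption: the release treats these two as the 2024 generation.
models_2024 = {"GPT-4o", "Mistral Large 2"}

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# Rank models from highest to lowest overall score.
ranking = sorted(overall_scores, key=overall_scores.get, reverse=True)

mean_2024 = mean(s for m, s in overall_scores.items() if m in models_2024)
mean_2025 = mean(s for m, s in overall_scores.items() if m not in models_2024)
gap = round(mean_2025 - mean_2024, 1)

print(ranking)
print(f"2025 mean: {mean_2025:.1f}, 2024 mean: {mean_2024:.1f}, gap: {gap} points")
```

Under that grouping, the four 2025 models average 84.5% and the two 2024 models 71.5%, a difference of 13 percentage points.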
In compliance, AI excels with clear rules, but struggles when ambiguity rises
Overall, AI models delivered their strongest results on straightforward, structured compliance tasks. For example, performance averaged 90.8% in decision-making scenarios based on a defined situation and a set of rules or policies. In exercises involving matching or mapping data sets, models reached an average score of 91.8%, with four of six models exceeding 95%.
By contrast, performance on more complex tasks varied more widely between models. For tasks involving data analysis, the spread was particularly large - a 60-point difference between the best and worst performers. In this category, Gemini 2.5 Pro achieved an 88% score, followed by GPT-5 with 62% - while GPT-4o ranked lowest with only 28%.
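The 60-point spread follows directly from the figures given. A short sketch using only the three data-analysis scores the release reports (the other three models' scores in this category are not stated, so they are left out rather than guessed):

```python
# Data-analysis scores quoted in the release (percent).
# Only three of the six models are reported for this category.
data_analysis_scores = {
    "Gemini 2.5 Pro": 88,
    "GPT-5": 62,
    "GPT-4o": 28,
}

best = max(data_analysis_scores, key=data_analysis_scores.get)
worst = min(data_analysis_scores, key=data_analysis_scores.get)
spread = data_analysis_scores[best] - data_analysis_scores[worst]

print(f"{best} vs {worst}: {spread} percentage points")
```

Because the release identifies Gemini 2.5 Pro as the best and GPT-4o as the worst performer in this category, the unreported scores cannot change the spread.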
Open-ended tasks - such as drafting executive briefings or reports on internal investigations - proved more challenging even for the most recent models. The best performer in this category, GPT-5, reached a score of 67.4%. Unlike structured tasks, these assignments were evaluated by a human jury.
"There are some high-stakes tasks compliance professionals would not fully outsource to AI - nor should they," said Moritz Homann. "The strength of AI tools lies in acting as a force multiplier, supporting compliance workflows while leaving ultimate responsibility and judgment with professionals. Even for highly complex tasks, AI can take on much of the groundwork, saving valuable time on routine preparation and allowing experts to focus where their judgment is indispensable."
High consistency and low hallucination rate
The benchmark also tested reliability by repeating multiple-choice tasks three times per model. Consistency was high, with most models returning the same result in more than 95% of cases. Hallucinations - one of the most criticized risks of AI - were rare: across all tasks and models, only three clear instances were recorded, amounting to a rate of just 0.71%. This indicates that when tasks are clearly defined and contextualized, today's models can deliver stable and fact-based results in compliance scenarios. However, since hallucinations cannot be entirely ruled out, human oversight remains essential - especially for sensitive content with regulatory implications.
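A repeat-run consistency check of this kind can be sketched as follows. The release does not define how consistency was scored, so this example assumes a task counts as consistent only when all three runs return the same answer. The function name, task IDs, and sample answers are hypothetical and not taken from the report.

```python
def consistency_rate(runs_by_task):
    """Share of tasks where every repeated run returned the same answer.

    runs_by_task maps a task ID to the list of answers from repeated runs.
    Assumption: 'consistent' means all runs agree; the report may score
    this differently.
    """
    if not runs_by_task:
        raise ValueError("no tasks supplied")
    stable = sum(1 for answers in runs_by_task.values() if len(set(answers)) == 1)
    return stable / len(runs_by_task)

# Hypothetical multiple-choice answers from three runs of one model.
sample_runs = {
    "task_01": ["B", "B", "B"],
    "task_02": ["A", "A", "C"],  # one run disagrees
    "task_03": ["D", "D", "D"],
    "task_04": ["A", "A", "A"],
}

rate = consistency_rate(sample_runs)
print(f"Consistency: {rate:.0%}")
```

In this toy sample three of four tasks are stable, giving 75%. The benchmark reports that most models exceeded 95% on the same kind of measure.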
Model selection and prompt design influence outcomes
The report also highlights the importance of prompt specificity. In tasks where AI models were asked to extract red flags from third-party screening data, results varied depending on how narrowly the question was framed - for instance, whether to include affiliated entities or rate the severity of findings. Newer models - GPT-5 and Gemini 2.5 Pro - showed a better ability to follow complex instructions and return structured outputs, offering a clear advantage for compliance teams working with nuanced policies and large datasets.
Moritz Homann: "AI is here to stay - and the way we implement and use it today will shape its role in the compliance field for years to come. Compliance and ethics teams should not only govern AI risks, but also apply the technology themselves. Only by working hands-on with AI can we gain the insight to ask the right questions, design effective guardrails, and build trust. Our goal is to support this journey with practical tools, transparency, and dialogue."
The full EQS AI Benchmark Report is available to download here: https://www.eqs.com/compliance-wpapers/ai-performance-compliance-ethics-eqs/
Methodology
The EQS AI Benchmark Report tested six large language models - OpenAI's GPT-5, GPT-4o, and o3; Google's Gemini 2.5 Pro; Anthropic's Claude Opus 4.1; and Mistral Large 2 - across 120 tasks representing ten core compliance domains. These included areas such as risk assessment, speak-up case review, training effectiveness, policy evaluation, and regulatory gap analysis.
The task set was designed with input from compliance professionals and includes both real-world and synthetic content, such as HR datasets, training results, and policy texts. Some tasks had an objectively correct answer, while others required a more subjective, human-centered approach to scoring. For this reason, open-ended outputs were assessed with the support of the Berufsverband der Compliance Manager (BCM), whose members contributed professional evaluation and feedback on the quality and usefulness of model-generated responses.
Press contact
Christina Jahn
Tel.: +49 89 444430133
E-Mail: [email protected]
About EQS Group
EQS Group is a leading international cloud provider for compliance & ethics, data privacy, sustainability management, and investor relations. More than 14,000 companies across the world use EQS Group's products to build trust by reliably and securely meeting complex regulatory requirements, minimizing risks and transparently reporting on business performance and its impact on society and the environment.
EQS Group's solutions are bundled in a cloud-based platform. This allows compliance processes for whistleblower protection and case handling, policy management, and approval processes to be managed just as professionally as business partners, third parties and risks, insider lists and reporting obligations. In addition, EQS Group provides software to fulfill human rights due diligence requirements across corporate supply chains, ensure compliance with data privacy regulations like GDPR and EU AI Act, and support efficient ESG management and compliant sustainability reporting. Listed companies also benefit from a global newswire, investor targeting and contact management, as well as IR websites and webcasts for efficient and secure investor communication.
EQS Group was founded in Munich in 2000. Today, the group employs around 600 professionals worldwide.
About the BCM
As the leading professional association exclusively for in-house compliance officers from companies, associations, and other organizations, the BCM represents the interests of its members in dealings with policymakers, business, and society. The BCM focuses on providing information, fostering networks, and strengthening the compliance profession. It offers a wide range of free services designed to keep members informed about current compliance issues and to promote and continuously develop knowledge-sharing within its network.
SOURCE: EQS Group GmbH
L.Mason--AMWN