SAN FRANCISCO, Sept. 9, 2024 /PRNewswire/ — On September 3, Gru.ai ranked first with a score of 45.2% in the latest results released for SWE-Bench Verified, an authoritative benchmark for evaluating AI models' ability to solve real-world software issues, developed through a collaboration between OpenAI and the SWE-Bench authors.
Bug Fix Gru, one of four agents provided by Gru, participated in the SWE-Bench Verified evaluation. According to the Gru team's blog, providing Bug Fix Gru with a comprehensive operating environment and a rich set of development tools laid the foundation for the high score. Enhancements to the workflow, multimodal support, and the addition of RAG (retrieval-augmented generation) capabilities further boosted the result. Notably, the Gru team emphasized that it has an evaluation process in place to assess the impact of any change.
Gru.ai, a company that builds AI developers, provides four types of software engineering agents:
- Assistant Gru: Helps users solve standalone technical issues; now in public use
- Test Gru: Generates unit test code automatically
- Bug Fix Gru: Fixes bugs based on user issues automatically
- Babel Gru: Assists in building end-to-end projects
Gru.ai previously secured $5.5 million in angel investment. Alongside Gru, several other firms in the sector, including Devin, Factory, Cosine.sh, and Codium.ai, have also announced funding rounds. As large-scale model capabilities mature, the coding agent field is experiencing a surge of investment and innovation, indicating a bright future for this evolving industry.
Source: Gru.ai Ranks First in OpenAI's Latest SWE-Bench Verified Evaluation
This content was prepared by our news partner, Cision PR Newswire. The opinions and the content published on this page are the author’s own and do not necessarily reflect the views of Siam News Network