NVIDIA’s NV-Embed Model Achieves Top Spot on MTEB Leaderboard

catskill.news 11 June 2024



NVIDIA’s latest embedding model, NV-Embed, has set a new record for embedding accuracy with a score of 69.32 on the Massive Text Embedding Benchmark (MTEB), which encompasses 56 diverse embedding tasks, according to the NVIDIA Technical Blog.

Understanding the Metrics for Embedding Models

Embedding models are evaluated using several metrics, with Normalized Discounted Cumulative Gain (NDCG) and Recall being the most significant. NDCG is a rank-aware metric that measures the relevance and order of retrieved information, while Recall is a rank-agnostic metric that measures the percentage of relevant results retrieved. Most benchmarks report NDCG@10, but for enterprise-grade retrieval-augmented generation (RAG) pipelines, Recall@5 is often recommended.
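To make the difference between the two metrics concrete, here is a minimal sketch in plain Python. It assumes the common linear-gain form of NDCG (some implementations use an exponential gain instead), and the function names, document IDs, and relevance labels are all made up for illustration; this is not the scoring code used by MTEB.

```python
import math

def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of all relevant documents that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    top_k = set(ranked_ids[:k])
    return len(top_k & set(relevant_ids)) / len(set(relevant_ids))

def dcg_at_k(gains, k):
    """Discounted cumulative gain: each gain is divided by log2(rank + 1)."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))

def ndcg_at_k(ranked_ids, relevance, k):
    """DCG of the ranking divided by the DCG of an ideal ranking.

    `relevance` maps document id -> graded relevance (0 if absent).
    Linear gain is an assumption of this sketch.
    """
    gains = [relevance.get(doc_id, 0) for doc_id in ranked_ids]
    ideal_gains = sorted(relevance.values(), reverse=True)
    ideal = dcg_at_k(ideal_gains, k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0

# Hypothetical query with three relevant documents.
relevance = {"d1": 1, "d2": 1, "d3": 1}
ranking_a = ["d1", "d2", "x", "y", "d3"]   # relevant documents mostly at the top
ranking_b = ["x", "y", "d1", "d2", "d3"]   # same documents, pushed down

for name, ranking in [("A", ranking_a), ("B", ranking_b)]:
    print(name,
          "Recall@5 =", recall_at_k(ranking, relevance.keys(), 5),
          "NDCG@5 =", round(ndcg_at_k(ranking, relevance, 5), 3))
```

Both rankings retrieve all three relevant documents within the top five, so Recall@5 cannot tell them apart, while NDCG@5 rewards ranking A for placing the relevant documents earlier.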

What are MTEB and BEIR?

The MTEB benchmark covers 56 tasks, including retrieval, classification, re-ranking, clustering, and summarization. BEIR focuses on retrieval tasks and adds complexity with different question types and domains, such as fact-checking and biomedical questions. MTEB is largely a superset of BEIR, making it a comprehensive benchmark for evaluating embedding models.

NV-Embed’s performance on MTEB has been exceptional, achieving an overall score of 69.32 across the benchmark’s 56 tasks, the highest among all models tested. This performance is attributed to several key improvements in the model’s architecture and training process.

Key Improvements in NV-Embed

Latent Attention Layer: This new layer simplifies the process of combining the mathematical representations (embeddings) of a series of words, improving the model’s efficiency and accuracy.
Two-Stage Learning Process: The first stage uses in-batch negative and hard negative pairs for contrastive learning, while the second stage blends data from non-retrieval tasks for further training, enhancing the model’s robustness.
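The first training stage can be illustrated with a generic contrastive objective. The sketch below assumes an InfoNCE-style loss over cosine similarities with a hypothetical temperature of 0.05 and one hard negative per query; the article does not state which loss or hyperparameters NV-Embed uses, so treat every name and value here as an assumption rather than NVIDIA's implementation.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    """Scale a non-zero vector to unit length so dot products are cosines."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]

def contrastive_loss(queries, positives, hard_negatives, temperature=0.05):
    """Mean InfoNCE-style loss over a batch (assumed form, for illustration).

    For query i the candidates are:
      - positives[i]                  -> the correct match
      - positives[j] for j != i       -> in-batch negatives
      - hard_negatives[i]             -> one mined hard negative
    """
    q = [normalize(x) for x in queries]
    p = [normalize(x) for x in positives]
    h = [normalize(x) for x in hard_negatives]
    total = 0.0
    for i, q_i in enumerate(q):
        logits = [dot(q_i, p_j) / temperature for p_j in p]
        logits.append(dot(q_i, h[i]) / temperature)
        # Numerically stable log-sum-exp over all candidates.
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
        # Cross-entropy with the matching positive (index i) as the target.
        total += log_sum_exp - logits[i]
    return total / len(q)

# Toy 2-dimensional embeddings (made up for illustration).
queries = [[1.0, 0.0], [0.0, 1.0]]
hard_negatives = [[-1.0, 0.0], [0.0, -1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]      # each positive matches its query
misaligned = [[0.0, 1.0], [1.0, 0.0]]   # positives swapped between queries

print("aligned loss:   ", contrastive_loss(queries, aligned, hard_negatives))
print("misaligned loss:", contrastive_loss(queries, misaligned, hard_negatives))
```

The loss is small when each query is closest to its own positive and large when another item in the batch scores higher, which is the signal that pulls matching pairs together and pushes both in-batch and hard negatives apart.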

These advancements make NV-Embed a powerful tool for enterprise retrieval workloads, although its effectiveness depends on the nature and domain of the data being used.

Prototyping with NV-Embed

NV-Embed is available through NVIDIA’s API catalog, allowing organizations to integrate this high-performing model into their data processing pipelines. Additionally, the NVIDIA NeMo Retriever collection of microservices enables seamless connection of custom models to diverse business data, delivering highly accurate responses.

For further details, visit the NVIDIA Technical Blog.

