Meta Partners with Together AI to Launch High-Performance Llama 3.1 Models

catskill.news 24 July 2024

125 2 minutes read

Meta Partners with Together AI to Launch High-Performance Llama 3.1 Models

Meta has partnered with Together AI to unveil the Llama 3.1 models, marking a significant milestone in the open-source AI landscape. The release includes the Llama 3.1 405B, 70B, 8B, and LlamaGuard models, all of which are now available for inference and fine-tuning through Together AI’s platform. This collaboration aims to deliver accelerated performance while maintaining full accuracy, according to Together AI.

Unmatched Performance and Scalability

The Together Inference Platform promises horizontal scalability with industry-leading performance metrics. The Llama 3.1 405B model can process up to 80 tokens per second, while the 8B model can handle up to 400 tokens per second. This represents a speed improvement of 1.9x to 4.5x over vLLM, all while maintaining full accuracy.

These advancements are built on Together AI’s proprietary inference optimization research, incorporating technologies like FlashAttention-3 kernels and custom-built speculators based on RedPajama. The platform supports both serverless and dedicated endpoints, offering flexibility for developers and enterprises to build generative AI applications at production scale.

Broad Adoption and Use Cases

Over 100,000 developers and companies, including Zomato, DuckDuckGo, and the Washington Post, are already leveraging the Together Platform for their generative AI needs. The Llama 3.1 models offer unmatched flexibility and control, making them suitable for a range of applications from general knowledge tasks to multilingual translation and tool use.

The Llama 3.1 405B model, in particular, stands out as the largest openly available foundation model, rivaling the best closed-source alternatives. It includes advanced features like synthetic data generation and model distillation, which are expected to accelerate the adoption of open-source AI.

Advanced Features and Tools

The Together Inference Engine also includes LlamaGuard, a moderation model that can be used as a standalone classifier or as a filter to safeguard responses. This feature allows developers to screen for potentially unsafe content, enhancing the safety and reliability of AI applications.

The Llama 3.1 models also expand context length to 128K and add support for eight languages. These enhancements, along with new security and safety tools, make the models highly versatile and suitable for a wide range of applications.

Available Through API and Dedicated Endpoints

All Llama 3.1 models are accessible via the Together API, and the 405B model is available for QLoRA fine-tuning, allowing enterprises to tailor the models to their specific needs. The Together Turbo endpoints offer best-in-class throughput and accuracy, making them the most cost-effective solution for building with Llama 3.1 at scale.

Future Prospects

The partnership between Meta and Together AI aims to democratize access to high-performance AI models, fostering innovation and collaboration within the AI community. The open-source nature of the Llama 3.1 models aligns with Together AI’s vision of open research and trust between researchers, developers, and enterprises.

As the launch partner for the Llama 3.1 models, Together AI is committed to providing the best performance, accuracy, and cost-efficiency for generative AI workloads, ensuring that developers and enterprises can keep their data and models secure.

Image source: Shutterstock

Source link

catskill.news 24 July 2024

125 2 minutes read

Meta Partners with Together AI to Launch High-Performance Llama 3.1 Models

Unmatched Performance and Scalability

Broad Adoption and Use Cases

Advanced Features and Tools

Available Through API and Dedicated Endpoints

Future Prospects

catskill.news

Sean Diddy Combs’s Indictment: A Federal Prosecutor’s View

Press Release: Burlington Sock Puppets announced as new team name – MLB.com

Decentralized ID is the next ‘killer’ Web3 use case: Cardano sustainability lead

Autoport selects Moore County for NC’s first luxury driving resort – Greater Fayetteville Business Journal

“Observing the Credit Landscape: Unveiling the Five-Month Shield”

Russia’s war in Ukraine: Live updates – CNN

IN CANNES WITH THE ASTON MARTIN DB12

TIFFANY & CO. HARDWEAR EYEWEAR

Ikea Billy Bookcase Hack: The Saga of the “Built-In Bookshelves”

Unmatched Performance and Scalability

Broad Adoption and Use Cases

Advanced Features and Tools

Available Through API and Dedicated Endpoints

Future Prospects

catskill.news

NCSHP Participates in 11th Annual America's Best-Looking Cruiser Contest - NC DPS (.gov)

Google parent Alphabet profit surges 29% in Q2 amid AI splurge

Related Articles

NVIDIA Introduces Generative AI Models and NIM Microservices for OpenUSD

NVIDIA’s AI Masters Triumph in KDD Cup 2024 Data Science Competition

Sui Community Fights Scams with Sui Guardians Initiative

Mt. Gox Bitcoin Distribution Underway After a Decade-Long Legal Battle

Sean Diddy Combs’s Indictment: A Federal Prosecutor’s View

Press Release: Burlington Sock Puppets announced as new team name – MLB.com

Decentralized ID is the next ‘killer’ Web3 use case: Cardano sustainability lead

Autoport selects Moore County for NC’s first luxury driving resort – Greater Fayetteville Business Journal

“Observing the Credit Landscape: Unveiling the Five-Month Shield”

Russia’s war in Ukraine: Live updates – CNN

IN CANNES WITH THE ASTON MARTIN DB12

TIFFANY & CO. HARDWEAR EYEWEAR

Ikea Billy Bookcase Hack: The Saga of the “Built-In Bookshelves”