The success of Llama-3 has been remarkable, showing that open-source models are closing the gap with their closed-source counterparts, according to together.ai. By fine-tuning smaller open-source (OSS) models like Llama-3 on proprietary data, customers have achieved higher accuracy than top-tier closed-source models.
Fine-Tuning Process
Together AI’s platform allows users to fine-tune Llama-3-8B on proprietary data, creating custom models that outperform larger OSS alternatives like Llama-3-70B and are comparable to leading closed-source models like GPT-4, all at a fraction of the cost. A detailed guide demonstrates how a fine-tuned Llama-3 8B model improved from 47% accuracy to 65%, surpassing Llama-3-70B’s 64% and nearing GPT-4’s 71% accuracy.
The fine-tuning process involves several steps, including dataset transformation, uploading and verifying datasets, starting a fine-tuning job, and running evaluations to compare the results. The initial step requires downloading the Math Instruct dataset from HuggingFace, cleaning it up, and transforming it into a JSONL file format suitable for Together’s platform.
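As a minimal sketch, the download step might look like the following, assuming the dataset is the TIGER-Lab/MathInstruct release on Hugging Face and that the datasets library is used; the guide's exact source and cleanup steps may differ.

```python
# Sketch: download the Math Instruct dataset and save it locally as JSON Lines.
# The dataset ID "TIGER-Lab/MathInstruct" is an assumption; the guide may
# apply additional cleanup before writing the file.
from datasets import load_dataset

dataset = load_dataset("TIGER-Lab/MathInstruct", split="train")
dataset.to_json("math_instruct_raw.jsonl")  # to_json writes JSON Lines by default
```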
Dataset Transformation
The transformation process involves loading the original JSON data, defining the Llama-3 prompt format, and converting the data into the correct format. This formatted dataset is then validated using Together’s SDK before being uploaded for fine-tuning.
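A hedged sketch of that transformation, assuming the raw records carry "instruction" and "output" fields and that the Together SDK's check_file utility performs the validation:

```python
# Sketch: convert raw instruction/output pairs into the Llama-3 chat prompt
# format and write a JSONL file for fine-tuning. Field names and the exact
# template are assumptions based on the guide's description.
import json

llama_format = (
    "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
    "{instruction}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
    "{output}<|eot_id|>"
)

with open("math_instruct_raw.jsonl") as f_in, open("math_instruct.jsonl", "w") as f_out:
    for line in f_in:
        example = json.loads(line)
        text = llama_format.format(
            instruction=example["instruction"], output=example["output"]
        )
        f_out.write(json.dumps({"text": text}) + "\n")

# Validate the formatted file with Together's SDK before uploading.
from together.utils import check_file

report = check_file("math_instruct.jsonl")
assert report["is_check_passed"]
```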
Uploading and Fine-Tuning
Once the dataset is prepared, it is uploaded to Together AI via the Python SDK. The fine-tuning job is then created using the Llama-3-8B base model, specifying the dataset, number of epochs, and other parameters. Users can monitor the fine-tuning job through Together AI’s dashboard.
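A minimal sketch of the upload and job-creation calls with the Together Python SDK follows; the model identifier and hyperparameters are illustrative rather than the guide's exact values.

```python
# Sketch: upload the training file and launch a fine-tuning job.
# Assumes TOGETHER_API_KEY is set in the environment; n_epochs=3 is
# an illustrative choice, not necessarily the guide's setting.
from together import Together

client = Together()

train_file = client.files.upload(file="math_instruct.jsonl")

job = client.fine_tuning.create(
    training_file=train_file.id,
    model="meta-llama/Meta-Llama-3-8B",
    n_epochs=3,
)
print(job.id)  # use this ID to monitor the job in the Together AI dashboard
```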
Evaluation and Results
After fine-tuning, the model’s performance is evaluated using 1000 math problems. The fine-tuned Llama-3-8B model’s accuracy is compared to the base Llama-3-8B, Llama-3-70B, and GPT-4. The fine-tuned model achieved a 65.2% accuracy, outperforming the base model’s 47.2% and Llama-3-70B’s 64.2%, and coming close to GPT-4’s 71.4% accuracy.
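As an illustrative sketch (not the guide's actual harness), an evaluation loop might query each model over the test problems and check the known answer against the response; the answer-matching logic and the structure of the problems list are assumptions.

```python
# Sketch: score a model on held-out math problems by checking whether the
# known answer appears in the model's response. Real evaluations typically
# extract the final answer more carefully than this substring check.
from together import Together

client = Together()

def accuracy(model: str, problems: list[dict]) -> float:
    correct = 0
    for p in problems:  # each dict holds a "question" and a known "answer"
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": p["question"]}],
        )
        if p["answer"] in resp.choices[0].message.content:
            correct += 1
    return correct / len(problems)
```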
The results indicate that the fine-tuned Llama-3-8B model outperformed the base model by nearly 20 percentage points, surpassed the top OSS model Llama-3-70B, and achieved over 90% of GPT-4’s accuracy. The fine-tuned model is also faster and 50 times cheaper to run than GPT-4, and users retain full ownership of the model and its weights.
Conclusion
This fine-tuning approach demonstrates that small open-source models like Llama-3-8B can be customized to perform specific tasks with high accuracy, speed, and cost-efficiency. Users can leverage their proprietary data to fine-tune a model and either host it on Together AI or run it independently, maintaining full control and ownership.
The Llama-3-8B model trained on math problems outperformed leading OSS models and approached GPT-4’s performance, with a total fine-tuning cost of less than $100 on Together AI.