Video Highlights: Fine Tune GPT-J 6B in Under 3 Hours on IPUs


Did you know you can run GPT-J 6B on Graphcore IPUs in the cloud? Following the now-infamous leaked Google memo, there has been a storm of interest in the AI world around smaller, open source language models such as GPT-J, which are cheaper and faster to fine-tune and run, and which perform just as well as larger models on many language tasks.

Graphcore offers two ready-made notebooks for the pre-trained GPT-J model, available to try today on IPUs in the Paperspace cloud, one for fine-tuning and one for inference:

Text entailment on IPU using GPT-J – Fine-tuning

Text generation on IPU using GPT-J – Inference

Working through the GPT-J fine-tuning notebook in its entirety takes around 2 hours 45 minutes on a 16-IPU platform. This will vary, of course, if you bring your own data for fine-tuning, which is worth keeping in mind when working out costs on Paperspace.
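To make the cost arithmetic concrete, here is a minimal sketch that converts a run time into an estimated bill. The hourly rate below is a made-up placeholder, not a Paperspace price, and the function name is purely illustrative; substitute the current rate for the machine type you choose.

```python
def estimate_cost(hours: float, minutes: float, rate_per_hour: float) -> float:
    """Estimate the cost of a run billed pro rata by the hour."""
    total_hours = hours + minutes / 60.0
    return total_hours * rate_per_hour


# ASSUMPTION: placeholder rate for illustration only, not a real price.
HYPOTHETICAL_RATE_PER_HOUR = 10.0

# Run time quoted in the article: about 2 hours 45 minutes on 16 IPUs.
cost = estimate_cost(2, 45, HYPOTHETICAL_RATE_PER_HOUR)
print(f"Estimated cost: {cost:.2f}")
```

Since run time changes with dataset size, re-running the estimate with your own measured time is likely to be more reliable than scaling the quoted figure.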

If you want to use GPT-J in production, contact Graphcore to ask about special pricing, or to try it much faster on a 64-IPU system, which is coming soon to Paperspace.

What is GPT-J? A powerful, efficient alternative to large language models (LLMs) such as GPT-3 and GPT-4 for many NLP tasks. Fine-tuning GPT-J lets you tailor the model for specific applications, using a task-relevant dataset.

In the video presentation below, Graphcore engineer Sofia Liguori walks through the process of fine-tuning GPT-J 6B in a Paperspace Gradient notebook (a Google Colab alternative) powered by Graphcore IPUs. Run the Paperspace Gradient Notebook – Fine tuning: text entailment on IPU using GPT-J. You can also use GPT-J for text generation (inference) on Paperspace. Run the Paperspace Gradient Notebook – Inference: text generation on IPU using GPT-J. Read more about the benefits of GPT-J in the company’s blog post, Fine-tune GPT-J: effective GPT-4 alternative for many NLP tasks.

