Fine-tune Llama 2 with a GPT-Generated Dataset
Author: Jimmy Rousseau | Published: 8/25/2023
Machine Learning

Fine-tune Llama 2 with QLoRA using PEFT and SFT

This article shows you how to use ChatGPT to generate a synthetic dataset to train Llama 2 on. Check out the Colab notebook here, from Matt Shumer.

Data Generation

First, write a prompt describing the model you want to train. Make it as descriptive as possible!

Then, choose the temperature (between 0 and 1) to use when generating data. Lower values are great for precise tasks, like writing code, whereas larger values are better for creative tasks, like writing stories.

Finally, choose how many examples you want to generate. The more you generate, a) the longer it takes and b) the more expensive data generation will be. But generally, more examples will lead to a higher-quality model. 100 is usually the minimum to start.
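As a rough sketch of that configuration cell (variable names follow the notebook's conventions; the prompt below is only a placeholder):

```python
# Placeholder task description; substitute your own.
prompt = "A model that takes in a reasoning question and responds with a well-reasoned, step-by-step answer."

temperature = 0.4          # lower = precise tasks (code), higher = creative tasks (stories)
number_of_examples = 100   # the usual minimum; more examples cost more but train a better model
```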

Next, run the generation loop to build the dataset.
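Here is a condensed sketch of that loop, assuming the pre-1.0 `openai` package that was current when this article was published (the meta-prompt wording is paraphrased, not the notebook's exact text):

```python
import openai

openai.api_key = "YOUR_API_KEY"

def generate_example(prompt, prev_examples, temperature=0.5):
    """Ask GPT-4 for one new prompt/response training pair."""
    messages = [{
        "role": "system",
        "content": (
            "You are generating data to train a machine learning model. "
            f"Here is the model description: `{prompt}`\n"
            "Produce exactly one new prompt/response pair, formatted as:\n"
            "prompt\n-----------\n$prompt_goes_here\n-----------\n"
            "response\n-----------\n$response_goes_here\n-----------"
        ),
    }]
    # Show recent examples so the model avoids generating duplicates.
    for example in prev_examples[-8:]:
        messages.append({"role": "assistant", "content": example})

    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=messages,
        temperature=temperature,
        max_tokens=1000,
    )
    return response.choices[0].message["content"]

prev_examples = []
for i in range(number_of_examples):
    print(f"Generating example {i + 1}/{number_of_examples}")
    prev_examples.append(generate_example(prompt, prev_examples, temperature))
```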

We also need to generate a system message for the fine-tuned model.
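A sketch under the same assumptions: we hand GPT-4 the high-level model description and ask for a single system prompt back:

```python
def generate_system_message(prompt):
    """Turn the model description into a system prompt for the fine-tuned model."""
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[
            {
                "role": "system",
                "content": (
                    "You will be given a high-level description of a model we want to train. "
                    "Respond with a single system prompt for that model, and nothing else."
                ),
            },
            {"role": "user", "content": prompt.strip()},
        ],
        temperature=temperature,
        max_tokens=500,
    )
    return response.choices[0].message["content"]

system_message = generate_system_message(prompt)
print(f"The system message is: `{system_message}`")
```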

Now let's put our examples into a dataframe and turn them into a final pair of datasets.
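The generated examples come back as delimited text, so they need to be parsed first. A sketch, assuming each example follows the `-----------` delimiter format requested above:

```python
import pandas as pd

prompts, responses = [], []
for example in prev_examples:
    try:
        parts = example.split("-----------")
        prompts.append(parts[1].strip())    # text between the first pair of delimiters
        responses.append(parts[3].strip())  # text between the second pair
    except IndexError:
        continue  # skip any malformed generations

df = pd.DataFrame({"prompt": prompts, "response": responses}).drop_duplicates()
print(f"There are {len(df)} successfully generated examples.")
```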

Split into train and test sets.
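For example, a simple 90/10 split, written out as JSONL so the training step can load it with `datasets` (the filenames are assumptions carried forward below):

```python
# Hold out 10% of the examples for evaluation.
train_df = df.sample(frac=0.9, random_state=42)
test_df = df.drop(train_df.index)

train_df.to_json("train.jsonl", orient="records", lines=True)
test_df.to_json("test.jsonl", orient="records", lines=True)
```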

Install Necessary Libraries
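Something along these lines; the notebook pins specific versions, but what matters is having the QLoRA stack (`transformers`, `peft`, `trl`, `bitsandbytes`, `accelerate`) installed together:

```python
!pip install -q accelerate peft bitsandbytes transformers trl datasets
```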

Define Hyperparameters
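Representative values only; the LoRA rank, batch size, and learning rate below are common QLoRA starting points rather than tuned settings, and `NousResearch/llama-2-7b-chat-hf` is an ungated mirror of the Llama 2 7B chat weights:

```python
model_name = "NousResearch/llama-2-7b-chat-hf"  # base model to fine-tune
new_model = "llama-2-7b-custom"                 # where to save the trained adapter

# LoRA settings
lora_r = 64
lora_alpha = 16
lora_dropout = 0.1

# Training settings
num_train_epochs = 1
per_device_train_batch_size = 4
gradient_accumulation_steps = 1
learning_rate = 2e-4
max_seq_length = 512
```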

Load Datasets and Train
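A sketch of the training cell, assuming the `trl` SFTTrainer API as it existed in mid-2023 (newer releases have renamed some of these arguments) plus the variables defined above:

```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from trl import SFTTrainer

# Load the JSONL files written earlier.
train_dataset = load_dataset("json", data_files="train.jsonl", split="train")
eval_dataset = load_dataset("json", data_files="test.jsonl", split="train")

def to_llama_chat(row):
    """Wrap each example in the Llama 2 chat template, using our generated system message."""
    text = (
        f"<s>[INST] <<SYS>>\n{system_message}\n<</SYS>>\n\n"
        f"{row['prompt']} [/INST] {row['response']} </s>"
    )
    return {"text": text}

train_dataset = train_dataset.map(to_llama_chat)
eval_dataset = eval_dataset.map(to_llama_chat)

# QLoRA: load the base model in 4-bit NF4 and train LoRA adapters on top.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_name, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

peft_config = LoraConfig(
    r=lora_r, lora_alpha=lora_alpha, lora_dropout=lora_dropout, task_type="CAUSAL_LM"
)

training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=num_train_epochs,
    per_device_train_batch_size=per_device_train_batch_size,
    gradient_accumulation_steps=gradient_accumulation_steps,
    learning_rate=learning_rate,
    fp16=True,
    logging_steps=25,
)

trainer = SFTTrainer(
    model=model,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    peft_config=peft_config,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    tokenizer=tokenizer,
    args=training_args,
)
trainer.train()
trainer.model.save_pretrained(new_model)
```

Once training finishes, the adapter saved in `llama-2-7b-custom` can be merged back into the base model or loaded with `peft` for inference.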