Fine-tune Mistral 7B easily with AutoTrain
Make sure we have the required dependencies:
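In a Colab cell this could look like the following sketch (the exact package list is an assumption):

```python
# Install pandas and the AutoTrain Advanced package (assumed dependency set)
!pip install -q pandas autotrain-advanced
```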
Set up AutoTrain
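A minimal sketch, assuming the autotrain CLI's setup sub-command is used to finish configuring the environment:

```python
# Let AutoTrain configure its environment (sub-command assumed from the CLI)
!autotrain setup
```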
Download dataset
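The notebook doesn't pin down which dataset is used, so here is a hypothetical example of grabbing a small instruction dataset from the Hugging Face Hub; the dataset name and column layout are assumptions:

```python
# Hypothetical example: download an instruction dataset from the Hub
# and save it locally so AutoTrain can pick it up as train.csv
from datasets import load_dataset

dataset = load_dataset("timdettmers/openassistant-guanaco", split="train")
dataset.to_csv("train.csv", index=False)
```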
Import dataset
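To sanity-check what was just written, a quick look with pandas (assuming the single "text" column that AutoTrain's LLM trainer expects by default):

```python
import pandas as pd

# Load the CSV and confirm it contains the "text" column used for training
df = pd.read_csv("train.csv")
print(df.shape)
df.head()
```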
Overview of AutoTrain command
A short overview of what the command flags do:
- !autotrain: In environments like a Jupyter notebook, the ! prefix runs a shell command directly; autotrain is AutoTrain's command-line utility.
- llm: A sub-command specifying the type of task, in this case large language model fine-tuning.
- --train: Initiates the training process.
- --project_name: Sets the name of the project.
- --model abhishek/llama-2-7b-hf-small-shards: Specifies the base model to fine-tune, hosted on Hugging Face as "llama-2-7b-hf-small-shards" under the "abhishek" namespace.
- --data_path .: The path to the dataset for training. The "." refers to the current directory. The train.csv file needs to be located in this directory.
- --use_int4: Enables INT4 quantization, which reduces the model's memory footprint and speeds things up at the cost of some precision.
- --learning_rate 2e-4: Sets the learning rate for training to 0.0002.
- --train_batch_size 12: Sets the batch size for training to 12.
- --num_train_epochs 3: The training process will iterate over the dataset 3 times.
Steps needed before running
Go to the !autotrain code cell below and update it as follows:
- After --project_name, replace enter-a-project-name with the name you'd like to call the project.
- After --repo_id, replace username/repository: replace username with your Hugging Face username and repository with the name of the repository you'd like the model pushed to. You don't need to create this repository beforehand; it will be created automatically and the model uploaded once training is completed.
- Confirm that train.csv is in the root directory of the Colab. The --data_path . flag tells AutoTrain to look for your data there.
- Make sure to add the LoRA target modules to be trained: --target-modules q_proj,v_proj
- Once you've made these changes, you're all set. Run the command below!
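Putting the flags described above together, the cell would look roughly like the sketch below. The project name and repo ID are placeholders, and --use_peft and --push_to_hub are assumptions added here because LoRA target modules and --repo_id only take effect when PEFT training and Hub upload are enabled:

```python
# Sketch of the training command, assembled from the flags described above
!autotrain llm --train \
    --project_name enter-a-project-name \
    --model abhishek/llama-2-7b-hf-small-shards \
    --data_path . \
    --use_peft \
    --use_int4 \
    --learning_rate 2e-4 \
    --train_batch_size 12 \
    --num_train_epochs 3 \
    --target-modules q_proj,v_proj \
    --push_to_hub \
    --repo_id username/repository
```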
Run Inference
Install needed dependencies:
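One plausible dependency set for loading a 4-bit base model plus a LoRA adapter (the exact packages and versions are assumptions):

```python
# Libraries needed to load a quantized base model and attach a PEFT adapter
!pip install -q torch transformers peft accelerate bitsandbytes
```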
Import libraries:
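Assuming the stack above, the imports would look something like this:

```python
import torch
from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
```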
Initialize config:
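A sketch of loading the adapter config from the Hub; username/repository is the placeholder repo that AutoTrain pushed the adapter to:

```python
# Placeholder: the adapter repo created by AutoTrain during training
peft_model_id = "username/repository"
config = PeftConfig.from_pretrained(peft_model_id)
```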
Create model:
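Loading the base model the adapter was trained on, quantized to 4-bit so it fits on a single GPU (the BitsAndBytes settings here are an assumption):

```python
# Quantization settings for loading the base model in 4-bit
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

# The base model name is read from the adapter config loaded above
model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    quantization_config=bnb_config,
    device_map="auto",
)
```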
Create PEFT model:
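Attaching the trained LoRA weights to the base model with PEFT:

```python
# Wrap the base model with the trained LoRA adapter
model = PeftModel.from_pretrained(model, peft_model_id)
model.eval()
```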
Set up the tokenizer:
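The tokenizer comes from the same base model; setting a pad token is an assumption made here because many decoder-only models don't define one:

```python
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
tokenizer.pad_token = tokenizer.eos_token  # assumed: base model has no pad token
```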
Run Generation:
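Finally, a minimal generation loop; the prompt and generation parameters are placeholders:

```python
prompt = "Write a short note explaining what LoRA fine-tuning does."  # placeholder prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=100)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```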
Check out the notebook here
To see an example of model inference, click here
You can also check out a video walkthrough by 1littlecoder that inspired this post.