Fine-tune Mistral 7B easily with AutoTrain
Make sure we have the required dependencies:
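In a Colab cell this could look like the following sketch (the exact package list is an assumption):

```python
# Install pandas and the AutoTrain Advanced package (assumed dependency set)
!pip install -q pandas autotrain-advanced
```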
Set up AutoTrain
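A minimal sketch, assuming the autotrain CLI's setup sub-command is used to finish configuring the environment:

```python
# Let AutoTrain configure its environment (sub-command assumed from the CLI)
!autotrain setup
```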
Download dataset
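The notebook doesn't pin down which dataset is used, so here is a hypothetical example of grabbing a small instruction dataset from the Hugging Face Hub; the dataset name and column layout are assumptions:

```python
# Hypothetical example: download an instruction dataset from the Hub
# and save it locally so AutoTrain can pick it up as train.csv
from datasets import load_dataset

dataset = load_dataset("timdettmers/openassistant-guanaco", split="train")
dataset.to_csv("train.csv", index=False)
```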
Import dataset
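To sanity-check what was just written, a quick look with pandas (assuming the single "text" column that AutoTrain's LLM trainer expects by default):

```python
import pandas as pd

# Load the CSV and confirm it contains the "text" column used for training
df = pd.read_csv("train.csv")
print(df.shape)
df.head()
```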
Overview of AutoTrain command
A short overview of what the command flags do:
- !autotrain: In environments like a Jupyter notebook, the ! prefix runs a shell command directly; autotrain is AutoTrain's command-line utility.
- llm: A sub-command specifying the type of task, in this case large language model fine-tuning.
- --train: Initiates the training process.
- --project_name: Sets the name of the project.
- --model abhishek/llama-2-7b-hf-small-shards: Specifies the base model to fine-tune, hosted on Hugging Face as "llama-2-7b-hf-small-shards" under the "abhishek" namespace.
- --data_path .: The path to the dataset for training. The "." refers to the current directory. The train.csv file needs to be located in this directory.
- --use_int4: Enables INT4 quantization, which reduces the model's memory footprint and speeds things up at the cost of some precision.
- --learning_rate 2e-4: Sets the learning rate for training to 0.0002.
- --train_batch_size 12: Sets the batch size for training to 12.
- --num_train_epochs 3: The training process will iterate over the dataset 3 times.
Steps needed before running
Go to the !autotrain code cell below and update it as follows:
- After --project_name, replace enter-a-project-name with the name you'd like to call the project.
- After --repo_id, replace username/repository: replace username with your Hugging Face username and repository with the name of the repository you'd like the model pushed to. You don't need to create this repository beforehand; it will be created automatically and the model uploaded once training is completed.
- Confirm that train.csv is in the root directory of the Colab. The --data_path . flag tells AutoTrain to look for your data there.
- Make sure to add the LoRA target modules to be trained: --target-modules q_proj,v_proj
- Once you've made these changes, you're all set. Run the command below!
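Putting the flags described above together, the cell would look roughly like the sketch below. The project name and repo ID are placeholders, and --use_peft and --push_to_hub are assumptions added here because LoRA target modules and --repo_id only take effect when PEFT training and Hub upload are enabled:

```python
# Sketch of the training command, assembled from the flags described above
!autotrain llm --train \
    --project_name enter-a-project-name \
    --model abhishek/llama-2-7b-hf-small-shards \
    --data_path . \
    --use_peft \
    --use_int4 \
    --learning_rate 2e-4 \
    --train_batch_size 12 \
    --num_train_epochs 3 \
    --target-modules q_proj,v_proj \
    --push_to_hub \
    --repo_id username/repository
```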
Run Inference
Install needed dependencies:
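One plausible dependency set for loading a 4-bit base model plus a LoRA adapter (the exact packages and versions are assumptions):

```python
# Libraries needed to load a quantized base model and attach a PEFT adapter
!pip install -q torch transformers peft accelerate bitsandbytes
```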
Import libraries:
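Assuming the stack above, the imports would look something like this:

```python
import torch
from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
```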
Initialize config:
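A sketch of loading the adapter config from the Hub; username/repository is the placeholder repo that AutoTrain pushed the adapter to:

```python
# Placeholder: the adapter repo created by AutoTrain during training
peft_model_id = "username/repository"
config = PeftConfig.from_pretrained(peft_model_id)
```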
Create model:
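Loading the base model the adapter was trained on, quantized to 4-bit so it fits on a single GPU (the BitsAndBytes settings here are an assumption):

```python
# Quantization settings for loading the base model in 4-bit
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

# The base model name is read from the adapter config loaded above
model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    quantization_config=bnb_config,
    device_map="auto",
)
```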
Create PEFT model:
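Attaching the trained LoRA weights to the base model with PEFT:

```python
# Wrap the base model with the trained LoRA adapter
model = PeftModel.from_pretrained(model, peft_model_id)
model.eval()
```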
Set up the tokenizer:
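The tokenizer comes from the same base model; setting a pad token is an assumption made here because many decoder-only models don't define one:

```python
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
tokenizer.pad_token = tokenizer.eos_token  # assumed: base model has no pad token
```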
Run Generation:
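Finally, a minimal generation loop; the prompt and generation parameters are placeholders:

```python
prompt = "Write a short note explaining what LoRA fine-tuning does."  # placeholder prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=100)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```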
Check out the notebook here
To see an example of model inference, click here
You can also check out a video walkthrough by 1littlecoder that inspired this post.