Fine-Tune Qwen3 14B with Axolotl

Built with Axolotl

Axolotl is the most performant LLM post-training framework available, delivering fast training with efficient, consistent, and stable performance. Train your workload and ship your product 30% faster, saving you both time and money.

Installation

Axolotl is easy to install from pip, or you can use our pre-built Docker images for a hassle-free dependency experience. See our docs for more information.

%%capture
# This step can take ~5-10 minutes to install dependencies
!pip install --no-build-isolation "axolotl[flash-attn]>=0.9.1"
!pip install "cut-cross-entropy[transformers] @ git+https://github.com/axolotl-ai-cloud/ml-cross-entropy.git@78b2a45713a54c9bedf8b33f5e31cf07a1a57154"

Demo: Talk Like a Pirate

In this demo, we train the model to respond like a pirate. This task was chosen because it makes it easy to show how to train a model to respond in a particular style of your choosing (without being prompted to do so), and it is simple to validate within the scope of a Colab.

Upload your own dataset or use a Hugging Face dataset

You can use your own JSONL file from Google Drive; for example, download the Pirate-Ultrachat JSONL to your Google Drive. JSONL datasets should be formatted similarly to the OpenAI dataset format.
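Each line of the JSONL file is one conversation in the OpenAI-style messages format; a hypothetical record looks roughly like this:

{"messages": [{"role": "user", "content": "What be the capital o' France?"}, {"role": "assistant", "content": "Arr, that be Paris, matey!"}]}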

You can also simply use the winglian/pirate-ultrachat-10k dataset directly.

# Default to HF dataset location
dataset_id = "winglian/pirate-ultrachat-10k"
uploaded = {}
import os
# Optionally, upload your own JSONL to your Google Drive
GOOGLE_DRIVE_PATH = ""  # ex: "MyDrive/Colab Notebooks/train.jsonl"

# "Select All" permissions, or you may get the error:
# "MessageError: Error: credential propagation was unsuccessful"
if GOOGLE_DRIVE_PATH:
    from google.colab import drive
    # Mount your Google Drive
    GOOGLE_DRIVE_MNT = "/content/drive/"
    drive.mount(GOOGLE_DRIVE_MNT, force_remount=True)
    tmp_path = os.path.join(GOOGLE_DRIVE_MNT, GOOGLE_DRIVE_PATH.lstrip("/"))
    # make sure file exists
    if not os.path.isfile(tmp_path):
        raise ValueError(f"File {tmp_path} does not exist")
    dataset_id = tmp_path

Configure for Supervised Fine-Tuning (SFT)

from axolotl.utils.dict import DictDefault
from axolotl.cli.config import load_cfg

# Axolotl provides full control and transparency over model and training configuration
config = DictDefault(
    base_model = "Qwen/Qwen3-14B",  # Use the instruct tuned model, but we're aligning it to be a pirate
    load_in_4bit = True,  # set to True for qLoRA
    adapter = "qlora",
    lora_r = 32,
    lora_alpha = 64,
    lora_target_modules = [
        "q_proj", "k_proj", "v_proj", "o_proj",  # train self_attn linear modules
        "gate_proj", "down_proj", "up_proj",  # train MLP linear modules
    ],
    lora_qkv_kernel = True,  # optimized triton kernels for LoRA
    lora_o_kernel = True,
    lora_mlp_kernel = True,
    embeddings_skip_upcast = True,  # keep embeddings in fp16 so the model fits in 15GB VRAM
    xformers_attention = True,  # use xformers on Colab w/ T4 for memory efficient attention, flash_attention only on Ampere or above
    plugins = [
        # more efficient training using Apple's Cut Cross Entropy; https://github.com/apple/ml-cross-entropy
        "axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin",
    ],
    sample_packing = True,  # 2-6x increase in tokens per micro-batch
    # when using packing, use a slightly higher learning rate to account for fewer steps
    # alternatively, reduce the micro_batch_size + gradient_accumulation_steps to achieve closer to the same number of steps/epoch
    learning_rate = 0.00019,
    sequence_len = 4096,  # larger sequence length improves packing efficiency for more tokens/sec
    micro_batch_size = 1,
    gradient_accumulation_steps = 1,
    gradient_checkpointing = True,  # tradeoff reduced VRAM for increased time
    gradient_checkpointing_kwargs = {
        "use_reentrant": False,
    },
    optimizer = "paged_adamw_8bit",
    lr_scheduler = "cosine",
    warmup_steps = 5,
    fp16 = True,  # use float16 + automatic mixed precision, bfloat16 not supported on Colab w/ T4
    bf16 = False,
    max_grad_norm = 0.1,  # gradient clipping
    num_epochs = 1,
    saves_per_epoch = 2,  # how many checkpoints to save over one epoch
    logging_steps = 1,
    output_dir = "./outputs/qwen-sft-pirate-rrr",
    chat_template = "qwen3",
    datasets = [
        {
            "path": dataset_id,  # Huggingface Dataset id or path to train.jsonl
            "type": "chat_template",
            "split": "train",
            "eot_tokens": ["<|im_end|>"],
        }
    ],
    dataloader_prefetch_factor = 8,  # dataloader optimizations
    dataloader_num_workers = 2,
    dataloader_pin_memory = True,
  )

# validates the configuration
cfg = load_cfg(config)
[2025-05-08 13:40:27,488] [INFO] [root.register:348] [PID:174] Attempting to load plugin: axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin

[2025-05-08 13:40:27,493] [INFO] [root.register:351] [PID:174] Plugin loaded successfully: axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin

[2025-05-08 13:40:27,959] [INFO] [axolotl.utils.schemas.config.check_eval_packing:721] [PID:174] [RANK:0] explicitly setting `eval_sample_packing` to match `sample_packing`

[2025-05-08 13:40:27,960] [INFO] [axolotl.utils.schemas.config.hint_sample_packing_padding:514] [PID:174] [RANK:0] Setting `pad_to_sequence_len: true` to prevent memory leaks when sample_packing

[2025-05-08 13:40:27,961] [INFO] [axolotl.utils.schemas.config.check_bf16:1251] [PID:174] [RANK:0] bf16 support detected, but not enabled for this configuration.
[2025-05-08 13:40:28,590] [INFO] [axolotl.normalize_config:237] [PID:174] [RANK:0] cuda memory usage baseline: 0.000GB (+0.002GB cache, +0.359GB misc)
from axolotl.utils import patch_optimized_env
# speedup downloads from HF 🤗 and set "PYTORCH_CUDA_ALLOC_CONF" env to save memory
patch_optimized_env()
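Outside of a notebook, the same settings are usually kept in a standalone YAML file and run with the Axolotl CLI. A minimal sketch of the equivalent config (the file name is illustrative and only a few keys are shown; mirror the DictDefault above for the full set of options):

# pirate-qwen3-14b-qlora.yaml (hypothetical file name)
base_model: Qwen/Qwen3-14B
load_in_4bit: true
adapter: qlora
lora_r: 32
lora_alpha: 64
chat_template: qwen3
datasets:
  - path: winglian/pirate-ultrachat-10k
    type: chat_template
    split: train
# ...remaining options as in the DictDefault config above...

With such a file, training can be launched from the command line with axolotl train pirate-qwen3-14b-qlora.yaml.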

Datasets

Axolotl has a robust suite of loaders and transforms to parse most open datasets of any format into the appropriate chat template for your model. Axolotl masks the input tokens from the user's prompt so that the training loss is only calculated against the model's response. For more information, see our documentation on dataset preparation.
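Conceptually, the masking works by setting the label of every non-assistant token to the ignore index (-100) so only the assistant's tokens contribute to the cross-entropy loss. A simplified illustration of the idea (not Axolotl's actual implementation; assumes a chat-template tokenizer is already loaded):

# Hypothetical sketch: mask the prompt tokens so loss is only computed on the response
prompt_ids = tokenizer("<|im_start|>user\nWhat be 2+2?<|im_end|>\n<|im_start|>assistant\n")["input_ids"]
response_ids = tokenizer("Arr, it be 4, matey!<|im_end|>\n")["input_ids"]

input_ids = prompt_ids + response_ids
labels = [-100] * len(prompt_ids) + response_ids  # -100 is ignored by the loss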

from axolotl.common.datasets import load_datasets

# Load, parse and tokenize the datasets to be formatted with qwen3 chat template
# Drop long samples from the dataset that overflow the max sequence length
dataset_meta = load_datasets(cfg=cfg)
[2025-05-08 13:41:00,844] [DEBUG] [axolotl.utils.models.load_tokenizer:441] [PID:174] [RANK:0] EOS: 151645 / <|im_end|>

[2025-05-08 13:41:00,845] [DEBUG] [axolotl.utils.models.load_tokenizer:442] [PID:174] [RANK:0] BOS: None / None

[2025-05-08 13:41:00,846] [DEBUG] [axolotl.utils.models.load_tokenizer:443] [PID:174] [RANK:0] PAD: 151643 / <|endoftext|>

[2025-05-08 13:41:00,847] [DEBUG] [axolotl.utils.models.load_tokenizer:444] [PID:174] [RANK:0] UNK: None / None

[2025-05-08 13:41:00,869] [INFO] [axolotl.utils.data.sft.load_tokenized_prepared_datasets:271] [PID:174] [RANK:0] Unable to find prepared dataset in last_run_prepared/97037817611d38b3a9c681753c3c4c95

[2025-05-08 13:41:00,870] [INFO] [axolotl.utils.data.sft.load_tokenized_prepared_datasets:272] [PID:174] [RANK:0] Loading raw datasets...

[2025-05-08 13:41:00,870] [WARNING] [axolotl.utils.data.sft.load_tokenized_prepared_datasets:274] [PID:174] [RANK:0] Processing datasets during training can lead to VRAM instability. Please pre-process your dataset.

[2025-05-08 13:41:00,871] [INFO] [axolotl.utils.data.sft.load_tokenized_prepared_datasets:281] [PID:174] [RANK:0] No seed provided, using default seed of 42
[2025-05-08 13:41:04,196] [INFO] [axolotl.utils.data.sft.get_dataset_wrapper:484] [PID:174] [RANK:0] Loading dataset with base_type: chat_template and prompt_style: None

[2025-05-08 13:41:04,233] [INFO] [axolotl.__call__:761] [PID:174] [RANK:0] Using chat template:

---

{%- if tools %}
    {{- '<|im_start|>system\n' }}
    {%- if messages[0].role == 'system' %}
        {{- messages[0].content + '\n\n' }}
    {%- endif %}
    {{- "# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
    {%- for tool in tools %}
        {{- "\n" }}
        {{- tool | tojson }}
    {%- endfor %}
    {{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
{%- else %}
    {%- if messages[0].role == 'system' %}
        {{- '<|im_start|>system\n' + messages[0].content + '<|im_end|>\n' }}
    {%- endif %}
{%- endif %}
{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
{%- for message in messages[::-1] %}
    {%- set index = (messages|length - 1) - loop.index0 %}
    {%- if ns.multi_step_tool and message.role == "user" and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}
        {%- set ns.multi_step_tool = false %}
        {%- set ns.last_query_index = index %}
    {%- endif %}
{%- endfor %}
{%- for message in messages %}
    {%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
        {{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>' + '\n' }}
    {%- elif message.role == "assistant" %}
        {%- set content = message.content %}
        {%- set reasoning_content = '' %}
        {%- if message.reasoning_content is defined and message.reasoning_content is not none %}
            {%- set reasoning_content = message.reasoning_content %}
        {%- else %}
            {%- if '</think>' in message.content %}
                {%- set content = message.content.split('</think>')[-1].lstrip('\n') %}
                {%- set reasoning_content = message.content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
            {%- endif %}
        {%- endif %}
        {%- if loop.index0 > ns.last_query_index %}
            {%- if loop.last or (not loop.last and reasoning_content) %}
                {{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content.strip('\n') + '\n</think>\n\n' + content.lstrip('\n') }}
            {%- else %}
                {{- '<|im_start|>' + message.role + '\n' + content }}
            {%- endif %}
        {%- else %}
            {{- '<|im_start|>' + message.role + '\n' + content }}
        {%- endif %}
        {%- if message.tool_calls %}
            {%- for tool_call in message.tool_calls %}
                {%- if (loop.first and content) or (not loop.first) %}
                    {{- '\n' }}
                {%- endif %}
                {%- if tool_call.function %}
                    {%- set tool_call = tool_call.function %}
                {%- endif %}
                {{- '<tool_call>\n{"name": "' }}
                {{- tool_call.name }}
                {{- '", "arguments": ' }}
                {%- if tool_call.arguments is string %}
                    {{- tool_call.arguments }}
                {%- else %}
                    {{- tool_call.arguments | tojson }}
                {%- endif %}
                {{- '}\n</tool_call>' }}
            {%- endfor %}
        {%- endif %}
        {{- '<|im_end|>\n' }}
    {%- elif message.role == "tool" %}
        {%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
            {{- '<|im_start|>user' }}
        {%- endif %}
        {{- '\n<tool_response>\n' }}
        {{- message.content }}
        {{- '\n</tool_response>' }}
        {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
            {{- '<|im_end|>\n' }}
        {%- endif %}
    {%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
    {{- '<|im_start|>assistant\n' }}
    {%- if enable_thinking is defined and enable_thinking is false %}
        {{- '<think>\n\n</think>\n\n' }}
    {%- endif %}
{%- endif %}

---
[2025-05-08 13:42:09,195] [INFO] [axolotl.utils.data.utils.drop_long_seq_in_dataset:177] [PID:174] [RANK:0] min_input_len: 23

[2025-05-08 13:42:09,196] [INFO] [axolotl.utils.data.utils.drop_long_seq_in_dataset:179] [PID:174] [RANK:0] max_input_len: 3380
[2025-05-08 13:42:21,651] [INFO] [axolotl.utils.data.sft.load_tokenized_prepared_datasets:351] [PID:174] [RANK:0] Saving merged prepared dataset to disk... last_run_prepared/97037817611d38b3a9c681753c3c4c95
[2025-05-08 13:42:25,711] [INFO] [axolotl.utils.samplers.multipack.calc_min_len:411] [PID:174] [RANK:0] gather_len_batches: [1540]

[2025-05-08 13:42:25,714] [INFO] [axolotl.calc_sample_packing_eff_est:491] [PID:174] [RANK:0] sample_packing_eff_est across ranks: [0.9987832601968344]

Training

from axolotl.train import train

# Train only the first 25 steps for the demo.
# This is sufficient to align the model, since packing maximizes the number of trainable samples per step.
cfg.max_steps = 25
model, tokenizer, trainer = train(cfg=cfg, dataset_meta=dataset_meta)

[2025-05-07 22:08:14,344] [INFO] [axolotl.monkeypatch.peft.utils.patch_peft_prep_code:76] [PID:1336] [RANK:0] patching prepare_model_for_kbit_training to allow for overrides

[2025-05-07 22:08:14,549] [INFO] [axolotl.integrations.cut_cross_entropy.pre_model_load:80] [PID:1336] [RANK:0] Applying Cut Cross Entropy to model type: qwen3
[2025-05-07 22:09:49,798] [INFO] [accelerate.utils.modeling.get_balanced_memory:990] [PID:1336] We will use 90% of the memory on device 0 for storing the model, and 10% for the buffer to avoid OOM. You can set `max_memory` in to a higher value to use more memory (at your own risk).
[2025-05-07 22:11:37,521] [INFO] [axolotl.utils.models.load_model:1302] [PID:1336] [RANK:0] cuda memory usage after model load: 9.264GB (+1.721GB cache, +0.375GB misc)

[2025-05-07 22:11:37,532] [INFO] [axolotl.utils.models.prepare_model:1205] [PID:1336] [RANK:0] converting PEFT model w/ prepare_model_for_kbit_training

[2025-05-07 22:11:37,537] [INFO] [axolotl.utils.models.load_model:1341] [PID:1336] [RANK:0] Converting modules to torch.float16

trainable params: 128,450,560 || all params: 14,896,757,760 || trainable%: 0.8623

[2025-05-07 22:11:40,170] [INFO] [axolotl.utils.models.load_model:1402] [PID:1336] [RANK:0] cuda memory usage after adapters: 9.743GB (+1.476GB cache, +0.375GB misc)
/usr/local/lib/python3.11/dist-packages/axolotl/core/trainers/base.py:64: FutureWarning: `tokenizer` is deprecated and will be removed in version 5.0.0 for `AxolotlTrainer.__init__`. Use `processing_class` instead.
  super().__init__(*_args, **kwargs)
No label_names provided for model class `PeftModelForCausalLM`. Since `PeftModel` hides base models input arguments, if label_names is not given, label_names can't be set automatically within `Trainer`. Note that empty label_names list will be used instead.
[2025-05-07 22:11:41,755] [INFO] [axolotl.train.save_initial_configs:359] [PID:1336] [RANK:0] Pre-saving adapter config to ./outputs/qwen-sft-pirate-rrr...

[2025-05-07 22:11:41,756] [INFO] [axolotl.train.save_initial_configs:363] [PID:1336] [RANK:0] Pre-saving tokenizer to ./outputs/qwen-sft-pirate-rrr...

[2025-05-07 22:11:41,974] [INFO] [axolotl.train.save_initial_configs:366] [PID:1336] [RANK:0] Pre-saving model config to ./outputs/qwen-sft-pirate-rrr...

[2025-05-07 22:11:41,982] [INFO] [axolotl.train.execute_training:211] [PID:1336] [RANK:0] Starting trainer...

[2025-05-07 22:11:45,047] [INFO] [axolotl.utils.samplers.multipack.calc_min_len:411] [PID:1336] [RANK:0] gather_len_batches: [1540]
You're using a Qwen2TokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
You're using a Qwen2TokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
[25/25 09:25, Epoch 0/1]
Step Training Loss
1 1.092300
2 1.554200
3 1.041400
4 1.733800
5 1.430000
6 1.258500
7 1.343600
8 1.101700
9 1.086500
10 0.813200
11 0.689600
12 0.826700
13 1.541800
14 0.948000
15 1.357000
16 1.085800
17 1.516800
18 1.146800
19 0.834800
20 0.968000
21 1.388800
22 1.511500
23 1.338500
24 1.206600
25 1.504600

[2025-05-07 22:12:42,746] [INFO] [axolotl.callbacks.on_step_end:128] [PID:1336] [RANK:0] cuda memory usage while training: 9.768GB (+3.287GB cache, +0.646GB misc)

[2025-05-07 22:21:46,859] [INFO] [axolotl.train.save_trained_model:231] [PID:1336] [RANK:0] Training completed! Saving pre-trained model to ./outputs/qwen-sft-pirate-rrr.

Inferencing the trained model

import torch
from transformers import TextStreamer

messages = [
    {
        "role": "user",
        "content": "Explain the Pythagorean theorem to me.",
    },
]

prompt = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=False,
    enable_thinking = False,
)

outputs = model.generate(
    **tokenizer(prompt, return_tensors = "pt").to("cuda"),
    max_new_tokens = 192,
    temperature = 1.0, top_p = 0.8, top_k = 32,
    streamer = TextStreamer(tokenizer, skip_prompt = True),
)
Ahoy there, matey! Shiver me timbers, ye be lookin' for the Pythagorean theorem, eh? Well, hold yer horses and listen up, for I'll be tellin' ye all about it in me own special way.

The Pythagorean theorem be a real gem of a mathematical trick that helps ye find the length of a side of a right triangle. Now, a right triangle be a triangle with a right angle, which be that little corner that looks like a square. 

The theorem be named after a clever fellow named Pythagoras, who be a mathematician from ancient Greece. He discovered that if ye have a right triangle, the square of the length of the hypotenuse (that be the side opposite the right angle) be equal to the sum of the squares of the other two sides. 

In other words, if ye have a triangle with sides of length a, b, and c (
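The Qwen3 chat template also supports a thinking mode (see enable_thinking in the template printed above). To compare, you can re-run the same prompt with thinking enabled; a minimal sketch reusing the model, tokenizer, and messages from the cells above:

# Same prompt, but allow the model to emit a <think>...</think> block first
prompt_thinking = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=False,
    enable_thinking=True,
)

outputs = model.generate(
    **tokenizer(prompt_thinking, return_tensors="pt").to("cuda"),
    max_new_tokens=512,  # leave room for the reasoning tokens
    temperature=0.6, top_p=0.95,
    streamer=TextStreamer(tokenizer, skip_prompt=True),
)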

Saving your trained model

Axolotl automatically saves checkpoints to the output_dir path.

# Show the saved checkpoints in the output_dir
!ls -lh "./outputs/qwen-sft-pirate-rrr"
total 506M
-rw-r--r-- 1 root root  845 May  7 22:21 adapter_config.json
-rw-r--r-- 1 root root 491M May  7 22:21 adapter_model.safetensors
-rw-r--r-- 1 root root  707 May  7 22:11 added_tokens.json
drwxr-xr-x 2 root root 4.0K May  7 22:17 checkpoint-13
drwxr-xr-x 2 root root 4.0K May  7 22:21 checkpoint-25
-rw-r--r-- 1 root root 1.2K May  7 22:11 config.json
-rw-r--r-- 1 root root 1.6M May  7 22:11 merges.txt
-rw-r--r-- 1 root root 2.6K May  7 22:21 README.md
-rw-r--r-- 1 root root  613 May  7 22:11 special_tokens_map.json
-rw-r--r-- 1 root root 9.5K May  7 22:11 tokenizer_config.json
-rw-r--r-- 1 root root  11M May  7 22:11 tokenizer.json
-rw-r--r-- 1 root root 2.7M May  7 22:11 vocab.json
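Note that adapter_model.safetensors is only the LoRA adapter, not a full model. To reuse it in a fresh session, you can load the base model and attach the adapter with PEFT; a minimal sketch using transformers and peft:

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Load the 4-bit quantized base model, then attach the trained LoRA adapter on top
bnb = BitsAndBytesConfig(load_in_4bit=True)
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-14B", quantization_config=bnb, device_map="auto")
tok = AutoTokenizer.from_pretrained("./outputs/qwen-sft-pirate-rrr")
pirate_model = PeftModel.from_pretrained(base, "./outputs/qwen-sft-pirate-rrr")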

Setting hub_model_id: in the original config would have automatically uploaded the model to the Hugging Face Hub (e.g. hub_model_id: username/model_id).
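For example, adding a line like the following to the DictDefault config before training would have Axolotl push the trained adapter to the Hub automatically (the repo name is illustrative):

    hub_model_id = "your-username/pirate-qwen3-14b",  # hypothetical repo; added inside DictDefault(...)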

If you prefer to upload the training artifacts manually, you can still upload the entire final checkpoint to the Hugging Face Hub from the CLI.

from huggingface_hub import notebook_login
# remove the intermediate checkpoints (leave the glob unquoted so the shell expands it)
!rm -rf ./outputs/qwen-sft-pirate-rrr/checkpoint-*

# HF Notebook login widget
notebook_login()

# upload the LoRA adapter for your model to HF, remember to update the username/model-name below
!huggingface-cli upload --repo-type=model winglian/pirate-qwen-14B "./outputs/qwen-sft-pirate-rrr"
It seems you are trying to upload a large folder at once. This might take some time and then fail if the folder is too large. For such cases, it is recommended to upload in smaller batches or to use `HfApi().upload_large_folder(...)`/`huggingface-cli upload-large-folder` instead. For more details, check out https://huggingface.co/docs/huggingface_hub/main/en/guides/upload#upload-a-large-folder.
Start hashing 40 files.
Finished hashing 40 files.
Uploading files using Xet Storage..
Uploading...:  87% 1.82G/2.10G [00:23<00:04, 67.3MB/s]Cancellation requested; stopping current tasks.
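If a large CLI upload stalls or has to be cancelled, the same upload can also be done from Python with the huggingface_hub API (upload_folder, or upload_large_folder for very large repos, as the warning above suggests); a minimal sketch (the repo name is illustrative):

from huggingface_hub import HfApi

api = HfApi()
# Create the repo if it doesn't exist yet, then upload the whole output folder
api.create_repo("your-username/pirate-qwen3-14b", repo_type="model", exist_ok=True)
api.upload_folder(
    repo_id="your-username/pirate-qwen3-14b",
    folder_path="./outputs/qwen-sft-pirate-rrr",
    repo_type="model",
)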