Hey! I'm Arthu1.
Hello! I've released a new from-scratch model series called North 1 (North Star 1, Wind Arc 1.5, and the admittedly lowkey dumb North Air 1).
If you'd like to, feel free to quantize my models!
If you don't know who I am, I'm the owner of Starlight Mini.
Also, the repos are arthu1/wind-arc-1.5 and north-star-1.
hey arthu1, wanted to let you know that you need to create a config file for the model so that llama.cpp can work with it. Also, I'm not sure your model can be quantized with llama.cpp at all, since it's a completely new from-scratch model: llama.cpp might not recognize the architecture and will refuse to quantize it. In that case you'd need to open a pull request adding support for your model type and hope it gets merged into the main branch.
arthu1/wind-arc-1.5: no architectures entry (malformed JSON string, neither tag, array, object, number, string or atom, at character offset 0 (before "(end of string)")
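That error means the converter could not find a parseable `config.json` with an `architectures` entry. As a rough sketch, assuming the model really is Llama-shaped, a minimal config could look like this. Every hyperparameter below is a placeholder, not the real Wind Arc 1.5 value, and would need to match the actual checkpoint:

```python
import json

# Hypothetical minimal config for a Llama-like model. All values are
# placeholders and must be replaced with the checkpoint's real settings.
minimal_config = {
    "architectures": ["LlamaForCausalLM"],  # the entry the error complains about
    "model_type": "llama",
    "vocab_size": 32000,
    "hidden_size": 896,
    "intermediate_size": 2432,
    "num_hidden_layers": 32,
    "num_attention_heads": 14,
    "num_key_value_heads": 2,
    "max_position_embeddings": 2048,
    "rms_norm_eps": 1e-5,
}

# Write the config next to the weights so converters can find it
with open("config.json", "w") as f:
    json.dump(minimal_config, f, indent=2)

# Sanity check: the file parses as JSON and has an architectures entry
cfg = json.load(open("config.json"))
print(cfg["architectures"][0])  # LlamaForCausalLM
```

With a valid `config.json` in the repo root, the "malformed JSON string" half of the error should go away even if architecture support still needs work.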
Subject: Proposal to support Hugging Face transformers format for North Star / Wind Arc models
Hi,
I've been following the North Star model family (North Air, North Star, and Wind Arc). These are impressive small-scale reasoning models!
To make these models more accessible to the community (enabling use with tools like vLLM, Ollama, bitsandbytes quantization, and the transformers library), it would be great to provide a version in the standard Hugging Face format.
Since the models use a SentencePiece tokenizer and a Transformer-based architecture (likely Llama-like), they can be easily wrapped into a LlamaForCausalLM structure.
Below is a suggested conversion script based on your current .pt checkpoint structure.
Conversion Script (Python)
```python
import torch
from transformers import LlamaConfig, LlamaForCausalLM, LlamaTokenizer

def convert_to_hf(pt_path="windarc15.pt", tokenizer_path="tokenizer.model", save_dir="./windarc-1.5-hf"):
    print(f"Loading checkpoint from {pt_path}...")
    checkpoint = torch.load(pt_path, map_location="cpu", weights_only=False)
    cfg = checkpoint["cfg"]
    state_dict = checkpoint["model"]

    # 1. Map your custom config to LlamaConfig
    # Adjust the keys (e.g., 'dim', 'n_layers') based on your exact cfg dictionary
    config = LlamaConfig(
        vocab_size=32000,
        hidden_size=cfg.get("dim", 896),
        intermediate_size=cfg.get("hidden_dim", 2432),
        num_hidden_layers=cfg.get("n_layers", 32),
        num_attention_heads=cfg.get("n_heads", 14),
        num_key_value_heads=cfg.get("n_kv_heads", 2),  # if GQA is used
        max_position_embeddings=cfg.get("max_seq_len", 2048),
        rms_norm_eps=1e-5,
    )

    # 2. Initialize the HF model
    model = LlamaForCausalLM(config)

    # 3. Map state-dict keys: custom names -> HF Llama names
    hf_state_dict = {}
    mapping = {
        "tok_embeddings.weight": "model.embed_tokens.weight",
        "norm.weight": "model.norm.weight",
        "output.weight": "lm_head.weight",
    }
    for k, v in state_dict.items():
        if k in mapping:
            hf_state_dict[mapping[k]] = v
        elif k.startswith("layers."):
            new_k = k.replace("layers.", "model.layers.")
            # Map attention and feed-forward layers
            new_k = new_k.replace(".attention.wq.", ".self_attn.q_proj.")
            new_k = new_k.replace(".attention.wk.", ".self_attn.k_proj.")
            new_k = new_k.replace(".attention.wv.", ".self_attn.v_proj.")
            new_k = new_k.replace(".attention.wo.", ".self_attn.o_proj.")
            new_k = new_k.replace(".feed_forward.w1.", ".mlp.gate_proj.")
            new_k = new_k.replace(".feed_forward.w2.", ".mlp.down_proj.")
            new_k = new_k.replace(".feed_forward.w3.", ".mlp.up_proj.")
            new_k = new_k.replace(".attention_norm.", ".input_layernorm.")
            new_k = new_k.replace(".ffn_norm.", ".post_attention_layernorm.")
            hf_state_dict[new_k] = v
        else:
            hf_state_dict[k] = v

    # 4. Load weights into the HF model
    model.load_state_dict(hf_state_dict, strict=True)

    # 5. Save model and tokenizer
    print(f"Saving HF model to {save_dir}...")
    model.save_pretrained(save_dir)
    # Since it's a SentencePiece model, LlamaTokenizer handles it natively
    tokenizer = LlamaTokenizer(vocab_file=tokenizer_path)
    tokenizer.save_pretrained(save_dir)
    print("Success!")

if __name__ == "__main__":
    convert_to_hf()
```
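The key renames in step 3 can be sanity-checked in isolation with dummy key names, no checkpoint needed. This helper is just the same replacement table factored out (the example key names assume the `layers.N.attention.wq`-style naming in the script above):

```python
def rename_key(k: str) -> str:
    """Apply the same custom-key -> HF-Llama-key renames as step 3 above."""
    replacements = [
        ("layers.", "model.layers."),
        (".attention.wq.", ".self_attn.q_proj."),
        (".attention.wk.", ".self_attn.k_proj."),
        (".attention.wv.", ".self_attn.v_proj."),
        (".attention.wo.", ".self_attn.o_proj."),
        (".feed_forward.w1.", ".mlp.gate_proj."),
        (".feed_forward.w2.", ".mlp.down_proj."),
        (".feed_forward.w3.", ".mlp.up_proj."),
        (".attention_norm.", ".input_layernorm."),
        (".ffn_norm.", ".post_attention_layernorm."),
    ]
    for old, new in replacements:
        k = k.replace(old, new)
    return k

print(rename_key("layers.0.attention.wq.weight"))  # model.layers.0.self_attn.q_proj.weight
print(rename_key("layers.11.ffn_norm.weight"))     # model.layers.11.post_attention_layernorm.weight
```

If `load_state_dict(..., strict=True)` still complains about missing or unexpected keys, printing both key sets side by side usually pinpoints which rename is off.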
Why this is helpful:
- Easy inference: users can load the model with just two lines of code, e.g. `AutoModelForCausalLM.from_pretrained("arthu1/windarc-1.5-hf")` plus the matching tokenizer call.
- Ecosystem support: the model becomes instantly compatible with inference engines like vLLM, Text-Generation-WebUI, and local runners like LM Studio.
- Quantization: it becomes easy to create 4-bit (GGUF/EXL2) versions for edge devices.
Looking forward to seeing more updates on North Star!
Best regards,
aifeifei798
Hey! I'm actually on a Mac mini (no CUDA) and if you'd like to use the models on-demand, email me at Arthur.schannel.stop@gmail.com and I can host the server at some point!
Thanks, Arthur!
Hey! I'm Arthur. I'm thinking of making a CLI tool (Gem) where you can host north-star-architecture models on-device or on a Silicon (a custom machine; sessions last 4 hrs). Gems are custom spaces (similar to HF Spaces) where you can host models, so there's no need for llama.cpp.
If you want early access to Wind-Arc-1.6 or North-I, please email me.