Day 113 – Just some thoughts on LLM files

What are these things called parameters?

Are they sentences? Are they words? Are they something else?

What actually is a language model file?

I was under the impression that LLMs were single files, but I realised today that I hadn’t actually double checked this.
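As a first intuition (a minimal sketch, not any real format like GGUF or safetensors): a model file is essentially a bit of metadata plus a long, flat array of numbers — the parameters. Writing and reading one by hand makes that concrete:

```python
import struct

# A made-up "model file": a header (parameter count) followed by the
# parameters themselves as 32-bit floats. Real formats are far more
# elaborate, but the core idea is the same: mostly just numbers.
params = [0.12, -0.5, 1.7, 0.03]  # invented weights for illustration

with open("tiny_model.bin", "wb") as f:
    f.write(struct.pack("<I", len(params)))            # header: count
    f.write(struct.pack(f"<{len(params)}f", *params))  # the weights

# Loading it back:
with open("tiny_model.bin", "rb") as f:
    (n,) = struct.unpack("<I", f.read(4))
    loaded = list(struct.unpack(f"<{n}f", f.read(4 * n)))

print(n, loaded)
```

A 7-billion-parameter model is the same picture, just with a few billion more numbers in the array (and extra metadata describing how they are grouped into layers).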

Then I wondered: is there a really simple example of how to build your own tiny, tiny LLM? Even if it wasn’t predicting words correctly yet, it would be good to know how that works.
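About the tiniest possible "language model" is a bigram table: count which character tends to follow which, then sample from those counts. The counts table is this model's entire set of parameters — real LLMs just have billions of learned numbers instead of a handful of counts. A self-contained sketch:

```python
import random
from collections import defaultdict

# Train: count character bigrams in a tiny corpus.
text = "the cat sat on the mat. the cat ate."
counts = defaultdict(lambda: defaultdict(int))
for a, b in zip(text, text[1:]):
    counts[a][b] += 1  # how often b followed a

def sample_next(ch, rng):
    """Pick the next character in proportion to how often it followed ch."""
    chars, weights = zip(*counts[ch].items())
    return rng.choices(chars, weights=weights)[0]

# Generate: start from "t" and keep predicting the next character.
rng = random.Random(0)
out = "t"
for _ in range(20):
    out += sample_next(out[-1], rng)
print(out)
```

The output is mostly gibberish, which is sort of the point — it shows the mechanism (learn statistics from text, then sample) without any of the machinery that makes real models good at it.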

So, what are my thoughts on this:

  • I am assuming some sort of tool (maybe PyTorch?) is used for building this?
  • How would I prepare the original data set of files?
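On the data-set question, a hedged sketch of the general shape: real pipelines tokenise text with a learned vocabulary (BPE and friends), but even a word-level split shows what the training data ends up looking like — (context, next token) pairs:

```python
# Turn raw text into (context, next-token) training pairs.
# Word-level split here is a simplification; real LLMs use subword tokens.
def make_pairs(text, context_len=3):
    tokens = text.split()
    pairs = []
    for i in range(len(tokens) - context_len):
        context = tokens[i : i + context_len]   # what the model sees
        target = tokens[i + context_len]        # what it should predict
        pairs.append((context, target))
    return pairs

pairs = make_pairs("to be or not to be that is the question")
print(pairs[0])  # (['to', 'be', 'or'], 'not')
```

Preparing a real data set is mostly this idea at scale, plus a lot of cleaning, deduplication, and tokenisation up front.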

There’s the Ollama package, which lets you run a wide range of LLMs locally.
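Ollama also touches on the customisation question: models are described by a small config file called a Modelfile. A rough sketch (the specific values here are assumptions — `ollama show --modelfile <model>` prints a real one):

```
# Hypothetical Modelfile: start from an existing model and tweak it.
FROM llama3
PARAMETER temperature 0.7
SYSTEM "You are a concise assistant."
```

So at least at this level, "customising a model" can just mean layering settings and a system prompt on top of someone else's weights, rather than retraining anything.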

  • Are they singular files?
  • Are there different types or formats of LLM files?
  • Are these LLMs stored in RAM, and/or GPU memory?
  • What resources (time, energy) does it take to train LLMs?
  • Who is training LLMs at the moment?
  • What does customising a model mean?

Moondream 2

On my travels today I discovered Moondream 2 … it was on the list of Llama models that I was reading through. Will look into this another time. It’s a micro LLM for vision.
