Day 113 – Just some thoughts on LLM files

What are these things called parameters?

Are they sentences? Are they words? Are they something else?

What actually is a language model file?

I was under the impression that LLMs were single files, but I realised today that I hadn’t actually double checked this.
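As a first intuition (a minimal sketch, not any real format like GGUF or safetensors): a model file is essentially a bit of metadata plus a long, flat array of numbers — the parameters. Writing and reading one by hand makes that concrete:

```python
import struct

# A made-up "model file": a header (parameter count) followed by the
# parameters themselves as 32-bit floats. Real formats are far more
# elaborate, but the core idea is the same: mostly just numbers.
params = [0.12, -0.5, 1.7, 0.03]  # invented weights for illustration

with open("tiny_model.bin", "wb") as f:
    f.write(struct.pack("<I", len(params)))            # header: count
    f.write(struct.pack(f"<{len(params)}f", *params))  # the weights

# Loading it back:
with open("tiny_model.bin", "rb") as f:
    (n,) = struct.unpack("<I", f.read(4))
    loaded = list(struct.unpack(f"<{n}f", f.read(4 * n)))

print(n, loaded)
```

A 7-billion-parameter model is the same picture, just with a few billion more numbers in the array (and extra metadata describing how they are grouped into layers).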

Then I wondered: is there a really simple example of how to build your own tiny, tiny LLM? Even if it wasn’t predicting words correctly yet, it would be good to know how that works.
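About the tiniest possible "language model" is a bigram table: count which character tends to follow which, then sample from those counts. The counts table is this model's entire set of parameters — real LLMs just have billions of learned numbers instead of a handful of counts. A self-contained sketch:

```python
import random
from collections import defaultdict

# Train: count character bigrams in a tiny corpus.
text = "the cat sat on the mat. the cat ate."
counts = defaultdict(lambda: defaultdict(int))
for a, b in zip(text, text[1:]):
    counts[a][b] += 1  # how often b followed a

def sample_next(ch, rng):
    """Pick the next character in proportion to how often it followed ch."""
    chars, weights = zip(*counts[ch].items())
    return rng.choices(chars, weights=weights)[0]

# Generate: start from "t" and keep predicting the next character.
rng = random.Random(0)
out = "t"
for _ in range(20):
    out += sample_next(out[-1], rng)
print(out)
```

The output is mostly gibberish, which is sort of the point — it shows the mechanism (learn statistics from text, then sample) without any of the machinery that makes real models good at it.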

So, what are my thoughts on this:

  • I am assuming some sort of tool (maybe PyTorch?) is used for building this?
  • How would I prepare the original data set of files?
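On the data-set question, a hedged sketch of the general shape: real pipelines tokenise text with a learned vocabulary (BPE and friends), but even a word-level split shows what the training data ends up looking like — (context, next token) pairs:

```python
# Turn raw text into (context, next-token) training pairs.
# Word-level split here is a simplification; real LLMs use subword tokens.
def make_pairs(text, context_len=3):
    tokens = text.split()
    pairs = []
    for i in range(len(tokens) - context_len):
        context = tokens[i : i + context_len]   # what the model sees
        target = tokens[i + context_len]        # what it should predict
        pairs.append((context, target))
    return pairs

pairs = make_pairs("to be or not to be that is the question")
print(pairs[0])  # (['to', 'be', 'or'], 'not')
```

Preparing a real data set is mostly this idea at scale, plus a lot of cleaning, deduplication, and tokenisation up front.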

There’s the Ollama package, which lets you run a wide range of LLMs locally.
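Ollama also touches on the customisation question: models are described by a small config file called a Modelfile. A rough sketch (the specific values here are assumptions — `ollama show --modelfile <model>` prints a real one):

```
# Hypothetical Modelfile: start from an existing model and tweak it.
FROM llama3
PARAMETER temperature 0.7
SYSTEM "You are a concise assistant."
```

So at least at this level, "customising a model" can just mean layering settings and a system prompt on top of someone else's weights, rather than retraining anything.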

  • Are they singular files?
  • Are there different types or formats of LLM files?
  • Are these LLMs stored in RAM, and/or GPU memory?
  • What resources (time, energy) does it take to train LLMs?
  • Who is training LLMs at the moment?
  • What does customising a model mean?

Moondream 2

On my travels today I discovered Moondream 2 … it was on the list of Llama models that I was reading through. Will look into this another time. It’s a micro LLM for vision.
