Search
Results
Optimum
[https://huggingface.co/docs/optimum/index] - - public:mzimmerm
Optimum is an extension of Transformers that provides a set of performance optimization tools to train and run models on targeted hardware with maximum efficiency. It is also a repository of small, mini, and tiny models.
(1) Most cost effective GPU for local LLMs? : LocalLLaMA
[https://www.reddit.com/r/LocalLLaMA/comments/12vxxze/most_cost_effective_gpu_for_local_llms/] - - public:mzimmerm
GGML quantized models. They would let you leverage the CPU and system RAM instead of relying on a GPU's VRAM. This could save you a fortune, especially if you go for a used AMD Epyc platform. It could be more viable for the larger models, especially the 30B/65B-parameter models, which would still press or exceed the VRAM on the P40.
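A rough back-of-the-envelope sketch of why the 30B/65B models press or exceed the P40's memory. It counts weights only and ignores the KV cache and runtime overhead, so real usage is higher. The function names are hypothetical, and the 4.5 bits-per-weight figure is an assumption standing in for a typical 4-bit GGML format; the P40's 24 GB is its published VRAM size.

```python
def weight_memory_gb(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes)."""
    total_bytes = n_params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


def fits_in(memory_gb: float, n_params_billion: float, bits_per_weight: float) -> bool:
    """True if the weights alone fit in the given amount of memory."""
    return weight_memory_gb(n_params_billion, bits_per_weight) <= memory_gb


P40_VRAM_GB = 24        # Tesla P40 memory size
ASSUMED_Q4_BITS = 4.5   # assumption: ~4-bit quantization plus per-block scales

if __name__ == "__main__":
    for size in (30, 65):
        for bits in (16, ASSUMED_Q4_BITS):
            gb = weight_memory_gb(size, bits)
            verdict = "fits" if fits_in(P40_VRAM_GB, size, bits) else "does not fit"
            print(f"{size}B @ {bits} bits/weight: ~{gb:.1f} GB, {verdict} in {P40_VRAM_GB} GB")
```

Under these assumptions a quantized 30B model leaves only a few GB of headroom on a P40, and a quantized 65B model does not fit at all, which is where system RAM on a CPU platform becomes attractive.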
Optimizing LLMs for Speed and Memory
7 steps to master large language models (LLMs) | Data Science Dojo
Up to date List of LLM Models
[https://docs.google.com/spreadsheets/d/1kT4or6b0Fedd-W_jMwYpb63e1ZR3aePczz3zlbJW-Y4/edit#gid=741531996] - - public:mzimmerm
Replit — How to train your own Large Language Models
[https://blog.replit.com/llm-training] - - public:mzimmerm
High-level only; talks about training for a language.
How to train a new language model from scratch using Transformers and Tokenizers
[https://huggingface.co/blog/how-to-train] - - public:mzimmerm
Describes how to train a new language (Esperanto) model.