yabs.io

Yet Another Bookmarks Service

Viewing mzimmerm's Bookmarks

ai delete ,

[https://huggingface.co/docs/optimum/index] - - public:mzimmerm
ai, doc, huggingface, llm, model, optimum, repo, small, transformer - 9 | id:1489894 -

Optimum is an extension of Transformers that provides a set of performance optimization tools to train and run models on targeted hardware with maximum efficiency. It is also the repository of small, mini, tiny models.

[https://medium.com/@rafaelmanzanom/ditching-cuda-for-amd-rocm-for-more-accessible-llm-inference-ryzen-apus-edition-92c3649f8f7d] - - public:mzimmerm
ai, amd, apu, compile, gfx902, install, pytorch, rocm - 8 | id:1489810 -

Train LLM on AMD APU. In this scenario, we’ll use an APU because most laptops with a Ryzen CPU include an iGPU; specifically, this post should work with iGPUs based on the “GCN 5.0” architecture, or “Vega” for friends. We’ll use an AMD Ryzen 2200G in this post, an entry-level processor equipped with 4C/4T and an integrated GPU.

[https://www.reddit.com/r/LocalLLaMA/comments/12vxxze/most_cost_effective_gpu_for_local_llms/] - - public:mzimmerm
ai, doc, llm, model, optimize, perform - 6 | id:1489804 -

GGML quantized models. They would let you leverage CPU and system RAM, instead of having to rely on a GPU’s. This could save you a fortune, especially if go for some used AMD Epyc platforms. This could be more viable for the larger models, especially the 30B/65B parameters models which would still press or exceed the VRAM on the P40.

With marked bookmarks
| (+) | |

Viewing 1 - 50, 50 links out of 135 links, page: 1

Follow Tags

Manage

Export:

JSONXMLRSS