Search

Results

LLaMA 7B GPU Memory Requirement - Transformers - Hugging Face Forums

[https://discuss.huggingface.co/t/llama-7b-gpu-memory-requirement/34323/6] - 2024-03-04 10:10:38 - public:mzimmerm

ai, code, generate, llama, llm, model, newspeak, train - 8 | id:1489782 -

With the bitsandbytes optimizers (such as 8-bit AdamW), you would need 2 bytes per parameter for the optimizer state, or 14 GB of GPU memory for a 7B-parameter model.
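
The arithmetic behind the 14 GB figure can be sketched as below. This is a rough estimate only. It assumes a nominal 7 billion parameters, 1 GB = 10^9 bytes, and that the 2 bytes per parameter cover optimizer state alone (not weights, gradients, or activations). The function name `optimizer_state_gb` is hypothetical and not part of bitsandbytes.

```python
def optimizer_state_gb(num_params: int, bytes_per_param: float) -> float:
    """Estimate optimizer-state memory in GB (assumes 1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Nominal parameter count for a "7B" model (assumed for illustration).
LLAMA_7B_PARAMS = 7_000_000_000

# 2 bytes per parameter is the figure quoted in the excerpt for 8-bit AdamW.
print(optimizer_state_gb(LLAMA_7B_PARAMS, 2))  # 14.0
```

Passing a different `bytes_per_param` gives the corresponding estimate for another optimizer.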

Large Language Models for Domain-Specific Language Generation: How to Train Your Dragon | by Andreas Mülder | Medium

[https://medium.com/@andreasmuelder/large-language-models-for-domain-specific-language-generation-how-to-train-your-dragon-0b5360e8ed76] - 2024-03-04 09:45:59 - public:mzimmerm

ai, article, code, doc, generate, llm, train - 7 | id:1489780 -

training a model like Llama with 2.7 billion parameters outperformed a larger model like Vicuna with 13 billion parameters. Especially when considering resource consumption, using a 7B foundation model instead of a full-blown ChatGPT might be a good alternative. The best price-to-performance base model for our use case turned out to be Mistral 7B. The model is compact enough to fit into an affordable GPU with 24 GB of VRAM and outperforms the other models with 7B parameters.
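
As a rough sanity check on the 24 GB claim, the sketch below estimates the memory taken by the weights of a 7B-parameter model at several precisions. It assumes a nominal 7 billion parameters, 1 GB = 10^9 bytes, and counts weights only (no activations, gradients, optimizer state, or KV cache). The helper names and the precision table are illustrative, not taken from the article.

```python
# Bytes needed to store one parameter at each precision (weights only).
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weights_gb(num_params: int, dtype: str) -> float:
    """Estimate weight memory in GB (assumes 1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def fits_in_vram(num_params: int, dtype: str, vram_gb: float = 24.0) -> bool:
    """True if the weights alone fit in the given VRAM budget."""
    return weights_gb(num_params, dtype) <= vram_gb


NUM_PARAMS = 7_000_000_000  # nominal "7B" count, assumed for illustration

for dtype in BYTES_PER_PARAM:
    size = weights_gb(NUM_PARAMS, dtype)
    print(f"{dtype}: {size:.1f} GB, fits in 24 GB: {fits_in_vram(NUM_PARAMS, dtype)}")
```

Under these assumptions the fp16 weights take about 14 GB and fit, while fp32 weights (about 28 GB) do not. Real training or inference needs additional headroom beyond the weights.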

Fine-tune a pretrained model

[https://huggingface.co/docs/transformers/training] - 2024-03-02 10:39:40 - public:mzimmerm

ai, bert, code, example, good, huggingface, llm, notebook, progress, train, train-bert-on-yelp, tutorial - 12 | id:1489730 -

yabs.io

Yet Another Bookmarks Service
