Are there any tiny (1-3b) models finetuned for coding available in GGUF format? : LocalLLaMA
[https://www.reddit.com/r/LocalLLaMA/comments/16csdq6/are_there_any_tiny_13b_models_finetuned_for/] - - public:mzimmerm
bigcode (BigCode)
[https://huggingface.co/bigcode] - - public:mzimmerm
Research community developing various code models, small and big. Models may not be instruct-tuned.
WizardLM (WizardLM)
deepseek-ai/deepseek-coder-6.7b-instruct · Hugging Face
[https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct] - - public:mzimmerm
Another possible model. For coding capabilities, Deepseek Coder achieves state-of-the-art performance among open-source code models on multiple programming languages and various benchmarks.
LLaMA 7B GPU Memory Requirement - Transformers - Hugging Face Forums
[https://discuss.huggingface.co/t/llama-7b-gpu-memory-requirement/34323/6] - - public:mzimmerm
With the optimizers of bitsandbytes (like 8 bit AdamW), you would need 2 bytes per parameter, or 14 GB of GPU memory.
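The bytes-per-parameter figure above can be turned into a quick back-of-the-envelope calculation. A minimal sketch, assuming the common rule of thumb of fp16 weights (2 bytes) plus fp16 gradients (2 bytes) plus 8-bit AdamW state (~2 bytes) per parameter; these per-component byte counts are rough assumptions, not measurements, and activations and framework overhead are ignored:

```python
# Rough GPU memory estimate for fine-tuning, in GB (1 GB = 1e9 bytes).
# Byte counts per parameter are rule-of-thumb assumptions:
#   weights in fp16/bf16  -> 2 bytes
#   gradients in fp16     -> 2 bytes
#   8-bit AdamW state     -> ~2 bytes (per the forum post above)
def finetune_memory_gb(n_params_billion: float,
                       weight_bytes: float = 2.0,
                       grad_bytes: float = 2.0,
                       optim_bytes: float = 2.0) -> float:
    """Estimate GPU memory in GB, ignoring activations and overhead."""
    per_param = weight_bytes + grad_bytes + optim_bytes
    return n_params_billion * per_param  # 1e9 params * N bytes ~= N GB

# LLaMA 7B: 8-bit AdamW state alone is ~7 * 2 = 14 GB,
# matching the number quoted in the forum thread.
optimizer_state_gb = 7 * 2.0
total_gb = finetune_memory_gb(7)
print(optimizer_state_gb, total_gb)
```

For inference-only loading the gradient and optimizer terms drop out, which is why a 4-bit quantized 7B model fits in roughly 3.5 GB.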
stabilityai/stable-code-3b · Hugging Face
[https://huggingface.co/stabilityai/stable-code-3b] - - public:mzimmerm
Another potential model to use for Newspeak, but it is NOT open source. Advantage: small parameter count, so it should be usable on small GPUs.
Welcome Mixtral - a SOTA Mixture of Experts on Hugging Face
[https://huggingface.co/blog/mixtral] - - public:mzimmerm
The Mixtral model is new and seems promising. Click on "Demo" to test it.
StarCoder: A State-of-the-Art LLM for Code
[https://huggingface.co/blog/starcoder] - - public:mzimmerm
Article has comparison with other code-LLM models
codellama (Code Llama) - Hugging Face model for generating programs. Maybe it can be used for Newspeak?
BigCode - Playground - a Hugging Face Space by bigcode
[https://huggingface.co/spaces/bigcode/bigcode-playground] - - public:mzimmerm
Look for models that could be used in Newspeak