Are there any tiny (1-3b) models finetuned for coding available in GGUF format? : LocalLLaMA
[https://www.reddit.com/r/LocalLLaMA/comments/16csdq6/are_there_any_tiny_13b_models_finetuned_for/] - - public:mzimmerm
bigcode (BigCode)
[https://huggingface.co/bigcode] - - public:mzimmerm
Research community developing various code models, small and big. Models may not be instruct-tuned.
WizardLM (WizardLM)
deepseek-ai/deepseek-coder-6.7b-instruct · Hugging Face
[https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct] - - public:mzimmerm
Another possible model. For coding capabilities, Deepseek Coder achieves state-of-the-art performance among open-source code models on multiple programming languages and various benchmarks.
LLaMA 7B GPU Memory Requirement - Transformers - Hugging Face Forums
[https://discuss.huggingface.co/t/llama-7b-gpu-memory-requirement/34323/6] - - public:mzimmerm
With the optimizers of bitsandbytes (like 8 bit AdamW), you would need 2 bytes per parameter, or 14 GB of GPU memory.
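The bytes-per-parameter figure above can be turned into a quick back-of-the-envelope calculation. A minimal sketch, assuming the common rule of thumb of fp16 weights (2 bytes) plus fp16 gradients (2 bytes) plus 8-bit AdamW state (~2 bytes) per parameter; these per-component byte counts are rough assumptions, not measurements, and activations and framework overhead are ignored:

```python
# Rough GPU memory estimate for fine-tuning, in GB (1 GB = 1e9 bytes).
# Byte counts per parameter are rule-of-thumb assumptions:
#   weights in fp16/bf16  -> 2 bytes
#   gradients in fp16     -> 2 bytes
#   8-bit AdamW state     -> ~2 bytes (per the forum post above)
def finetune_memory_gb(n_params_billion: float,
                       weight_bytes: float = 2.0,
                       grad_bytes: float = 2.0,
                       optim_bytes: float = 2.0) -> float:
    """Estimate GPU memory in GB, ignoring activations and overhead."""
    per_param = weight_bytes + grad_bytes + optim_bytes
    return n_params_billion * per_param  # 1e9 params * N bytes ~= N GB

# LLaMA 7B: 8-bit AdamW state alone is ~7 * 2 = 14 GB,
# matching the number quoted in the forum thread.
optimizer_state_gb = 7 * 2.0
total_gb = finetune_memory_gb(7)
print(optimizer_state_gb, total_gb)
```

For inference-only loading the gradient and optimizer terms drop out, which is why a 4-bit quantized 7B model fits in roughly 3.5 GB.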
stabilityai/stable-code-3b · Hugging Face
[https://huggingface.co/stabilityai/stable-code-3b] - - public:mzimmerm
Another potential model to use for Newspeak, but it is NOT open source. Advantage: small parameter count, so it should be usable on small GPUs.
Welcome Mixtral - a SOTA Mixture of Experts on Hugging Face
[https://huggingface.co/blog/mixtral] - - public:mzimmerm
The Mixtral model is new and seems promising. Click on "Demo" to test it.
StarCoder: A State-of-the-Art LLM for Code
[https://huggingface.co/blog/starcoder] - - public:mzimmerm
Article has comparison with other code-LLM models
codellama (Code Llama) - Hugging Face model for generating programs. Maybe it can be used for Newspeak?
BigCode - Playground - a Hugging Face Space by bigcode
[https://huggingface.co/spaces/bigcode/bigcode-playground] - - public:mzimmerm
Look for models that could be used in Newspeak