Kaggle: Your Home for Data Science
Kaggle is like Hugging Face. It can run notebooks and give them GPU power.
Mini course on the statistical foundations of ML
My account on Stability AI - it is just a link to Hugging Face
Comparison of the efficiency of all LLM models on Hugging Face
Various methods to run LLM models locally. Hugging Face is only one of them.
AMD seems to sell these accelerators, which are like video cards.
Train LLM on AMD APU. In this scenario, we’ll use an APU because most laptops with a Ryzen CPU include an iGPU; specifically, this post should work with iGPUs based on the “GCN 5.0” architecture, or “Vega” for friends. We’ll use an AMD Ryzen 2200G in this post, an entry-level processor equipped with 4C/4T and an integrated GPU.
UMA buffer size is the amount of system memory reserved for the APU's integrated GPU. It is set in the motherboard BIOS and is often limited to 2 GB, but an LLM could use 16 GB or more.
My account for the Asus PRIME X570-P motherboard, registered here.
GGML quantized models. They would let you leverage CPU and system RAM, instead of having to rely on a GPU’s VRAM. This could save you a fortune, especially if you go for some used AMD Epyc platforms. This could be more viable for the larger models, especially the 30B/65B-parameter models, which would still press or exceed the VRAM on the P40.
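A rough sanity check of those sizes: the weights of a quantized model take about (parameters × bits per weight / 8) bytes. The sketch below assumes a flat 4 bits per weight, which is my assumption and not something the source states. Real GGML formats add some per-block overhead, and context buffers come on top at run time. The function name is hypothetical, and 24 GB is the VRAM of the Tesla P40.

```python
P40_VRAM_GB = 24  # the Tesla P40 ships with 24 GB of VRAM


def quantized_weights_gb(num_params: int, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


for name, params in [("30B", 30_000_000_000), ("65B", 65_000_000_000)]:
    # Assumption: flat 4-bit quantization, no per-block overhead.
    size = quantized_weights_gb(params, 4)
    verdict = "exceeds" if size > P40_VRAM_GB else "fits within"
    print(f"{name}: ~{size:.1f} GB of weights, {verdict} {P40_VRAM_GB} GB")
```

With higher-precision quantization (more bits per weight) the 30B model also grows past 24 GB, which is where system RAM becomes the cheaper option.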
Replit is a site where I can run any REPL online. Can be used for AI.
High-level overview of how to train a model
The OSCAR project (Open Super-large Crawled Aggregated coRpus) is an Open Source project aiming to provide web-based multilingual resources and datasets for Machine Learning (ML) and Artificial Intelligence (AI) applications.
Dataset is just a zip of files
Research community developing various code models, small and big. The models may not be instruction-tuned.
They have the 1.3B version!!! This may be the best one to start with for Newspeak. It should be possible to train it even on Hugging Face.
Another possible model. For coding capabilities, Deepseek Coder achieves state-of-the-art performance among open-source code models on multiple programming languages and various benchmarks.
With the optimizers of bitsandbytes (like 8 bit AdamW), you would need 2 bytes per parameter, or 14 GB of GPU memory.
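The 14 GB figure is just bytes per parameter times parameter count, so it implies a model of about 7 billion parameters. A minimal sketch of that arithmetic, covering only the quoted 2 bytes per parameter (weights, gradients and activations would come on top); the function name is hypothetical:

```python
def memory_gb(num_params: int, bytes_per_param: float = 2.0) -> float:
    """Memory in GB (1 GB = 1e9 bytes) at a given number of bytes per parameter."""
    return num_params * bytes_per_param / 1e9


print(memory_gb(7_000_000_000))   # 7B parameters -> 14.0
print(memory_gb(1_300_000_000))   # a 1.3B model, like the small version noted above
```

The same formula shows why the 1.3B model is attractive for small GPUs: this part of the budget drops to a few GB.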
Another potential model to use for Newspeak, but it is NOT open source. Advantage: 2.5B params, so it should be usable on small GPUs.
Training a model like Llama with 2.7 billion parameters outperformed a larger model like Vicuna with 13 billion parameters. Especially when considering resource consumption, this might be a good alternative to using a 7B foundation model instead of a full-blown ChatGPT. The best price-to-performance base model for our use case turned out to be Mistral 7B. The model is compact enough to fit into an affordable GPU with 24GB VRAM and outperforms the other models with 7B parameters.
Comparison of LLM models for coding
Online UI to OpenChat. This seems really good, open source etc. It uses the Llama 2 and Mistral models, according to https://github.com/imoneoi/openchat
Open source with lots of information. Uses multiple underlying models. Not sure how I would train for it.
The Mixtral model is new, and seems to be good. Click on “Demo” to test it.
The article has a comparison with other code LLMs
Review of LLMs specialized for code generation
List of LLM models on Wikipedia
Chat and models. Not open source, but instruction-tuned and relatively small (3B). The 3B instruct model may be the best to try on Newspeak.
Chat on Mistral. Does well on Python and Smalltalk