mlabonne/llm-course: Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
lyogavin/godmodeanimation: 2D Game Animation in God Mode
How Language Models Work
nomic-ai/gpt4all: GPT4All: Chat with Local LLMs on Any Device
How LLMs Work, Explained Without Math - miguelgrinberg.com
abi/secret-llama: Fully private LLM chatbot that runs entirely with a browser with no server needed. Supports Mistral and LLama 3.
Introducing DBRX: A New State-of-the-Art Open LLM | Databricks
The Best GPUs for Deep Learning in 2023 — An In-depth Analysis
Run an LLM Locally with LM Studio - KDnuggets
Document about LM Studio
Optimum
Optimum is an extension of Transformers that provides a set of performance optimization tools to train and run models on targeted hardware with maximum efficiency. It is also the repository of small, mini, tiny models.
google-research/bert: TensorFlow code and pre-trained models for BERT
BERT Transformers – How Do They Work? | Exxact Blog
Excellent document about BERT transformers/models and their parameters: - L = number of layers. - H = size of the hidden layer = dimension of the vector that represents each token in the sentence. - A = number of self-attention heads. - Total parameters.
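To make the link between L, H, A and the total parameter count concrete, here is a rough sketch of the arithmetic. It assumes the usual uncased English BERT setup (30,522-token vocabulary, 512 positions, 2 segment types, feed-forward size of 4×H, plus the pooler layer); those numbers and the function name are assumptions for illustration, not taken from the linked article.

```python
def bert_param_count(num_layers, hidden, vocab_size=30522,
                     max_positions=512, type_vocab=2):
    """Approximate parameter count of a BERT encoder (assumed standard layout)."""
    # Token, position and segment embeddings, plus one LayerNorm (scale + bias).
    embeddings = (vocab_size + max_positions + type_vocab) * hidden + 2 * hidden

    # Self-attention: query, key, value and output projections, each H x H + bias.
    attention = 4 * (hidden * hidden + hidden)
    # Feed-forward block: H -> 4H -> H, with biases (4H intermediate size assumed).
    intermediate = 4 * hidden
    feed_forward = (hidden * intermediate + intermediate) + (intermediate * hidden + hidden)
    # Two LayerNorms per encoder layer.
    layer_norms = 2 * (2 * hidden)
    per_layer = attention + feed_forward + layer_norms

    # Pooler: one dense H x H layer applied to the [CLS] vector.
    pooler = hidden * hidden + hidden

    return embeddings + num_layers * per_layer + pooler


if __name__ == "__main__":
    for name, (l, h) in {"L-4_H-256": (4, 256), "L-12_H-768": (12, 768)}.items():
        print(f"{name}: {bert_param_count(l, h) / 1e6:.1f}M parameters")
```

Note that A does not appear in the formula: the heads split the H-dimensional vectors between them, so changing the number of heads does not change the parameter count.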
google/bert_uncased_L-4_H-256_A-4 · Hugging Face
Repository of all BERT models, including the small ones. Start with this model for testing.
Generative pre-trained transformer - Wikipedia
Open LLM Leaderboard - a Hugging Face Space by HuggingFaceH4
Comparison of the benchmark performance of open LLMs on Hugging Face
6 Ways to Run LLMs Locally (also how to use HuggingFace)
Various methods to run LLMs locally; Hugging Face is only one of them.
(1) Most cost effective GPU for local LLMs? : LocalLLaMA
GGML quantized models. They would let you leverage CPU and system RAM, instead of having to rely on a GPU’s VRAM. This could save you a fortune, especially if you go for some used AMD Epyc platforms. This could be more viable for the larger models, especially the 30B/65B-parameter models, which would still press or exceed the VRAM on the P40.
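A back-of-the-envelope sketch of why the 30B/65B models "press or exceed" the P40. It assumes about 4.5 bits per weight for a 4-bit block-quantized model (4-bit values plus per-block scales) and 24 GB of VRAM on the P40; both figures are assumptions for illustration, and the estimate covers weights only, not the KV cache or activations.

```python
P40_VRAM_GB = 24.0  # assumed VRAM of a Tesla P40


def quantized_size_gb(n_params, bits_per_weight=4.5):
    """Rough size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def fits_in_vram(n_params, vram_gb=P40_VRAM_GB, bits_per_weight=4.5):
    """True if the weights alone fit; real usage needs extra headroom."""
    return quantized_size_gb(n_params, bits_per_weight) <= vram_gb


if __name__ == "__main__":
    for label, n in [("30B", 30e9), ("65B", 65e9)]:
        size = quantized_size_gb(n)
        print(f"{label}: ~{size:.1f} GB of weights, fits in P40: {fits_in_vram(n)}")
```

Under these assumptions the 30B weights take most of the card and the 65B weights do not fit at all, which is where system RAM on a CPU platform becomes attractive.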
Optimizing LLMs for Speed and Memory
7 Steps to Mastering Large Language Models (LLMs) - KDnuggets
A Step-by-Step Guide to Training Your Own Large Language Models (LLMs). | by Sanjay Singh | GoPenAI
7 steps to master large language models (LLMs) | Data Science Dojo
LLM for a new language : MachineLearning
High-level overview of how to train a model
Up-to-date list of LLMs
(2) Are there any tiny (1-3b) models finetuned for coding available in GGUF format? : LocalLLaMA
bigcode (BigCode)
Research community developing various code models, small and big. Models may not be instruction-tuned.
WizardLM (WizardLM)
deepseek-ai (DeepSeek)
They have a 1.3B version! This may be the best one to start with for Newspeak. It should be possible to train it even on Hugging Face.
deepseek-ai/deepseek-coder-6.7b-instruct · Hugging Face
Another possible model. For coding capabilities, Deepseek Coder achieves state-of-the-art performance among open-source code models on multiple programming languages and various benchmarks.
LLaMA 7B GPU Memory Requirement - Transformers - Hugging Face Forums
With the optimizers of bitsandbytes (like 8 bit AdamW), you would need 2 bytes per parameter, or 14 GB of GPU memory.
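The 14 GB figure in the quote is just parameters times bytes per parameter. A minimal sketch of that arithmetic, assuming a round 7 billion parameters and decimal gigabytes (the function name is made up for illustration):

```python
def training_memory_gb(n_params, bytes_per_param):
    """Memory estimate in GB (1 GB = 1e9 bytes) from a bytes-per-parameter budget."""
    return n_params * bytes_per_param / 1e9


if __name__ == "__main__":
    # 2 bytes per parameter is the figure quoted for bitsandbytes 8-bit AdamW.
    print(f"LLaMA 7B: ~{training_memory_gb(7_000_000_000, 2):.0f} GB")
```

The same function can be reused with other bytes-per-parameter budgets to compare optimizers or model sizes; activations and batch size are not included in this estimate.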
stabilityai/stable-code-3b · Hugging Face
Another potential model to use for Newspeak, but it is NOT open source. Advantage: 2.5B params, so it should be usable on small GPUs.
Large Language Models for Domain-Specific Language Generation: How to Train Your Dragon | by Andreas Mülder | Medium
training a model like Llama with 2.7 billion parameters outperformed a larger model like Vicuna with 13 billion parameters. Especially when considering resource consumption, this might be a good alternative to using a 7B Foundation model instead of a full-blown ChatGPT. The best price-to-performance base model for our use case turned out to be Mistral 7b. The model is compact enough to fit into an affordable GPU with 24GB VRAM and outperforms the other models with 7B parameters.
Can Ai Code Results - a Hugging Face Space by mike-ravkine
Comparison of LLMs for coding
openchat/openchat-3.5-0106 · Hugging Face
Open source with lots of information. Uses multiple underlying models. Not sure how I would train it.
Welcome Mixtral - a SOTA Mixture of Experts on Hugging Face
The Mixtral model is new and seems to be good. Click on “Demo” to test it.
StarCoder: A State-of-the-Art LLM for Code
The article includes a comparison with other code LLMs.
huybery/Awesome-Code-LLM: An awesome and curated list of best code-LLM for research.
Hannibal046/Awesome-LLM: Awesome-LLM: a curated list of Large Language Model
Large language model - Wikipedia
List of LLMs on Wikipedia
Fine-tune a pretrained model
Fine-tunes a BERT model on the Yelp dataset.
Replit — How to train your own Large Language Models
High-level only; talks about training a model for a language.
How to train a new language model from scratch using Transformers and Tokenizers
Describes how to train a model for a new language (Esperanto).