Who needs GitHub Copilot when you can roll your own AI code assistant at home • The Register
Honey, I shrunk the LLM! A beginner's guide to quantization • The Register
Perplexity
The Best GPUs for Deep Learning in 2023 — An In-depth Analysis
BERT Transformers – How Do They Work? | Exxact Blog
Excellent document about BERT transformers / models and their parameters: - L = number of layers. - H = size of the hidden layer, i.e. the dimensionality of the vector representing each token in the sentence. - A = number of self-attention heads. - Total parameters follow from these.
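The L / H / A notation above determines BERT's size. As a minimal sketch (assuming BERT-base's standard setup: a 30,522-token WordPiece vocabulary, 512 positions, 2 segment types, a 4H feed-forward width, and a pooler layer), the total parameter count can be estimated like this:

```python
def bert_param_count(L=12, H=768, vocab=30522, max_pos=512, segments=2):
    """Rough parameter count for a BERT-style encoder.

    Assumes the standard architecture. Note: the head count A does not
    change the parameter count -- it only splits the H-dimensional
    attention into A slices.
    """
    embeddings = (vocab + max_pos + segments) * H + 2 * H  # + LayerNorm
    per_layer = (
        4 * (H * H + H)        # Q, K, V and attention-output projections
        + 2 * H                # attention LayerNorm
        + (H * 4 * H + 4 * H)  # feed-forward up-projection (H -> 4H)
        + (4 * H * H + H)      # feed-forward down-projection (4H -> H)
        + 2 * H                # feed-forward LayerNorm
    )
    pooler = H * H + H
    return embeddings + L * per_layer + pooler

print(bert_param_count())            # BERT-base (L=12, H=768): ~110M
print(bert_param_count(L=24, H=1024))  # BERT-large: ~335M
```

This reproduces the famous "110M parameters" figure for BERT-base; the model scales roughly as L * 12 * H^2.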
6 Ways to Run LLMs Locally (also how to use HuggingFace)
Various methods to run LLM models locally; Hugging Face is only one of them.
deepseek-ai (DeepSeek)
They have the 1.3B version!!! This may be the best model to start with for Newspeak. Should be trainable even on Hugging Face.
deepseek-ai/deepseek-coder-6.7b-instruct · Hugging Face
Another possible model. For coding capabilities, Deepseek Coder achieves state-of-the-art performance among open-source code models on multiple programming languages and various benchmarks.
StarCoder: A State-of-the-Art LLM for Code
Article has a comparison with other code LLMs.
stabilityai (Stability AI) - Stable Diffusion running on Huggingface
Chat and instruct models. Not open source, but instruct-tuned and relatively small (3B). The 3B instruct model may be the best to try on Newspeak.
AI Code Tools: The Ultimate Guide in 2024
AI code tools: good summary. Does not say which pre-trained models they use; one is Gemini (Bard) -> AlphaCode 2.
Introduction - Hugging Face NLP Course
Natural Language Processing - full course.
BERT 101 - State Of The Art NLP Model Explained
Best summary of Natural Language Processing and its terms: model (a language model, e.g. BertModel, defines the encoder and decoder and their properties); transformer (a specific neural-network architecture based on the attention paper); encoder (a stack of transformer layers applied to the input); decoder (a stack of transformer layers producing the output). BERT does NOT use a decoder. TensorFlow and PyTorch are possible backends for Transformers (the NN library). Summary: BERT is a highly complex and advanced language model that helps automate language understanding.
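To make the attention idea concrete, here is a minimal pure-Python sketch of single-head scaled dot-product attention, the building block inside each transformer layer. The function names and toy sizes are my own illustration, not from the article:

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]  # subtract max for numerical stability
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention for one head.

    Q, K, V: lists of d-dimensional vectors, one per token.
    Each output vector is a weighted average of the rows of V,
    with weights softmax(q . k / sqrt(d)).
    """
    d = len(Q[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

# Toy example: two tokens with 2-dimensional one-hot embeddings.
Q = K = V = [[1.0, 0.0], [0.0, 1.0]]
print(attention(Q, K, V))
```

An encoder stacks L such layers (each with A heads plus a feed-forward block); multi-head attention just runs A copies of this on H/A-dimensional slices and concatenates the results.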
BERT vs GPT: A Tale of Two Transformers That Revolutionized NLP | by Tavva Prudhvith | Medium
Fine-tune a pretrained model
Uses the BERT model, fine-tuned on a Yelp dataset.
How to train a new language model from scratch using Transformers and Tokenizers
Describes how to train a new language model (for Esperanto) from scratch.
Generative AI in a Nutshell - how to survive and thrive in the age of AI - YouTube
rabbit — keynote
Poe
BigCode - Playground - a Hugging Face Space by bigcode
Look for models that could be used in Newspeak