The OSCAR project (Open Super-large Crawled Aggregated coRpus) is an Open Source project aiming to provide web-based multilingual resources and datasets for Machine Learning (ML) and Artificial Intelligence (AI) applications.

Newspeak-test-dataset

[https://www.kaggle.com/datasets/mzimmerm/newspeak-test-dataset?select=NewspeakGrammar.ns] - 2024-03-04 11:54:57 - public:mzimmerm

ai, dataset, kaggle, newspeak - 4 | id:1489791 -

Dataset is just a zip of files

Introduction to Constructing Your Dataset | Machine Learning | Google for Developers

[https://developers.google.com/machine-learning/data-prep/construct/construct-intro] - 2024-03-04 11:38:48 - public:mzimmerm

ai, dataset, train - 3 | id:1489790 -

(2) Are there any tiny (1-3b) models finetuned for coding available in GGUF format? : LocalLLaMA

[https://www.reddit.com/r/LocalLLaMA/comments/16csdq6/are_there_any_tiny_13b_models_finetuned_for/] - 2024-03-04 10:56:19 - public:mzimmerm

ai, code, generate, llm, model, newspeak, small - 7 | id:1489789 -

bigcode (BigCode)

[https://huggingface.co/bigcode] - 2024-03-04 10:50:02 - public:mzimmerm

ai, code, generate, huggingface, llm, model, newspeak, santacoder, small, starcoder - 10 | id:1489788 -

Research community developing various code models, small and big. Models may not be instruct

WizardLM (WizardLM)

[https://huggingface.co/WizardLM] - 2024-03-04 10:42:44 - public:mzimmerm

ai, code, generate, huggingface, llm, model, newspeak, small, wizardcoder - 9 | id:1489787 -

Another open source small (1B) model.

deepseek-ai (DeepSeek)

[https://huggingface.co/deepseek-ai] - 2024-03-04 10:24:32 - public:mzimmerm

ai, best, code, deepseek, good, huggingface, instruct, llm, model, newspeak, small - 11 | id:1489786 -

They have the 1.3B version!!! This may be the best to start with Newspeak. Should work train even on huggingcface

DeepSeek

[https://chat.deepseek.com/] - 2024-03-04 10:17:09 - public:mzimmerm

ai, chat, code, deepseek, generate, text - 6 | id:1489785 -

Chat uses model geared for coding.

deepseek-ai/deepseek-coder-6.7b-instruct · Hugging Face

[https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct] - 2024-03-04 10:13:20 - public:mzimmerm

ai, code, generate, good, llm, model, newspeak, opensource - 8 | id:1489783 -

Another possible model. For coding capabilities, Deepseek Coder achieves state-of-the-art performance among open-source code models on multiple programming languages and various benchmarks.

LLaMA 7B GPU Memory Requirement - Transformers - Hugging Face Forums

[https://discuss.huggingface.co/t/llama-7b-gpu-memory-requirement/34323/6] - 2024-03-04 10:10:38 - public:mzimmerm

ai, code, generate, llama, llm, model, newspeak, train - 8 | id:1489782 -

With the optimizers of bitsandbytes (like 8 bit AdamW), you would need 2 bytes per parameter, or 14 GB of GPU memory.

stabilityai/stable-code-3b · Hugging Face

[https://huggingface.co/stabilityai/stable-code-3b] - 2024-03-04 10:05:36 - public:mzimmerm

ai, code, generate, llm, model, newspeak - 6 | id:1489781 -

Another potential model to use for Newspeak, but it is NOT open source. Adventage: 2.5B params, so should be usable in small GPUs

Large Language Models for Domain-Specific Language Generation: How to Train Your Dragon | by Andreas Mülder | Medium

[https://medium.com/@andreasmuelder/large-language-models-for-domain-specific-language-generation-how-to-train-your-dragon-0b5360e8ed76] - 2024-03-04 09:45:59 - public:mzimmerm

ai, article, code, doc, generate, llm, train - 7 | id:1489780 -

training a model like Llama with 2.7 billion parameters outperformed a larger model like Vicuna with 13 billion parameters. Especially when considering resource consumption, this might be a good alternative to using a 7B Foundation model instead of a full-blown ChatGPT. The best price-to-performance base model for our use case turned out to be Mistral 7b. The model is compact enough to fit into an affordable GPU with 24GB VRAM and outperforms the other models with 7B parameters.

Can Ai Code Results - a Hugging Face Space by mike-ravkine

[https://huggingface.co/spaces/mike-ravkine/can-ai-code-results] - 2024-03-04 09:38:45 - public:mzimmerm

ai, code, generate, huggingface, llm, model, summary - 7 | id:1489779 -

Comparison of LLM models for coding

(26) Discord

[https://discord.com/channels/1130134702557249637/@home] - 2024-03-04 09:16:19 - public:mzimmerm

ai, discord, openchat - 3 | id:1489777 -

Openchat Chatbot UI

[https://openchat.team/] - 2024-03-04 08:43:11 - public:mzimmerm

ai, chat, openchat, text - 4 | id:1489776 -

Online UI to Openchat. This seems really good, open source etc. It uses the LLama2 and Mistral models, according to https://github.com/imoneoi/openchat

openchat/openchat-3.5-0106 · Hugging Face

[https://huggingface.co/openchat/openchat-3.5-0106] - 2024-03-04 08:41:50 - public:mzimmerm

ai, code, generate, huggingface, llm, model, openchat - 7 | id:1489775 -

Open source with lots of information. Uses Multiple undrelying models. Not sure how I would train for it

Welcome Mixtral - a SOTA Mixture of Experts on Hugging Face

[https://huggingface.co/blog/mixtral] - 2024-03-04 08:24:33 - public:mzimmerm

ai, code, generate, huggingface, llm, mixtral, model, newspeak - 8 | id:1489774 -

The Mixtral model is new, and seems to be good. Click on “Demo“ to test it

StarCoder: A State-of-the-Art LLM for Code

[https://huggingface.co/blog/starcoder] - 2024-03-04 07:43:17 - public:mzimmerm

ai, code, generate, good, huggingface, llm, model, newspeak - 8 | id:1489773 -

Article has comparison with other code-LLM models

huybery/Awesome-Code-LLM: An awesome and curated list of best code-LLM for research.

[https://github.com/huybery/Awesome-Code-LLM] - 2024-03-04 07:33:15 - public:mzimmerm

ai, code, generate, list, llm, model - 6 | id:1489772 -

Hannibal046/Awesome-LLM: Awesome-LLM: a curated list of Large Language Model

[https://github.com/Hannibal046/Awesome-LLM] - 2024-03-04 07:31:48 - public:mzimmerm

ai, list, llm, model - 4 | id:1489771 -

Includes code generation models

Large language models and the rise of the AI code generators | InfoWorld

[https://www.infoworld.com/article/3696970/llms-and-the-rise-of-the-ai-code-generators.html] - 2024-03-04 07:14:23 - public:mzimmerm

ai, code, generate, language, model, program, review - 7 | id:1489770 -

Review of LLM specialized for code generation

Large language model - Wikipedia

[https://en.wikipedia.org/wiki/Large_language_model#List] - 2024-03-04 07:08:48 - public:mzimmerm

ai, license, list, llm, model - 5 | id:1489769 -

List of LLM models on Wikipedia

List of datasets for machine-learning research - Wikipedia

[https://en.wikipedia.org/wiki/List_of_datasets_for_machine-learning_research] - 2024-03-04 07:01:06 - public:mzimmerm

ai, dataset - 2 | id:1489768 -

stabilityai (Stability AI) - Stable Diffusion running on Huggingface

[https://huggingface.co/stabilityai] - 2024-03-04 06:24:17 - public:mzimmerm

ai, chat, good, home, huggingface, image, instruct, model, newspeak, small, stabilityai, stablecode - 12 | id:1489767 -

Chat, models. Not open source, but instruct and relatively small (3B). The 3B instruct may be the best to try on Newspeak.

ChatGPT

[https://chat.openai.com/] - 2024-03-04 05:39:20 - public:mzimmerm

ai, chat, image, text - 4 | id:1489764 -

Main page of ChatGPT

Le Chat Mistral

[https://chat.mistral.ai/chat] - 2024-03-04 05:03:54 - public:mzimmerm

ai, chat, image, mistral, text - 5 | id:1489762 -

Chat on Mistral. Does well on Python and Smalltalk

Mistral AI - Wikipedia

[https://en.wikipedia.org/wiki/Mistral_AI] - 2024-03-04 04:59:05 - public:mzimmerm

ai, company - 2 | id:1489761 -

Their models are open source.

fast.ai – fast.ai—Making neural nets uncool again

[https://www.fast.ai/] - 2024-03-04 04:42:25 - public:mzimmerm

ai, company - 2 | id:1489760 -

OpenAI Codex - Wikipedia

[https://en.wikipedia.org/wiki/OpenAI_Codex] - 2024-03-04 04:38:12 - public:mzimmerm

ai, code, codex, generate, language, model, program - 7 | id:1489759 -

Model which generates code for Python, Javascript, Go, Shell, Perl, Swifg, Ruby, PHP

Nečekaný úspěch baterie, která by mohla být stejně dobrá jako benzín - Seznam Zprávy

[https://www.seznamzpravy.cz/clanek/tech-necekany-uspech-baterie-ktera-by-mohla-byt-stejne-dobra-jako-benzin-246918] - 2024-03-03 22:10:42 - public:mzimmerm

battery, car - 2 | id:1489758 -

HuggingChat

[https://huggingface.co/chat/] - 2024-03-03 08:54:48 - public:mzimmerm

ai, chat, huggingface, text - 4 | id:1489751 -

Chat with Huggingface's chat engine

codellama (Code Llama) - Huggingface model for generating programs. Maybe can be used for Newspeak?

[https://huggingface.co/codellama] - 2024-03-03 08:48:06 - public:mzimmerm

ai, code, generate, huggingface, language, llama, model, newspeak, program - 9 | id:1489750 -

Gemini

[https://gemini.google.com/app] - 2024-03-03 08:42:18 - public:mzimmerm

ai, chat, gemini, google, image, language, text - 7 | id:1489749 -

Gemini chat from Google. Can generate Python and other other code.

Introducing Gemini: Google’s most capable AI model yet

[https://blog.google/technology/ai/google-gemini-ai/#capabilities] - 2024-03-03 08:29:52 - public:mzimmerm

ai, alphacode, code, gemini, google, program, write - 7 | id:1489746 -

Advanced coding Our first version of Gemini can understand, explain and generate high-quality code in the world’s most popular programming languages, like Python, Java, C++, and Go. Using a specialized version of Gemini, we created a more advanced code generation system, AlphaCode 2,

AI Code Tools: The Ultimate Guide in 2024

[https://codesubmit.io/blog/ai-code-tools/] - 2024-03-03 08:19:57 - public:mzimmerm

ai, code, generate, good, model, tool - 6 | id:1489745 -

AI Code tools : Good summary. Does not talk about which pre-trained model they use. One is gemini (bard) -> alphacode2

Getting Started w/BERT.ipynb - Colaboratory

[https://colab.research.google.com/drive/1YtTqwkwaqV2n56NC8xerflt95Cjyd4NE?usp=sharing] - 2024-03-03 07:30:53 - public:mzimmerm

ai, bert, course, jupiter, notebook - 5 | id:1489743 -

Jupyter notebook to test Bert

Introduction - Hugging Face NLP Course

[https://huggingface.co/learn/nlp-course/chapter1/1] - 2024-03-03 07:10:04 - public:mzimmerm

ai, course, good, nlp, todo, tutorial - 6 | id:1489742 -

Natural Languge processing - full course.

BERT 101 - State Of The Art NLP Model Explained

[https://huggingface.co/blog/bert-101] - 2024-03-03 06:50:18 - public:mzimmerm

ai, bert, best, good, model, progress, summary, transform - 8 | id:1489741 -

Best summary of Natural Language Processing and terms - model (a language model - e.g. BertModel, defines encoder and decoder and their properties), transformer (a specific neural network based on attention paper), encoder (series of transformers on input), decoders (series of transformers on output). Bert does NOT use decoder. TensorFlow and PyTorch are possible backends to Transformers (NN). Summary: BERT is a highly complex and advanced language model that helps people automate language understanding.

BERT vs GPT: A Tale of Two Transformers That Revolutionized NLP | by Tavva Prudhvith | Medium

[https://medium.com/@prudhvithtavva/bert-vs-gpt-a-tale-of-two-transformers-that-revolutionized-nlp-11fff8e61984] - 2024-03-03 06:41:37 - public:mzimmerm

ai, bert, good, gpt, model, transform - 6 | id:1489740 -

google-research/bert: TensorFlow code and pre-trained models for BERT

[https://github.com/google-research/bert] - 2024-03-03 06:35:49 - public:mzimmerm

ai, bert, train - 3 | id:1489739 -

Methods and tools for efficient training on a single GPU

[https://huggingface.co/docs/transformers/en/perf_train_gpu_one] - 2024-03-03 05:06:47 - public:mzimmerm

ai, card, memory, video - 4 | id:1489738 -

Simple Machine Learning Model in Python in 5 lines of code | by Raman Sah | Towards Data Science

[https://towardsdatascience.com/simple-machine-learning-model-in-python-in-5-lines-of-code-fe03d72e78c6] - 2024-03-02 11:26:11 - public:mzimmerm

ai, example, simple, todo, train - 5 | id:1489732 -

Yelp Review Classification. Using Embedding, CNN and LSTM | by Zhiwei Zhang | Medium

[https://medium.com/@zhiwei_zhang/yelp-review-classification-b2816d990429] - 2024-03-02 10:55:13 - public:mzimmerm

ai, doc, train, yelp - 4 | id:1489731 -

Simpliest start with ai. Use the Github code linked in

Fine-tune a pretrained model

[https://huggingface.co/docs/transformers/training] - 2024-03-02 10:39:40 - public:mzimmerm

ai, bert, code, example, good, huggingface, llm, notebook, progress, train, train-bert-on-yelp, tutorial - 12 | id:1489730 -

Use the Bert model to train on Yelp dataset

BigCode - Open and responsible development of LLMs for code

[https://www.bigcode-project.org/] - 2024-03-02 10:21:57 - public:mzimmerm

account, ai, computer, language, model, train - 6 | id:1489729 -

BigCode is an open scientific collaboration working on the responsible development and use of large language models for code

Replit — How to train your own Large Language Models

[https://blog.replit.com/llm-training] - 2024-03-02 10:18:28 - public:mzimmerm

ai, doc, language, llm, model, train - 6 | id:1489728 -

Hi level only talk about training for a language

Viewing mzimmerm's Bookmarks

7 steps to master large language models (LLMs) | Data Science Dojo

Home - Replit

LLM for a new language : MachineLearning

Up to date List of LLM Models

OSCAR dataset