Search Results
Open LLM Leaderboard - a Hugging Face Space by HuggingFaceH4
Comparison of the benchmark performance of LLM models on Hugging Face.
6 Ways to Run LLMs Locally (also how to use HuggingFace)
Various methods to run LLM models locally; Hugging Face is only one of them.
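A minimal local-inference sketch using the Hugging Face transformers pipeline, assuming a small causal-LM checkpoint from the Hub (the model name below is just an example, not a recommendation from the article):

```python
# Minimal local inference with the Hugging Face transformers library.
# The model name is only an example; any causal-LM checkpoint from the Hub
# works, subject to local RAM/VRAM limits.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="deepseek-ai/deepseek-coder-1.3b-base",  # example small code model
    device_map="auto",  # place layers on a GPU if one is available
)

print(generator("def fibonacci(n):", max_new_tokens=64)[0]["generated_text"])
```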
Training Bert on Yelp - Copy of training.ipynb - Colaboratory
Interesting cheap GPU option: Instinct Mi50 : LocalLLaMA
AMD sells these accelerators, which are compute-oriented GPUs similar to video cards.
Ditching CUDA for AMD ROCm for more accessible LLM training and inference. | by Rafael Manzano Masapanta | Medium
Train LLM on AMD APU. In this scenario, we’ll use an APU because most laptops with a Ryzen CPU include an iGPU; specifically, this post should work with iGPUs based on the “GCN 5.0” architecture, or “Vega” for friends. We’ll use an AMD Ryzen 2200G in this post, an entry-level processor equipped with 4C/4T and an integrated GPU.
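A quick sanity check that a ROCm build of PyTorch actually sees the AMD GPU/iGPU. The HSA_OVERRIDE_GFX_VERSION hint is a commonly used workaround for unsupported Vega iGPUs, not something taken from the article; verify the exact value there.

```python
# Check that a ROCm build of PyTorch can see the AMD GPU/iGPU.
# On ROCm builds, the torch.cuda API is backed by HIP, so the same calls work.
# For Vega iGPUs you may need to export HSA_OVERRIDE_GFX_VERSION (e.g. 9.0.0)
# before launching Python -- this is an assumption to check against the article.
import torch

print("torch version:", torch.__version__)
print("GPU visible:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
    x = torch.randn(1024, 1024, device="cuda")
    print("matmul ok:", (x @ x).shape)
```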
Most cost effective GPU for local LLMs? : LocalLLaMA
GGML quantized models. They would let you leverage CPU and system RAM, instead of having to rely on a GPU's VRAM. This could save you a fortune, especially if you go for some used AMD Epyc platforms. This could be more viable for the larger models, especially the 30B/65B-parameter models, which would still press against or exceed the VRAM on the P40.
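A sketch of that CPU/RAM-plus-partial-offload setup, assuming llama-cpp-python and a GGUF file (GGUF is the successor of the GGML format); the model path and layer count are placeholders:

```python
# Running a GGUF-quantized model mostly on CPU/system RAM with llama-cpp-python.
# The model path and n_gpu_layers value are placeholders; offload as many layers
# as fit in your GPU's VRAM (0 = pure CPU).
from llama_cpp import Llama

llm = Llama(
    model_path="models/llama-2-13b.Q4_K_M.gguf",  # placeholder path
    n_ctx=4096,        # context window
    n_gpu_layers=20,   # layers offloaded to the GPU; the rest stay in system RAM
)

out = llm("Explain GGUF quantization in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```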
Optimizing LLMs for Speed and Memory
7 Steps to Mastering Large Language Models (LLMs) - KDnuggets
A Step-by-Step Guide to Training Your Own Large Language Models (LLMs). | by Sanjay Singh | GoPenAI
GenAI Stack Exchange
7 steps to master large language models (LLMs) | Data Science Dojo
LLM for a new language : MachineLearning
High-level overview of how to train a model.
Up-to-date list of LLM models
OSCAR dataset
The OSCAR project (Open Super-large Crawled Aggregated coRpus) is an Open Source project aiming to provide web-based multilingual resources and datasets for Machine Learning (ML) and Artificial Intelligence (AI) applications.
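A hedged sketch of streaming a slice of OSCAR with the Hugging Face datasets library; the dataset and config names are assumptions (newer OSCAR releases live under the oscar-corpus org and may be gated on the Hub):

```python
# Streaming a small slice of the OSCAR corpus with the `datasets` library.
# Dataset/config names are assumptions; newer releases (e.g. OSCAR-2301)
# may require accepting terms on the Hub.
from datasets import load_dataset

oscar = load_dataset(
    "oscar",
    "unshuffled_deduplicated_en",
    split="train",
    streaming=True,           # avoid downloading the full corpus
    trust_remote_code=True,   # needed for script-based datasets in recent versions
)

for i, record in enumerate(oscar):
    print(record["text"][:80])
    if i >= 2:
        break
```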
Newspeak-test-dataset
The dataset is just a zip of files.
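A small sketch of turning the unpacked zip into a Hugging Face dataset; the directory name and the .ns file extension are assumptions about how the archive is laid out after extraction:

```python
# Turn an unpacked zip of Newspeak source files into a Hugging Face dataset.
# The directory path and file extension are assumptions about the archive layout.
from pathlib import Path
from datasets import Dataset

files = list(Path("newspeak-test-dataset").rglob("*.ns"))  # assumed extension
records = [
    {"path": str(p), "text": p.read_text(encoding="utf-8", errors="ignore")}
    for p in files
]

ds = Dataset.from_list(records)
print(ds)
```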
Introduction to Constructing Your Dataset | Machine Learning | Google for Developers
Are there any tiny (1-3b) models finetuned for coding available in GGUF format? : LocalLLaMA
bigcode (BigCode)
Research community developing various code models, small and large. The models may not be instruction-tuned.
WizardLM (WizardLM)
deepseek-ai (DeepSeek)
They have a 1.3B version! This may be the best model to start with for Newspeak. Training should work even on Hugging Face.
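A minimal fine-tuning sketch for the 1.3B model, assuming the Hugging Face Trainer API; the three sample strings are placeholders standing in for the real Newspeak dataset, and the hyperparameters are illustrative only:

```python
# Toy fine-tuning sketch for deepseek-ai/deepseek-coder-1.3b-base with the
# Hugging Face Trainer. The sample strings stand in for the real Newspeak
# dataset; hyperparameters are illustrative, not tuned.
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_id = "deepseek-ai/deepseek-coder-1.3b-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # Llama-style tokenizers lack a pad token

raw = Dataset.from_dict({"text": [
    "class Counter = ( | count = 0. | )",   # placeholder Newspeak-ish snippets
    "class Point = ( | x y | )",
    "public printString = ( ^'a Point' )",
]})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=256)

tokenized = raw.map(tokenize, batched=True, remove_columns=["text"])
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)  # causal-LM labels

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="newspeak-deepseek-1.3b",
        per_device_train_batch_size=1,
        num_train_epochs=1,
        logging_steps=1,
    ),
    train_dataset=tokenized,
    data_collator=collator,
)
trainer.train()
```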
DeepSeek
deepseek-ai/deepseek-coder-6.7b-instruct · Hugging Face
Another possible model. For coding capabilities, Deepseek Coder achieves state-of-the-art performance among open-source code models on multiple programming languages and various benchmarks.
LLaMA 7B GPU Memory Requirement - Transformers - Hugging Face Forums
With the optimizers of bitsandbytes (like 8 bit AdamW), you would need 2 bytes per parameter, or 14 GB of GPU memory.
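The quoted number works out as below; this covers only the optimizer states, with weights and gradients added for context (activations ignored):

```python
# Back-of-the-envelope GPU memory estimate for the forum's numbers.
# 8-bit AdamW keeps two optimizer states per parameter at 1 byte each,
# hence the quoted 2 bytes/param = 14 GB for a 7B model.
params = 7e9

opt_8bit   = params * 2 / 1e9   # 2 optimizer states x 1 byte each
weights_16 = params * 2 / 1e9   # fp16/bf16 weights, 2 bytes each
grads_16   = params * 2 / 1e9   # fp16/bf16 gradients, 2 bytes each

print(f"8-bit AdamW states: {opt_8bit:.0f} GB")
print(f"fp16 weights:       {weights_16:.0f} GB")
print(f"fp16 gradients:     {grads_16:.0f} GB")
print(f"rough total:        {opt_8bit + weights_16 + grads_16:.0f} GB")
```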
stabilityai/stable-code-3b · Hugging Face
Another potential model to use for Newspeak, but it is NOT open source. Advantage: 2.5B params, so it should be usable on small GPUs.
Large Language Models for Domain-Specific Language Generation: How to Train Your Dragon | by Andreas Mülder | Medium
Training a model like Llama with 2.7 billion parameters outperformed a larger model like Vicuna with 13 billion parameters. Especially when considering resource consumption, this might be a good alternative to using a 7B foundation model instead of a full-blown ChatGPT. The best price-to-performance base model for our use case turned out to be Mistral 7b. The model is compact enough to fit into an affordable GPU with 24GB VRAM and outperforms the other models with 7B parameters.
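A hedged sketch of loading Mistral 7B in 4-bit with bitsandbytes so it fits comfortably under 24 GB of VRAM; the quantization settings are common defaults rather than recommendations from the article, and the checkpoint may require accepting terms on the Hub:

```python
# Load Mistral 7B in 4-bit with bitsandbytes so it fits well under 24 GB VRAM.
# Quantization settings are typical defaults, not tuned recommendations.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_id = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb, device_map="auto"
)

prompt = "Write a Smalltalk method that sums a collection:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=64)[0]))
```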
Can Ai Code Results - a Hugging Face Space by mike-ravkine
Comparison of LLM models for coding
Discord
Openchat Chatbot UI
Online UI for OpenChat. This seems really good, open source, etc. It uses the Llama 2 and Mistral models, according to https://github.com/imoneoi/openchat
openchat/openchat-3.5-0106 · Hugging Face
Open source with lots of information. Uses multiple underlying models. Not sure how I would train for it.
Welcome Mixtral - a SOTA Mixture of Experts on Hugging Face
The Mixtral model is new and seems to be good. Click on “Demo” to test it.
StarCoder: A State-of-the-Art LLM for Code
Article has comparison with other code-LLM models
huybery/Awesome-Code-LLM: An awesome and curated list of best code-LLM for research.
Hannibal046/Awesome-LLM: Awesome-LLM: a curated list of Large Language Model
Large language models and the rise of the AI code generators | InfoWorld
Review of LLM specialized for code generation
Large language model - Wikipedia
List of LLM models on Wikipedia
List of datasets for machine-learning research - Wikipedia
stabilityai (Stability AI) - Stable Diffusion running on Hugging Face
Chat and language models. Not open source, but instruction-tuned and relatively small (3B). The 3B instruct model may be the best to try on Newspeak.
ChatGPT
Le Chat Mistral
Chat UI for Mistral. Does well on Python and Smalltalk.
Mistral AI - Wikipedia
fast.ai – fast.ai—Making neural nets uncool again
OpenAI Codex - Wikipedia
Model which generates code for Python, JavaScript, Go, Shell, Perl, Swift, Ruby, and PHP.
HuggingChat
codellama (Code Llama) - Hugging Face model for generating programs. Maybe it can be used for Newspeak?
Gemini
Gemini chat from Google. Can generate Python and other code.
Introducing Gemini: Google’s most capable AI model yet
Advanced coding: Our first version of Gemini can understand, explain and generate high-quality code in the world’s most popular programming languages, like Python, Java, C++, and Go. Using a specialized version of Gemini, we created a more advanced code generation system, AlphaCode 2.
AI Code Tools: The Ultimate Guide in 2024
AI code tools: a good summary. It does not say which pre-trained models the tools use. One is Gemini (Bard) -> AlphaCode 2.
Getting Started w/BERT.ipynb - Colaboratory
Jupyter notebook to test BERT.
Introduction - Hugging Face NLP Course
Natural Language Processing - full course.
BERT 101 - State Of The Art NLP Model Explained
Best summary of Natural Language Processing and its terms: model (a language model, e.g. BertModel, which defines the encoder and decoder and their properties), transformer (a specific neural-network architecture from the attention paper), encoder (a stack of transformer layers over the input), decoder (a stack of transformer layers producing the output). BERT does NOT use a decoder. TensorFlow and PyTorch are possible backends for the Transformers library. Summary: BERT is a highly complex and advanced language model that helps people automate language understanding.
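The terms from that note, shown in code with the Transformers library: BertModel is the encoder-only model, a stack of transformer layers with no decoder.

```python
# BertModel is the encoder-only model: a stack of transformer layers, no decoder.
from transformers import AutoTokenizer, BertModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Newspeak is a programming language.", return_tensors="pt")
outputs = model(**inputs)

print(model.config.num_hidden_layers)    # 12 transformer (encoder) layers in bert-base
print(outputs.last_hidden_state.shape)   # [batch, tokens, hidden_size]
```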