Search Results
google-research/bert: TensorFlow code and pre-trained models for BERT
BERT Transformers – How Do They Work? | Exxact Blog
Excellent document about BERT transformer models and their parameters:
- L = number of transformer layers
- H = hidden size, i.e. the dimensionality of the vector representing each token in the sentence
- A = number of self-attention heads
Together these determine the total parameter count.
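These values can be read straight off a checkpoint's config. A minimal sketch, assuming the Hugging Face transformers library (with PyTorch) and the small Google checkpoint listed next:

```python
from transformers import AutoConfig, AutoModel

name = "google/bert_uncased_L-4_H-256_A-4"
config = AutoConfig.from_pretrained(name)
print(config.num_hidden_layers)     # L = 4  (transformer layers)
print(config.hidden_size)           # H = 256 (size of each token vector)
print(config.num_attention_heads)   # A = 4  (self-attention heads)

# Total parameter count follows from L, H, A plus the embedding tables.
model = AutoModel.from_pretrained(name)
total = sum(p.numel() for p in model.parameters())
print(f"Total parameters: {total:,}")
```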
google/bert_uncased_L-4_H-256_A-4 · Hugging Face
Hugging Face hub entry for Google's released BERT models, including the small variants; this one (L=4, H=256, A=4) is a good model to start with for testing.
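A minimal smoke test for this checkpoint, assuming transformers and torch are installed; the example sentence is just illustrative:

```python
import torch
from transformers import AutoTokenizer, AutoModel

name = "google/bert_uncased_L-4_H-256_A-4"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name)

inputs = tokenizer("BERT is small but useful for testing.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One 256-dimensional vector per input token: [batch, seq_len, hidden_size]
print(outputs.last_hidden_state.shape)
```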
Training Bert on Yelp - Copy of training.ipynb - Colaboratory
Getting Started w/BERT.ipynb - Colaboratory
Jupyter notebook to test BERT.
BERT 101 - State Of The Art NLP Model Explained
Best summary of Natural Language Processing and its terms:
- model: a language model, e.g. BertModel, which defines the encoder/decoder structure and its properties
- transformer: a specific neural network architecture based on the attention paper ("Attention Is All You Need")
- encoder: a stack of transformer layers applied to the input
- decoder: a stack of transformer layers that generates the output
BERT does NOT use a decoder; it is encoder-only. TensorFlow and PyTorch are both possible backends for the Transformers library. Summary: BERT is a highly complex and advanced language model that helps people automate language understanding.
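To illustrate the backend point, a minimal sketch (assuming transformers with both torch and tensorflow installed) that loads the same encoder-only checkpoint under either framework:

```python
from transformers import BertModel, TFBertModel

# The same pre-trained weights can back either framework.
pt_model = BertModel.from_pretrained("bert-base-uncased")    # PyTorch backend
tf_model = TFBertModel.from_pretrained("bert-base-uncased")  # TensorFlow backend

# Both are pure encoders: they map input tokens to hidden states and
# carry no decoder or generation head.
print(type(pt_model).__name__, type(tf_model).__name__)
```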
BERT vs GPT: A Tale of Two Transformers That Revolutionized NLP | by Tavva Prudhvith | Medium
Fine-tune a pretrained model
Hugging Face tutorial: use the BERT model to fine-tune on the Yelp reviews dataset.
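A condensed sketch of that tutorial's recipe, assuming the transformers and datasets packages; the subset sizes and output directory here are arbitrary choices for a quick run:

```python
from datasets import load_dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

dataset = load_dataset("yelp_review_full")          # 1-5 star reviews
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

def tokenize(batch):
    return tokenizer(batch["text"], padding="max_length", truncation=True)

tokenized = dataset.map(tokenize, batched=True)
train_small = tokenized["train"].shuffle(seed=42).select(range(1000))
eval_small = tokenized["test"].shuffle(seed=42).select(range(1000))

# One output label per star rating.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-cased", num_labels=5)

args = TrainingArguments(output_dir="yelp_bert", num_train_epochs=1)
trainer = Trainer(model=model, args=args,
                  train_dataset=train_small, eval_dataset=eval_small)
trainer.train()
```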