
The Components of LLMs.

The Immense Potential.

My Brandt · Published in Data And Beyond · 3 min read · Jun 12, 2024


#MyBrandt

The underlying technology of LLMs is called the transformer neural network, often simply referred to as a transformer.

Transformers have provided a significant leap in the capabilities of LLMs.

Without them, the current generative AI revolution wouldn’t be possible.

Transformers are based on the same encoder-decoder architecture used by recurrent and convolutional neural networks. This architecture aims to discover statistical relationships between tokens of text.

This is done through embedding techniques. Embeddings are representations of pieces of text, such as tokens, sentences, paragraphs, or documents, in a high-dimensional vector space, where each dimension corresponds to a learned feature or attribute of the language.
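As a rough illustration, here is a minimal sketch of an embedding lookup, assuming PyTorch; the toy vocabulary and dimensions are made up for the example.

```python
# A minimal sketch of token embeddings, assuming PyTorch.
# The vocabulary and dimensions are illustrative, not from the article.
import torch
import torch.nn as nn

vocab = {"the": 0, "cat": 1, "sat": 2}        # toy vocabulary
embedding = nn.Embedding(num_embeddings=3,    # vocabulary size
                         embedding_dim=4)     # learned features per token

token_ids = torch.tensor([vocab["the"], vocab["cat"], vocab["sat"]])
vectors = embedding(token_ids)                # shape: (3, 4)
print(vectors.shape)                          # each token -> a 4-dim vector
```

In a real LLM, the embedding matrix is vastly larger and its values are learned during training rather than initialized at random and left alone.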

The embedding process takes place in the encoder.

Due to the huge size of LLMs, creating embeddings takes extensive training and considerable resources.

What sets transformers apart from previous neural networks is that the embedding process is highly parallelizable, enabling more efficient processing.

This is possible thanks to the attention mechanism.

Recurrent and convolutional neural networks make their word predictions based exclusively on previous words.

In this sense, they can be considered unidirectional.
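To see what unidirectional means in practice, here is a minimal sketch, assuming PyTorch’s built-in RNN; the sizes are illustrative.

```python
# A minimal sketch of why recurrent models are unidirectional, assuming PyTorch.
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=4, hidden_size=4, batch_first=True)
x = torch.randn(1, 3, 4)   # a sequence of 3 token vectors
out, h = rnn(x)            # processed strictly left to right, one step at a time
# out[:, i] depends only on tokens 0..i — the following words are invisible.
```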

The attention mechanism allows transformers to predict words bidirectionally, that is, based on both the previous and the following words. The goal of the attention layer, which is incorporated in both the encoder and the decoder, is to capture the contextual relationships existing between different words in the input sentence.
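Here is a minimal sketch of that computation, assuming PyTorch; the weight matrices and dimensions are illustrative. Because no causal mask is applied, each token’s attention weights cover every position, before and after it, and all tokens are processed in parallel.

```python
# A minimal sketch of scaled dot-product self-attention, assuming PyTorch.
# Projections and dimensions are illustrative, not from the article.
import math
import torch

def self_attention(x, w_q, w_k, w_v):
    q, k, v = x @ w_q, x @ w_k, x @ w_v       # queries, keys, values
    scores = q @ k.transpose(-2, -1) / math.sqrt(k.size(-1))
    weights = torch.softmax(scores, dim=-1)   # contextual relationships
    return weights @ v                        # context-aware representations

d = 4                                         # illustrative model dimension
x = torch.randn(3, d)                         # 3 tokens, all processed at once
w_q, w_k, w_v = (torch.randn(d, d) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)        # shape: (3, 4)
```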

Training transformers involves two steps:

pre-training and fine-tuning.

In the pre-training phase, transformers are trained on large amounts of raw text data.

The Internet is the primary data source.
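One common pre-training objective, an assumption here rather than something the article spells out, is next-token prediction. The sketch below assumes PyTorch, with random tensors standing in for real text and a real model.

```python
# A minimal sketch of a next-token-prediction pre-training objective,
# assuming PyTorch; the tensors are illustrative stand-ins.
import torch
import torch.nn.functional as F

vocab_size, seq_len = 100, 8
tokens = torch.randint(0, vocab_size, (1, seq_len))  # raw text as token IDs
# Stand-in for the model's output logits at each position.
logits = torch.randn(1, seq_len - 1, vocab_size, requires_grad=True)

# Each position learns to predict the token that follows it in the raw text.
targets = tokens[:, 1:]                               # same sequence, shifted by one
loss = F.cross_entropy(logits.reshape(-1, vocab_size), targets.reshape(-1))
loss.backward()   # in real training, the gradients would update the model
```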
