Generative AI, Part 1 (According to the Oracle Gen AI Certification)
First of all, credit goes to the Oracle University team members; I learned a lot from this certification. Before anything else, you must learn the basics of LLMs. I will cover the basic concepts for you, but this alone is not enough: technology updates day by day, so keep learning. This article is a kick-starter for your learning journey, and I hope you will learn something from it. So let's start.
Generative AI is a powerful subset of artificial intelligence that focuses on creating new content by learning from existing data. It has a wide range of applications, from text and image generation to music composition and virtual assistants. Techniques like GANs, VAEs, and transformers are commonly used in generative AI. While it offers many promising opportunities, it also presents challenges related to data quality, ethical concerns, and computational resources.
Topics discussed here:
1. Basics of LLMs (Large Language Models).
2. Prompting techniques.
3. Training and decoding.
4. Dangers of deploying LLM-based technology.
5. Upcoming cutting-edge technologies.
Glossary:
AI - Artificial Intelligence.
LLM - Large Language Model.
GAN - Generative Adversarial Network.
1. Basics of LLMs-
Large Language Models (LLMs) are a powerful tool for understanding and generating human language. They are trained on vast amounts of text data and use advanced neural network architectures to perform a wide range of natural language processing tasks. While LLMs offer many promising applications, they also present challenges related to data quality, bias, computational resources, and ethical concerns.
What is a Large Language Model-
A language model (LM) is a probabilistic model of text.
It is used to predict the next word based on probability: given a sentence like "I saw a ...", words such as "dog" or "cat" receive high probabilities while most other words receive very low ones, and a high-probability word is chosen as the answer.
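As a toy illustration (the vocabulary and the probabilities below are made up, not taken from a real model), a language model's next-word prediction looks like this:

```python
# Toy next-word distribution for the prompt "I saw a ..." -- a real LLM
# assigns a probability to every word in a vocabulary of tens of
# thousands of tokens. These numbers are invented for illustration.
next_word_probs = {
    "dog": 0.45,   # high probability -> a likely answer
    "cat": 0.35,   # also high probability
    "car": 0.10,
    "sky": 0.07,
    "the": 0.03,   # low probability -> an unlikely answer
}

# Picking the highest-probability word gives the model's "answer".
answer = max(next_word_probs, key=next_word_probs.get)
print(answer)  # -> dog
```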
2. What are LLM Architectures-
Large Language Models (LLMs) often employ architectures that include encoders and decoders, which are fundamental components in many natural language processing (NLP) tasks. These architectures are designed to handle sequential data, such as text, and can be used for tasks like translation, summarization, and text generation. Here’s an overview of encoders and decoders in LLM architectures:
Encoders and Decoders-
Encoders
Purpose:
- Encoders are responsible for processing input sequences and converting them into a fixed-length vector representation, often referred to as the context vector or hidden state.
Function:
- The encoder reads the input sequence (e.g., a sentence or a paragraph) and captures its meaning and context.
- It processes the input data in a sequential manner, typically using recurrent neural networks (RNNs), long short-term memory networks (LSTMs), or transformers.
Key Features:
- Sequential Processing: Encoders process input data in a sequential manner, capturing dependencies and relationships between words or tokens.
- Context Representation: The final output of the encoder is a context vector that encapsulates the meaning and context of the entire input sequence.
Decoders
Purpose:
- Decoders are responsible for generating output sequences based on the context vector produced by the encoder.
Function:
- The decoder takes the context vector as input and generates the output sequence (e.g., a translated sentence or a summary) one token at a time.
- It uses the context vector to initialize its hidden state and then generates the output sequence in a sequential manner.
Key Features:
- Sequential Generation: Decoders generate output sequences one token at a time, using the context vector and previously generated tokens to predict the next token.
- Attention Mechanisms: Many decoders use attention mechanisms to focus on relevant parts of the input sequence while generating each token. This helps in capturing long-range dependencies and improving the quality of the generated output.
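The token-by-token generation described above can be sketched as follows; `next_token` is a hypothetical stand-in for a real decoder (here it just emits a canned French translation), so the point is only the loop structure: each step uses the context plus the previously generated tokens, and a stop token ends generation.

```python
# Stand-in for a real decoder: returns the "most likely" next token
# given the encoder context and the tokens generated so far.
def next_token(context, generated):
    canned = ["le", "chat", "<eos>"]  # fake translation of "the cat"
    return canned[len(generated)]

def decode(context, max_len=10):
    generated = []
    while len(generated) < max_len:
        tok = next_token(context, generated)  # context + previous tokens
        if tok == "<eos>":                    # stop token ends generation
            break
        generated.append(tok)
    return generated

print(decode("the cat"))  # -> ['le', 'chat']
```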
Combined Architectures
Seq2Seq (Sequence-to-Sequence) Models:
- Seq2Seq models consist of an encoder and a decoder. The encoder processes the input sequence and produces a context vector, which is then used by the decoder to generate the output sequence.
- These models are commonly used for tasks like machine translation, text summarization, and speech recognition.
Transformers:
- Transformers are a type of architecture that uses self-attention mechanisms to process input and output sequences. They consist of an encoder and a decoder, each made up of multiple layers of self-attention and feed-forward neural networks.
- The encoder processes the input sequence and produces a sequence of hidden states, which are then used by the decoder to generate the output sequence.
- Transformers have been highly successful in various NLP tasks due to their ability to capture long-range dependencies and parallelize computation.
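A minimal sketch of the scaled dot-product self-attention at the heart of transformers, using tiny made-up Q, K, and V matrices (a real transformer learns these as linear projections of the token embeddings, and stacks many such layers with multiple heads):

```python
import math

# One row per token; all values are invented for the example.
Q = [[1.0, 0.0], [0.0, 1.0]]  # queries
K = [[1.0, 0.0], [0.0, 1.0]]  # keys
V = [[1.0, 2.0], [3.0, 4.0]]  # values
d_k = 2                       # key dimension, used for scaling

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    out = []
    for q in Q:
        # similarity of this query with every key, scaled by sqrt(d_k)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in K]
        weights = softmax(scores)
        # each output row is a weighted sum of the value vectors
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

print(attention(Q, K, V))
```

Each token's output mixes information from every other token's value vector, weighted by attention; this is what lets transformers capture long-range dependencies in parallel.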
Encoder example: BERT.
Decoder examples: GPT, LLaMA.
Both (encoder-decoder): Mistral AI.
Architectures at a glance:
3. Prompting and Prompt Engineering-
Prompting: the simplest way to affect the distribution over the vocabulary is to change the prompt.
Prompt: the text provided to an LLM as input, sometimes containing instructions and/or examples.
Prompt engineering: the process of iteratively refining a prompt for the purpose of eliciting a particular style of response.
Prompt engineering is challenging, often unintuitive, and not guaranteed to work.
At the same time, it can be effective, and multiple tested prompt-design strategies exist.
In-Context Learning and Few-Shot Prompting-
In-context learning: conditioning (prompting) an LLM with instructions and/or demonstrations of the task it is meant to complete.
k-shot prompting: explicitly providing k examples of the intended task in the prompt.
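A hypothetical 2-shot prompt (k = 2) for sentiment classification might be assembled like this; the task, example reviews, and labels are all invented for illustration:

```python
# Two labeled demonstrations "condition" the model on the intended task.
examples = [
    ("The movie was fantastic.", "positive"),
    ("I wasted two hours of my life.", "negative"),
]
query = "The acting was superb."

prompt = "Classify the sentiment of each review.\n\n"
for text, label in examples:              # k demonstrations (here k = 2)
    prompt += f"Review: {text}\nSentiment: {label}\n\n"
prompt += f"Review: {query}\nSentiment:"  # the model completes this line

print(prompt)
```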
Example prompts (the images are from the OCI site).
Advanced Prompting Strategies-
Chain-of-Thought: prompt the LLM to emit intermediate reasoning steps.
Least-to-Most: prompt the LLM to decompose the problem and solve the easiest subproblems first.
Step-Back: prompt the LLM to identify high-level concepts pertinent to a specific task.
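As a sketch of chain-of-thought prompting: the worked example's answer spells out intermediate reasoning steps, nudging the model to do the same for the new question (both arithmetic problems are illustrative):

```python
# A one-shot chain-of-thought prompt: the demonstration answer shows
# step-by-step reasoning before stating the final answer.
cot_prompt = (
    "Q: Roger has 5 balls. He buys 2 cans of 3 balls each. "
    "How many balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 balls is 6 balls. "
    "5 + 6 = 11. The answer is 11.\n\n"
    "Q: The cafeteria had 23 apples. They used 20 and bought 6 more. "
    "How many apples do they have?\n"
    "A:"  # the LLM is expected to emit reasoning steps, then the answer
)
print(cot_prompt)
```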
Issues with Prompting-
1. Prompt injection (jailbreaking): crafting a prompt that causes the model to ignore its instructions or respond in unintended, potentially harmful ways.
2. Memorization: for example, after answering, the model can be asked to repeat the original prompt, leaking it or memorized training data.
Training-
Training an LLM is a very costly process with high carbon emissions.
Decoding-
Decoding is the process of generating text with an LLM, one word at a time.
Greedy decoding: pick the highest-probability word at each step.
Non-deterministic decoding: pick randomly among high-probability candidates at each step.
Temperature: a parameter that modulates the distribution over the vocabulary. Lower temperature sharpens the distribution toward the most likely word (approaching greedy decoding); higher temperature flattens it, making the output more random.
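The effect of temperature on the next-word distribution can be sketched as follows (the logits are made-up numbers, not from a real model):

```python
import math
import random

# Invented raw scores (logits) for three candidate next words.
logits = {"dog": 2.0, "cat": 1.5, "car": 0.2}

def softmax_with_temperature(logits, temperature):
    # Dividing logits by the temperature before softmax sharpens
    # (T < 1) or flattens (T > 1) the resulting distribution.
    scaled = {w: l / temperature for w, l in logits.items()}
    m = max(scaled.values())
    exps = {w: math.exp(s - m) for w, s in scaled.items()}
    z = sum(exps.values())
    return {w: e / z for w, e in exps.items()}

sharp = softmax_with_temperature(logits, temperature=0.5)  # near-greedy
flat = softmax_with_temperature(logits, temperature=2.0)   # more random
print(sharp["dog"], flat["dog"])

# Non-deterministic decoding: sample a word from the distribution.
word = random.choices(list(flat), weights=list(flat.values()))[0]
```

With low temperature, "dog" dominates and sampling behaves almost like greedy decoding; with high temperature, the three words become closer in probability.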
Hallucination: text generated by the model that is non-factual and/or ungrounded.
Groundedness and attributability: generated text is grounded if it is supported by, and attributable to, a source document.
LLM Applications-
RAG (Retrieval-Augmented Generation): answer questions using documents retrieved from a corpus, not only the model's training data.
Code models: LLMs trained on code, e.g., for completion and generation.
Multimodal: models trained on multiple modalities, such as text and images.
Language agents: LLMs that take sequences of actions to pursue a goal.
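A minimal, hypothetical sketch of the RAG pattern mentioned above: retrieve the documents most relevant to a question, then paste them into the prompt so the LLM answers from retrieved text. The toy retriever here just ranks documents by word overlap; real systems use vector search over embeddings.

```python
# Tiny made-up corpus for illustration.
documents = [
    "OCI Generative AI is a fully managed service.",
    "Oracle was founded in 1977.",
    "RAG combines retrieval with generation.",
]

def words(text):
    # crude normalization: lowercase, strip basic punctuation
    return set(text.lower().replace("?", "").replace(".", "").split())

def retrieve(query, docs, k=1):
    # Toy retriever: rank documents by word overlap with the query.
    q = words(query)
    return sorted(docs, key=lambda d: -len(q & words(d)))[:k]

def build_rag_prompt(query):
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_rag_prompt("How does RAG combine retrieval with generation?"))
```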
To be continued…