WK6: Bayesian Methods and Uncertainty in AI systems

Welcome to Week 6
Bayesian Methods and Uncertainty in AI systems
Module Lecturer: Dr Raghav Kovvuri
Email: raghav.kovvuri@ieg.ac.uk


This lesson contains 19 slides, including interactive quizzes and text slides.

Slide 1 - Slide

Slide 2 - Slide

Introduction
Probability in AI
  • A measure of how likely an event is to occur
  • Scale: 0 (impossible) to 1 (certain)
Real-world analogy: Weather forecasting
  • "70% chance of rain" - more likely to rain than not, but not certain
Why it matters in AI:
  1. Handling uncertain data
  2. Making decisions with incomplete information
  3. Predicting outcomes

Slide 3 - Slide

Basic Probability Concepts
Types of events:
  • Independent: Outcome of one doesn't affect the other
                 Example: Flipping a coin twice
  • Dependent: Outcome of one affects the other
                 Example: Drawing cards from a deck without replacement
Probability rules:
  • Sum rule: P(A or B) = P(A) + P(B) - P(A and B)
  • Product rule: P(A and B) = P(A) × P(B|A) in general, which simplifies to P(A) × P(B) if A and B are independent



Real-world AI analogy: Spam filter considering multiple features of an email
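
The sum and product rules can be checked with a small worked example. This is a minimal sketch using a standard 52-card deck and a fair coin; the specific events (ace, heart, two heads, two aces) are illustrative choices, not part of the lecture.

```python
from fractions import Fraction

# Sum rule: P(ace or heart) = P(ace) + P(heart) - P(ace and heart)
p_ace = Fraction(4, 52)
p_heart = Fraction(13, 52)
p_ace_and_heart = Fraction(1, 52)          # only the ace of hearts is both
p_ace_or_heart = p_ace + p_heart - p_ace_and_heart

# Product rule, independent events: two fair coin flips both land heads
p_two_heads = Fraction(1, 2) * Fraction(1, 2)

# Product rule, dependent events: two aces drawn without replacement.
# The second factor is conditional on the first draw: P(ace2 | ace1) = 3/51.
p_second_ace_given_first = Fraction(3, 51)
p_two_aces = p_ace * p_second_ace_given_first

print("P(ace or heart) =", p_ace_or_heart)
print("P(two heads)    =", p_two_heads)
print("P(two aces)     =", p_two_aces)
```

Exact fractions are used so the arithmetic can be compared directly with a hand calculation: 4/13, 1/4 and 1/221 respectively.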

Slide 4 - Slide

Bayes' Theorem
Introduction: Bayes' Theorem is a way to update beliefs based on new evidence
Formula: P(A|B) = P(B|A) × P(A) / P(B)
  • P(A|B): Probability of A given B has occurred (Posterior)
  • P(B|A): Probability of B given A (Likelihood)
  • P(A): Initial probability of A (Prior)
  • P(B): Overall probability of B occurring (Evidence)
Real-world analogy: Medical diagnosis
  • Prior: General prevalence of a disease
  • Likelihood: Probability of a positive test given the disease (the test's sensitivity)
  • Posterior: Probability of having the disease given a positive test
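
The medical diagnosis analogy can be worked through numerically. The sketch below assumes a 1% prevalence, a 95% sensitivity and a 5% false positive rate; these numbers and the function name `bayes_posterior` are illustrative assumptions, not figures from the lecture.

```python
def bayes_posterior(prior, sensitivity, false_positive_rate):
    """Return P(disease | positive test) using Bayes' theorem."""
    # P(B): total probability of a positive test (true plus false positives)
    evidence = sensitivity * prior + false_positive_rate * (1.0 - prior)
    # P(A|B) = P(B|A) * P(A) / P(B)
    return sensitivity * prior / evidence

# Assumed, illustrative values only
posterior = bayes_posterior(prior=0.01, sensitivity=0.95, false_positive_rate=0.05)
print(f"P(disease | positive test) = {posterior:.3f}")
```

With these assumed values the posterior is about 0.161: even after a positive result from a fairly accurate test, the disease remains unlikely because the prior is so low.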

Slide 5 - Slide

Bayes' Theorem in AI
Naive Bayes Classifier
Naive Bayes: A simple yet powerful classification algorithm
  • "Naive" because it assumes features are independent of each other given the class (often not true in reality)
Common applications:
  • Spam detection
  • Sentiment analysis
  • Document classification
Real-world analogy: Book categorization
  • Features: Words in the book
  • Classes: Genre (Mystery, Romance, Sci-Fi, etc.)
  • Naive assumption: Word occurrences are independent of each other given the genre
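
A Naive Bayes classifier for the spam detection use case can be sketched in a few lines. This is a minimal word-count (multinomial) version with add-one smoothing, written from scratch for illustration; the tiny training set and the function names `train` and `predict` are made up for this example.

```python
import math
from collections import Counter, defaultdict

def train(docs):
    """Count documents per class and words per class from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab

def predict(text, class_counts, word_counts, vocab):
    """Return the class with the highest log posterior score."""
    total_docs = sum(class_counts.values())
    scores = {}
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)            # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue                                  # ignore unseen words
            # Naive assumption: each word contributes independently given the class.
            # Add-one smoothing avoids zero probabilities.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical toy training data
training_docs = [
    ("win money now", "spam"),
    ("free prize win", "spam"),
    ("lunch at noon", "ham"),
    ("project meeting today", "ham"),
]
model = train(training_docs)
print(predict("win free money", *model))
print(predict("project meeting at noon", *model))
```

Log probabilities are summed instead of multiplying raw probabilities, which avoids numerical underflow on longer documents.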

Slide 6 - Slide

Uncertainty in AI Systems
Types of uncertainty:
  • Aleatory: Inherent randomness that more data cannot remove (e.g., rolling dice)
  • Epistemic: Uncertainty due to lack of knowledge, which more data can reduce
Why modelling uncertainty matters in AI:
  1. Improves decision-making
  2. Helps in risk assessment
  3. Makes AI systems more robust

Real-world analogy: Self-driving cars
  • Aleatory uncertainty: Random pedestrian movements
  • Epistemic uncertainty: Limited sensor range or occlusions

Slide 7 - Slide

Bayesian Networks
Bayesian Network: A graphical model representing probabilistic relationships
Components:
  • Nodes: Variables
  • Edges: Dependencies between variables
  • Conditional Probability Tables (CPTs): Probability of each variable given its parent variables
Real-world analogy: Detective's investigation board
  • Pieces of evidence connected by strings
  • Strength of each connection represents a probability
Applications in AI:
  1. Diagnostic systems
  2. Decision support tools
  3. Predictive modeling
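
A small diagnostic network shows how nodes, edges and CPTs combine. The sketch below assumes a three-node network in which Battery is the parent of both Car Starts and Radio Works; the structure and all probabilities are illustrative assumptions. Inference is done by brute-force enumeration, which only scales to very small networks.

```python
from itertools import product

# CPTs (assumed values). Keys are the state of the parent node "battery".
P_BATTERY = {True: 0.9, False: 0.1}                 # P(battery is charged)
P_STARTS_GIVEN_BATTERY = {True: 0.95, False: 0.05}  # P(car starts | battery)
P_RADIO_GIVEN_BATTERY = {True: 0.90, False: 0.10}   # P(radio works | battery)

def joint(battery, starts, radio):
    """Joint probability as the product of each node's CPT entry."""
    p_starts = P_STARTS_GIVEN_BATTERY[battery]
    p_radio = P_RADIO_GIVEN_BATTERY[battery]
    return (P_BATTERY[battery]
            * (p_starts if starts else 1.0 - p_starts)
            * (p_radio if radio else 1.0 - p_radio))

def posterior_battery_ok(**evidence):
    """P(battery charged | evidence), summing the joint over unobserved nodes."""
    numerator = denominator = 0.0
    for battery, starts, radio in product([True, False], repeat=3):
        world = {"battery": battery, "starts": starts, "radio": radio}
        if any(world[name] != value for name, value in evidence.items()):
            continue                      # inconsistent with the evidence
        p = joint(battery, starts, radio)
        denominator += p
        if battery:
            numerator += p
    return numerator / denominator

print(posterior_battery_ok(starts=False))
print(posterior_battery_ok(starts=False, radio=False))
```

Under these assumed numbers, the belief that the battery is charged drops from the prior of 0.9 to roughly 0.32 when the car fails to start, and to about 0.05 when the radio is also dead.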

Slide 8 - Slide

Bayesian Network vs Neural Network (NN)

Slide 9 - Slide

Bayesian Inference in ML
Bayesian Inference: Using Bayes' theorem to update beliefs based on data

Process:
  1. Start with prior beliefs
  2. Collect data
  3. Update beliefs (posterior)
Advantages in ML:
  • Handles uncertainty in predictions
  • Allows incorporation of prior knowledge
  • Provides probability distributions instead of point estimates

Real-world analogy: Weather forecasting model
  • Prior: Historical weather patterns
  • Data: Current meteorological measurements
  • Posterior: Updated forecast
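
The prior, data, posterior cycle can be illustrated with one of the simplest Bayesian models: a Beta prior over the probability of rain, updated with counts of rainy and dry days. The Beta(2, 2) prior and the observed counts below are assumptions chosen for illustration.

```python
def update_beta(alpha, beta, successes, failures):
    """Posterior Beta parameters after observing binary outcomes."""
    return alpha + successes, beta + failures

def beta_mean(alpha, beta):
    return alpha / (alpha + beta)

def beta_variance(alpha, beta):
    total = alpha + beta
    return alpha * beta / (total ** 2 * (total + 1))

# 1. Start with a prior belief: Beta(2, 2), weakly centred on 0.5
prior = (2, 2)

# 2. Collect data: 7 rainy days and 3 dry days (assumed observations)
rainy_days, dry_days = 7, 3

# 3. Update beliefs: the posterior is again a Beta distribution
posterior = update_beta(*prior, successes=rainy_days, failures=dry_days)

print("prior mean, variance:    ", beta_mean(*prior), beta_variance(*prior))
print("posterior mean, variance:", beta_mean(*posterior), beta_variance(*posterior))
```

The result is a full distribution, not a point estimate: the posterior mean moves towards the observed frequency while the variance shrinks, reflecting reduced uncertainty.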

Slide 10 - Slide

Bayesian Optimization in AI
Bayesian Optimization: A technique for optimizing expensive-to-evaluate functions
  • Key idea: Use a probabilistic model to guide the search for optimal parameters
Process:
  1. Build: Build a probabilistic model of the objective function
  2. Evaluate: Use this model to determine the next points to evaluate
  3. Update: Update the model with the new results
  4. Repeat: Repeat until convergence or the budget is exhausted
Real-world analogy: Finding the best recipe
  1. Trying each ingredient combination is expensive (time-consuming to cook and taste)
  2. Use feedback from previous attempts to guide next attempts
  3. Gradually converge on the optimal recipe

Applications in AI:
  • Hyperparameter tuning in ML, AutoML, R&D
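
The build, evaluate, update, repeat loop can be sketched in one dimension. The example below assumes a Gaussian process with an RBF kernel as the probabilistic model and an upper confidence bound rule for picking the next point; both are common choices but are assumptions here, as are the toy objective and all parameter values. It requires NumPy.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two sets of 1-D points."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_tq = rbf_kernel(x_train, x_query)
    solved = np.linalg.solve(k_tt, k_tq)
    mean = solved.T @ y_train
    var = 1.0 - np.sum(k_tq * solved, axis=0)   # prior variance is 1
    return mean, np.sqrt(np.clip(var, 0.0, None))

def bayes_opt(objective, n_iter=15, kappa=2.0, seed=0):
    """Maximise `objective` on [0, 1] with a small evaluation budget."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, 201)           # candidate points
    x_train = rng.uniform(0.0, 1.0, size=3)     # a few random starting points
    y_train = np.array([objective(x) for x in x_train])
    for _ in range(n_iter):                     # repeat until budget exhausted
        mean, std = gp_posterior(x_train, y_train, grid)   # build / update model
        ucb = mean + kappa * std                # favour high mean or high uncertainty
        x_next = grid[np.argmax(ucb)]           # choose the next point
        x_train = np.append(x_train, x_next)
        y_train = np.append(y_train, objective(x_next))    # expensive evaluation
    best = np.argmax(y_train)
    return x_train[best], y_train[best]

def toy_objective(x):
    """Stand-in for an expensive function; its maximum is at x = 0.7."""
    return -(x - 0.7) ** 2

best_x, best_y = bayes_opt(toy_objective)
print(f"best x = {best_x:.3f}, objective = {best_y:.4f}")
```

The `kappa` parameter controls the trade-off between exploring uncertain regions and exploiting regions the model already believes are good.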

Slide 11 - Slide

Handling Uncertainty in DL
Sources of uncertainty in deep learning:
  • Model uncertainty: Uncertainty in model parameters
  • Data uncertainty: Noise or incompleteness in the training data
Techniques for uncertainty quantification:
  1. Monte Carlo Dropout
  2. Ensemble Methods
  3. Bayesian Neural Networks
Real-world analogy: Weather ensemble forecasting
  • Multiple models run with slightly different initial conditions
  • Spread of predictions indicates uncertainty
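
The ensemble idea can be demonstrated without a deep learning framework. In the sketch below, simple straight-line models stand in for neural networks, and each ensemble member is fit on a bootstrap resample of the training data; the synthetic data, the model choice and the function names are assumptions made for illustration.

```python
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def ensemble_predict(xs, ys, x_query, n_models=25, seed=0):
    """Mean and spread of predictions from models fit on bootstrap resamples."""
    rng = random.Random(seed)
    n = len(xs)
    predictions = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]   # sample with replacement
        intercept, slope = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        predictions.append(intercept + slope * x_query)
    return statistics.fmean(predictions), statistics.stdev(predictions)

# Synthetic noisy data on [0, 1] (assumed): y = 2x + 1 plus Gaussian noise
data_rng = random.Random(1)
xs = [i / 29 for i in range(30)]
ys = [2.0 * x + 1.0 + data_rng.gauss(0.0, 0.2) for x in xs]

mean_near, std_near = ensemble_predict(xs, ys, x_query=0.5)   # inside the data
mean_far, std_far = ensemble_predict(xs, ys, x_query=5.0)     # far outside it
print(f"x = 0.5: {mean_near:.2f} +/- {std_near:.2f}")
print(f"x = 5.0: {mean_far:.2f} +/- {std_far:.2f}")
```

The ensemble members agree closely where training data exists and disagree more when extrapolating, so the spread of predictions acts as a rough uncertainty estimate.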

Slide 12 - Slide

In the context of Bayesian probability, what does the term "prior" refer to?

A
The initial probability estimate before considering new evidence
B
The final probability after considering all evidence
C
The probability of the evidence occurring
D
The difference between two probability estimates

Slide 13 - Quiz

Which of the following best describes the Naive Bayes assumption in classification tasks?
A
All features are equally important
B
The class is independent of the features
C
There must be an equal number of features and classes
D
Features are independent of each other given the class

Slide 14 - Quiz

In a Bayesian Network for a car starting problem, which of the following would most likely be a parent node to "Car Starts"?

A
Radio Works
B
Battery Charge
C
Car Color
D
Driver's Age

Slide 15 - Quiz

What is the primary advantage of using Bayesian Optimization in machine learning?
A
It always finds the global optimum
B
It requires no prior knowledge of the problem
C
It efficiently optimises expensive-to-evaluate functions
D
It guarantees the fastest convergence among all optimisation methods

Slide 16 - Quiz

In the context of uncertainty in AI systems, what does aleatory uncertainty refer to?
A
Uncertainty due to lack of knowledge
B
Uncertainty in the model parameters
C
Inherent randomness in the system
D
Uncertainty caused by measurement errors

Slide 17 - Quiz

Challenges
  • Scalability of Bayesian methods to big data
  • Integration with other AI techniques (e.g., reinforcement learning)
  • Explainable AI through Bayesian frameworks

Real-world analogy: Evolution of weather forecasting
  • From simple models to complex, probabilistic forecasts
  • Increased computational power enabling more sophisticated methods
  • Growing demand for explainable and reliable predictions

Slide 18 - Slide

Conclusion and Takeaway
Key takeaways:
  • Importance of probability in handling uncertainty in AI
  • Bayes' theorem as a foundation for updating beliefs
  • Applications of Bayesian methods in various AI domains
Practical applications recap:
  • Naive Bayes for classification
  • Bayesian networks for decision support
  • Bayesian optimization for hyperparameter tuning
  • Uncertainty quantification in deep learning

Slide 19 - Slide