COMP47980 Generative AI: Language Models

Academic Year 2024/2025

The module aims to deliver a practical understanding of the theory, development and use of large language models for the generation of novel content in AI.

Students will learn about: languages, formal and natural; conventional NLP vs. the "new" NLP; statistical language models (n-gram and neural); vector representations and latent spaces; encoders, decoders and auto-encoders; distributed representations; transformer architectures; attention and self-attention; prompts and continuations; "hallucinations" and creativity in LLMs; poking under the hood (why do these models work?); using APIs for LLMs.


Curricular information is subject to change

Learning Outcomes:

This module teaches the inner workings and practical applications of large language models as applied to generative and creative tasks. Students will be able to build their own working systems that exploit LLMs to generate novel content, and to use standard APIs for accessing very large language models from third-party providers.

Indicative Module Content:

Languages, formal and natural;
Conventional NLP vs. the "new" NLP;
Statistical language models (n-gram and neural; a minimal bigram sketch follows this list);
Tokenization (byte-pair encoding, greedy tokenization, token "healing"; a BPE sketch follows this list);
Vector representations and latent spaces;
Encoders, decoders and auto-encoders;
Distributed representations;
Positional encoding;
Contrastive Language/Image Pre-Training (CLIP);
Transformer architectures;
T5 (Text-to-Text Transfer Transformer) models;
Attention and self-attention (sketched after this list);
Prompts and continuations;
Temperature settings and creative outputs;
"Hallucinations" and creativity in LLMs;
Poking under the hood (why do these models work?);
Explainable AI and LLMs;
Alignment of LLMs (and the alignment "tax");
RLHF (Reinforcement Learning from Human Feedback) in LLMs;
Using APIs for LLMs (sketched after this list);
Prompt engineering;
Chain-of-Thought (CoT) prompting and other control mechanisms for LLMs.
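
For illustration, the following is a minimal sketch of a word-bigram statistical language model with temperature-controlled sampling, using only the Python standard library. The toy corpus, function names and temperature value are invented for this sketch and are not part of the module materials.

import random
from collections import Counter, defaultdict

corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Count bigram transitions: counts[w1][w2] = how often w2 follows w1.
counts = defaultdict(Counter)
for w1, w2 in zip(corpus, corpus[1:]):
    counts[w1][w2] += 1

def sample_next(word, temperature=1.0):
    # Weight each candidate by count ** (1 / temperature): a low temperature
    # sharpens the distribution, a high temperature flattens it.
    followers = counts[word]
    weights = [c ** (1.0 / temperature) for c in followers.values()]
    return random.choices(list(followers), weights=weights)[0]

word = "the"
generated = [word]
for _ in range(8):
    if not counts[word]:        # dead end: no observed continuation
        break
    word = sample_next(word, temperature=0.8)
    generated.append(word)
print(" ".join(generated))

Lowering the temperature makes the most frequent continuation dominate; raising it flattens the distribution, trading predictability for variety, which is the connection to "temperature settings and creative outputs" above.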
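
Likewise, a minimal sketch of byte-pair-encoding style merge learning on a toy word list; the words and the number of merges are illustrative assumptions. Production tokenizers operate on bytes and learn tens of thousands of merges.

from collections import Counter

words = ["low", "lower", "lowest", "newer", "wider"]
# Represent each word as a sequence of symbols, initially single characters.
tokens = [list(w) for w in words]

for _ in range(4):  # learn four merges
    pairs = Counter()
    for t in tokens:
        pairs.update(zip(t, t[1:]))
    if not pairs:
        break
    (a, b), _count = pairs.most_common(1)[0]
    # Merge every occurrence of the most frequent adjacent pair.
    merged = []
    for t in tokens:
        out, i = [], 0
        while i < len(t):
            if i + 1 < len(t) and t[i] == a and t[i + 1] == b:
                out.append(a + b)
                i += 2
            else:
                out.append(t[i])
                i += 1
        merged.append(out)
    tokens = merged
    print(f"merged ({a!r}, {b!r}) ->", tokens)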
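
A minimal sketch of scaled dot-product self-attention, the core operation of the transformer architecture, follows. It assumes NumPy is available; the random projections, embedding size and sequence length are illustrative, and real models add learned parameters, multiple heads, masking and positional encodings.

import numpy as np

def self_attention(X, Wq, Wk, Wv):
    # X: (seq_len, d_model) token embeddings; Wq/Wk/Wv: projection matrices.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])         # pairwise token similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # weighted mixture of values

rng = np.random.default_rng(0)
d = 4
X = rng.normal(size=(3, d))                         # three "tokens"
out = self_attention(X, *(rng.normal(size=(d, d)) for _ in range(3)))
print(out.shape)                                    # (3, 4): one contextualised vector per token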
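
Finally, a sketch of obtaining a continuation for a prompt from a pre-trained model via the Hugging Face transformers library; the model choice (gpt2) and the generation parameters are illustrative assumptions, not the specific tools prescribed by the module.

from transformers import pipeline

# Small, freely available model; any text-generation checkpoint would do.
generator = pipeline("text-generation", model="gpt2")
prompt = "Large language models generate text by"
result = generator(prompt, max_new_tokens=30, do_sample=True, temperature=0.9)
print(result[0]["generated_text"])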

Student Effort Hours:

Student Effort Type           Hours
Lectures                         24
Laboratories                     24
Autonomous Student Learning      68
Total                           116

Approaches to Teaching and Learning:
interactive lectures, with detailed slide decks and other content;
illustrative examples with real LLMs;
task-based learning with practical, hands-on exercises;
practical assignments and tasks;
domain-specific examples.
 
Requirements, Exclusions and Recommendations

Not applicable to this module.


Module Requisites and Incompatibles
Not applicable to this module.
 
Assessment Strategy

Assignment (Including Essay): A practical use of LLM technology with programming or exploratory development.
Timing: n/a. Component scale: Alternative linear conversion grade scale 40%. Must pass: No. % of final grade: 10.

Individual Project: A substantial project to develop an intelligent system around LLM-based services.
Timing: n/a. Component scale: Alternative linear conversion grade scale 40%. Must pass: No. % of final grade: 60.


Carry forward of passed components: No
 
Remediation type: In-Module Resit. Remediation timing: Prior to relevant Programme Exam Board.
Please see Student Jargon Buster for more information about remediation types and timing. 
Feedback Strategy/Strategies

• Feedback individually to students, post-assessment

How will my Feedback be Delivered?

Not yet recorded.

Speech and Language Processing. Daniel Jurafsky & James H. Martin. See: https://web.stanford.edu/~jurafsky/slp3/