Learning Outcomes:
This module introduces the inner workings and practical applications of large language models, as applied to generative and creative tasks. Students will be able to build their own working systems that exploit LLMs to generate novel content, and to use standard APIs for accessing very large LMs from third-party providers.
Indicative Module Content:
Languages, formal and natural;
Conventional NLP vs. the "new" NLP;
Statistical language models (n-gram and neural; see the bigram sketch after this list);
Tokenization (byte-pair encoding; greedy tokenization; token "healing"; see the BPE sketch after this list);
Vector representations and latent spaces;
Encoders, decoders and auto-encoders;
Distributed representations;
Positional encoding (see the sinusoidal sketch after this list);
Contrastive Language-Image Pre-training (CLIP);
Transformer architectures;
T5 (Text-to-Text Transfer Transformer) models;
Attention and self-attention (see the scaled dot-product sketch after this list);
Prompts and continuations;
Temperature settings and creative outputs (see the sampling sketch after this list);
"Hallucinations" and creativity in LLMs;
Poking under the hood (why do these models work?);
Explainable AI and LLMs;
Alignment of LLMs (and the alignment "tax");
RLHF in LLMs (Reinforcement Learning from Human Feedback);
Using APIs for LLMs (see the API sketch after this list);
Prompt engineering;
Chain-of-Thought (CoT) and other control mechanisms for LLMs (illustrated in the API sketch after this list).
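
The sketches below illustrate several of the topics above. All are minimal, self-contained Python written for the classroom; each is a sketch under stated assumptions, not a reference implementation.

First, a statistical language model in its simplest (bigram) form: next-word probabilities are estimated from adjacent-pair counts over a toy corpus, and text is generated by sampling from those counts.

    import random
    from collections import Counter, defaultdict

    # Toy corpus; a real n-gram model would be trained on far more text.
    corpus = "the cat sat on the mat and the cat slept".split()

    # Count how often each word follows each other word (bigram counts).
    counts = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        counts[prev][nxt] += 1

    def sample_next(word):
        candidates = counts[word]
        if not candidates:  # unseen context: back off to a uniform draw
            return random.choice(corpus)
        # Sample in proportion to bigram frequency.
        return random.choices(list(candidates), weights=candidates.values())[0]

    random.seed(0)
    word, generated = "the", ["the"]
    for _ in range(6):
        word = sample_next(word)
        generated.append(word)
    print(" ".join(generated))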
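
Byte-pair encoding (Sennrich et al., 2016) learns a tokenizer by repeatedly merging the most frequent adjacent pair of symbols, starting from single characters. A toy learner over a four-word corpus:

    from collections import Counter

    def learn_bpe(words, num_merges):
        # Each word starts as a tuple of single-character symbols.
        vocab = Counter(tuple(w) for w in words)
        merges = []
        for _ in range(num_merges):
            # Count every adjacent symbol pair, weighted by word frequency.
            pair_counts = Counter()
            for symbols, freq in vocab.items():
                for pair in zip(symbols, symbols[1:]):
                    pair_counts[pair] += freq
            if not pair_counts:
                break
            best = pair_counts.most_common(1)[0][0]
            merges.append(best)
            # Apply the new merge everywhere it occurs.
            new_vocab = Counter()
            for symbols, freq in vocab.items():
                merged, i = [], 0
                while i < len(symbols):
                    if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                        merged.append(symbols[i] + symbols[i + 1])
                        i += 2
                    else:
                        merged.append(symbols[i])
                        i += 1
                new_vocab[tuple(merged)] += freq
            vocab = new_vocab
        return merges

    # Learns merges such as ('l','o'), then ('lo','w'), then ('low','e').
    print(learn_bpe(["low", "low", "lower", "lowest"], num_merges=3))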
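
Sinusoidal positional encoding, from the original Transformer paper, gives each sequence position a distinctive pattern of sines and cosines that is added to the token embeddings so the model can tell positions apart. This sketch assumes an even embedding width d_model.

    import numpy as np

    # PE(pos, 2i)   = sin(pos / 10000**(2i / d_model))
    # PE(pos, 2i+1) = cos(pos / 10000**(2i / d_model))
    def positional_encoding(seq_len, d_model):
        positions = np.arange(seq_len)[:, None]      # shape (seq_len, 1)
        dims = np.arange(0, d_model, 2)[None, :]     # shape (1, d_model/2)
        angles = positions / np.power(10000.0, dims / d_model)
        pe = np.zeros((seq_len, d_model))
        pe[:, 0::2] = np.sin(angles)   # even dimensions get sine
        pe[:, 1::2] = np.cos(angles)   # odd dimensions get cosine
        return pe

    print(positional_encoding(seq_len=4, d_model=8).round(3))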
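
Scaled dot-product self-attention in plain NumPy: every token forms query, key, and value vectors, and each output is a softmax-weighted mixture of all the values. The random matrices here stand in for learned parameters.

    import numpy as np

    def self_attention(X, Wq, Wk, Wv):
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)   # (seq, seq) similarity scores
        # Row-wise softmax turns scores into attention weights.
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ V                # each row mixes all value vectors

    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 16))          # 5 tokens, 16-dim embeddings
    Wq, Wk, Wv = (rng.normal(size=(16, 16)) for _ in range(3))
    print(self_attention(X, Wq, Wk, Wv).shape)   # -> (5, 16)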
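
Temperature rescales a model's output logits before the softmax: low temperatures concentrate probability on the likeliest token (safer, more repetitive text), while high temperatures flatten the distribution (more surprising, "creative" text). The logits below are hypothetical.

    import numpy as np

    rng = np.random.default_rng(0)

    def sample(logits, temperature=1.0):
        scaled = np.asarray(logits) / temperature   # divide logits by T
        probs = np.exp(scaled - scaled.max())       # numerically stable softmax
        probs /= probs.sum()
        return rng.choice(len(probs), p=probs)

    logits = [2.0, 1.0, 0.5, -1.0]   # hypothetical next-token logits
    for t in (0.2, 1.0, 2.0):
        picks = [sample(logits, temperature=t) for _ in range(1000)]
        print(f"T={t}: {np.bincount(picks, minlength=4) / 1000}")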
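
Finally, a sketch of calling a third-party LLM through an API, here with a zero-shot Chain-of-Thought cue in the prompt. It assumes the OpenAI Python SDK (v1-style client) with an OPENAI_API_KEY in the environment; the model name is illustrative, and other providers expose similar chat-style endpoints.

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    question = ("A bat and a ball cost $1.10 in total. The bat costs $1.00 "
                "more than the ball. How much does the ball cost?")

    # Zero-shot Chain-of-Thought (Kojima et al., 2022): a reasoning cue in
    # the prompt elicits intermediate steps before the final answer.
    response = client.chat.completions.create(
        model="gpt-4o-mini",   # illustrative model name
        messages=[{"role": "user",
                   "content": f"{question}\nLet's think step by step."}],
        temperature=0.7,       # see the temperature sketch above
    )
    print(response.choices[0].message.content)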