Actuarial Data Science
Open Learning Resource (version 1.0, 2026)
Overview
This is an Open Learning Resource for Actuarial Data Science. It covers an end-to-end problem-solving process that applies data science techniques to data problems in a business context.
Lectures
- Chapter 1: Introduction
- Chapter 2: Problem Statement
- Chapter 3: Data
- Chapter 4: (Predictive and Inferential) Modelling
- Chapter 5: Evaluation
- Chapter 6: Communication
- Chapter 7: Ethics
Labs
- Chapters 1 & 2: Introduction & Problem Statement
- Chapter 3: Data
- Chapter 4: Modelling
- Chapter 5: Evaluation
Industry case competition: Actuarial Datathon
The Actuarial Datathon is an industry case competition embedded in the UNSW course Actuarial Data Science Applications (ACTL4305 / ACTL5305), which sits on the Actuaries Institute Part II Data Science Principles pathway. It is designed so that a single, coherent business challenge runs through the term while several partners contribute different lenses—data and domain context, consulting practice, and entrepreneurship.
Industry partners have included insurers (providing business context and data for industry case challenges), consulting firms (providing thought leadership and guest teaching on insurance data science in practice), and venture capital firms (providing an entrepreneurial perspective and hosting the final pitch). Together, this mirrors how real projects sit at the intersection of data science, communication, innovation, and commercial decision-making.
How it connects to this open resource. You can treat the Datathon as the “spine” of the end-to-end workflow the materials teach:
| Course stage | What students practise in the challenge | Where to study it here |
|---|---|---|
| Problem framing & strategy | Defining the pricing problem, constraints, and competitive setting | Problem statement, Introduction |
| Data | Working with internal and external data, quality checks, features | Visualisation, Manipulation, Cleaning |
| Modelling | Interpretable risk pricing, segmentation, model choice | Modelling topics and specialised chapters (e.g. GLM, GBM) |
| Evaluation | Comparing approaches and defending choices under uncertainty | Evaluation |
| Communication & ethics | Storyline for stakeholders, responsible use of models and data | Communication, Ethics |
In the live course, teams often operate like start-ups: building models, testing strategies in a simulated market (many notional competitors and customers), and presenting to an industry panel.
Further reading and media.
The table below lists industry-integrated projects from ACTL4305/5305 (Actuarial Data Science Applications, 2020–present) and ACTL3142/5110 (Statistical Machine Learning for Risk and Insurance Applications, 2021–2022), run through the Actuarial Datathon or Sandbox projects.
| Course | Year | Project | Partner(s) | Links |
|---|---|---|---|---|
| ACTL4305/5305 | 2025 | Datathon: Travel insurance conversion — insights and growth opportunities | Freely, Zurich Cover-More, Taylor Fry | Event video |
| ACTL4305/5305 | 2024 | Datathon: Competing for pet insurance customers — a pricing competition | Fetch, AirTree, Finity | Event video · When Students Become Startups (Actuaries Digital) |
| ACTL4305/5305 | 2023 | Sandbox: Understanding bushfire event risk across Australia | Suncorp | Event video |
| ACTL4305/5305 | 2022 | Sandbox: Multi-coverage claim modelling for insurance packaged products | IAG | Event video |
| ACTL4305/5305 | 2021 | Sandbox: Pricing models for SME building insurance with high-cardinality features | Suncorp | Event video |
| ACTL3142/5110 | 2022 | Sandbox: Predicting claims inflation for commercial auto insurance pricing | IAG | Event video |
AI-assisted music: Teaching in Tune
Teaching in Tune on SoundCloud — Exploring AI-assisted learning through songs.
Teaching in Tune is an AI-assisted playlist that turns quantitative ideas from machine learning into songs that are memorable, human, and fun to learn from. Tracks tie to ACTL4305/5305 (Actuarial Data Science Applications). Highlights include:
- From Data to Value — Course theme song. The core mission of actuarial data science: extracting value from data through careful analysis.
- Link It Right — an upbeat introduction to generalised linear models (GLMs).
- Don’t Overfit — a playful take on shrinkage (LASSO, ridge, elastic net) and the bias–variance trade-off.
- In the Forest of Trees (Random Forest) — the random forest machine learning technique from the modelling topics in the course.
Hear the full playlist on SoundCloud: Teaching in Tune.
License
This repository includes references from other open books, each subject to its own license. All materials created by me are licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license. For more details, please refer to the LICENSE file included in this repository.
How to cite
If you use this textbook in your academic work, the following is an example reference in APA style.
Huang, F. (2026). Actuarial Data Science - Open Learning Resource. https://datascience.feihuang.org (GitHub Pages mirror: https://feihuangfh.github.io/actuarial_data_science_course; source: https://github.com/feihuangFH/actuarial_data_science_course).
Acknowledgement
- This online textbook is developed for teaching the course “Actuarial Data Science Applications” (coded as ACTL4305 and ACTL5305) at UNSW Sydney.
- I would like to express my gratitude to my tutors of the courses ACTL4305/5305, Yumo Dong, Xi Xin, and Salvatory Kessy, for their invaluable contributions in developing the lab materials. Special thanks to Xi Xin for his critical and comprehensive review of the materials.
- A special thanks to all the students who have taken this course from 2020 to the present, whose feedback and engagement have greatly enhanced the course over the years.
- I would also like to extend my appreciation to Dr. Patrick Laub for his assistance in building the course website using Quarto.
- The development of this course content has been informed by several key references, including the following books as well as other sources noted throughout the course materials:
  - Wickham, H., & Grolemund, G. (2016). R for data science: Import, tidy, transform, visualize, and model data. O’Reilly Media, Inc.
  - Peng, R. D., & Matsui, E. (2016). The art of data science: A guide for anyone who works with data. Skybrude Consulting, LLC.
  - James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An introduction to statistical learning. Springer. (The second edition of this book was published in 2021.)
  - Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning: Data mining, inference, and prediction. Springer Science & Business Media.
