Actuarial Data Science

Open Learning Resource (version 1.0, 2026)

Author

Fei Huang, UNSW Sydney

Overview

This is an Open Learning Resource for Actuarial Data Science. It covers an end-to-end problem-solving process that applies data science techniques to data problems in a business context.

Start here

Use the left sidebar to open any chapter, or jump to Lectures and Labs below; the Actuarial Datathon section comes next and explains how the industry challenge maps to these chapters. Each list item links to the HTML notes; many rows also include PDF or slide downloads.

Teaching in Tune, a set of AI-assisted songs that turn course ideas (GLMs, shrinkage, random forests, and more) into memorable tracks, is in the sidebar as Teaching in Tune. You can also jump straight to the music section for track notes and an embedded SoundCloud player on this page.

Lectures

Industry case competition: Actuarial Datathon

The Actuarial Datathon is an industry case competition embedded in the UNSW course Actuarial Data Science Applications (ACTL4305 / ACTL5305), which sits on the Actuaries Institute Part II Data Science Principles pathway. It is designed so that a single, coherent business challenge runs through the term while several partners contribute different lenses—data and domain context, consulting practice, and entrepreneurship.

Industry partners have included insurers (providing business context and data for industry case challenges), consulting firms (providing thought leadership and guest teaching on insurance data science in practice), and venture capital firms (providing an entrepreneurial perspective and hosting the final pitch). Together, these contributions mirror how real projects sit at the intersection of data science, communication, innovation, and commercial decision-making.

How it connects to this open resource. You can treat the Datathon as the “spine” of the end-to-end workflow the materials teach:

| Course stage | What students practise in the challenge | Where to study it here |
|---|---|---|
| Problem framing & strategy | Defining the pricing problem, constraints, and competitive setting | Problem statement, Introduction |
| Data | Working with internal and external data, quality checks, features | Visualisation, Manipulation, Cleaning |
| Modelling | Interpretable risk pricing, segmentation, model choice | Modelling topics and specialised chapters (e.g. GLM, GBM) |
| Evaluation | Comparing approaches and defending choices under uncertainty | Evaluation |
| Communication & ethics | Storyline for stakeholders, responsible use of models and data | Communication, Ethics |

In the live course, teams often operate like start-ups: building models, testing strategies in a simulated market (many notional competitors and customers), and presenting to an industry panel.

Further reading and media.

The table below lists industry-integrated projects from ACTL4305/5305 (Actuarial Data Science Applications, 2020–present) and ACTL3142/5110 (Statistical Machine Learning for Risk and Insurance Applications, 2021–2022), run through the Actuarial Datathon or Sandbox projects.

| Course | Year | Project | Partner(s) | Links |
|---|---|---|---|---|
| ACTL4305/5305 | 2025 | Datathon: Travel insurance conversion — insights and growth opportunities | Freely, Zurich Cover-More, Taylor Fry | Event video |
| ACTL4305/5305 | 2024 | Datathon: Competing for pet insurance customers — a pricing competition | Fetch, AirTree, Finity | Event video · When Students Become Startups (Actuaries Digital) |
| ACTL4305/5305 | 2023 | Sandbox: Understanding bushfire event risk across Australia | Suncorp | Event video |
| ACTL4305/5305 | 2022 | Sandbox: Multi-coverage claim modelling for insurance packaged products | IAG | Event video |
| ACTL4305/5305 | 2021 | Sandbox: Pricing models for SME building insurance with high-cardinality features | Suncorp | Event video |
| ACTL3142/5110 | 2022 | Sandbox: Predicting claims inflation for commercial auto insurance pricing | IAG | Event video |

For UNSW students

Briefing packs, datasets, deadlines, and assessment weightings are published on your course Moodle page each term and may differ from earlier offerings. Use this textbook for methods; use Moodle for the authoritative task sheet.

AI-assisted music: Teaching in Tune

Teaching in Tune on SoundCloud: exploring AI-assisted learning through songs.

Play on this page (SoundCloud widget — use the controls below, or open the set on SoundCloud if the player does not load, e.g. due to browser privacy settings).

Teaching in Tune is an AI-assisted playlist that turns quantitative ideas from machine learning into songs that are memorable, human, and fun to learn from. Tracks tie to ACTL4305/5305 (Actuarial Data Science Applications). Highlights include:

  • From Data to Value — Course theme song. The core mission of actuarial data science: extracting value from data through careful analysis.
  • Link It Right — an upbeat introduction to generalised linear models (GLMs).
  • Don’t Overfit — a playful take on shrinkage (LASSO, ridge, elastic net) and the bias–variance trade-off.
  • In the Forest of Trees (Random Forest) — the random forest machine learning technique from the modelling topics in the course.

Hear the full playlist on SoundCloud: Teaching in Tune.

About the Author

Dr. Fei Huang

School of Risk and Actuarial Studies, UNSW Business School

Email: feihuang@unsw.edu.au · Website: feihuang.org

Dr. Fei Huang is an Associate Professor (with tenure) in Risk and Actuarial Studies at UNSW Business School. She holds degrees from Xiamen University (BSc), the University of Hong Kong (MPhil), and the Australian National University (PhD). Her research sits at the intersection of responsible AI, insurance, and data-driven decision-making, with emphasis on insurance and retirement systems that stay fair, sustainable, and resilient amid technological and climate change. She draws on statistics, machine learning, economics, and actuarial science to develop approaches that are accurate, interpretable, and equitable.

Her work has been recognised with awards including the Australian Business Deans Council Award for Innovation and Excellence in Research, the Dean’s Award for Distinction, the North American Actuarial Journal Best Paper Award, and the Actuaries Institute Volunteer of the Year Award, and is supported by competitive funding such as Australian Research Council Discovery Projects and the National Industry PhD Program. She is a columnist for Actuaries Digital, works with industry and government on topics from fair pricing to longevity and climate resilience, and has advised regulators internationally—including as an invited expert at the New York State Assembly public hearing on AI in insurance. At UNSW she teaches actuarial data science and responsible AI and has received multiple teaching excellence awards.

License

This repository includes references from other open books, each subject to its own license. All materials created by me are licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license. For more details, please refer to the LICENSE file included in this repository.

How to cite

If you use this textbook in your academic work, the example below shows how to reference it in APA style.

Huang, F. (2026). Actuarial Data Science - Open Learning Resource. https://datascience.feihuang.org (GitHub Pages mirror: https://feihuangfh.github.io/actuarial_data_science_course; source: https://github.com/feihuangFH/actuarial_data_science_course).

Acknowledgement

  • This online textbook is developed for teaching the course “Actuarial Data Science Applications” (coded as ACTL4305 and ACTL5305) at UNSW Sydney.
  • I would like to express my gratitude to my tutors of the courses ACTL4305/5305, Yumo Dong, Xi Xin, and Salvatory Kessy, for their invaluable contributions in developing the lab materials. Special thanks to Xi Xin for his critical and comprehensive review of the materials.
  • A special thanks to all the students who have taken this course from 2020 to the present, whose feedback and engagement have greatly enhanced the course over the years.
  • I would also like to extend my appreciation to Dr. Patrick Laub for his assistance in building the course website using Quarto.
  • The development of this course content has been informed by several key references, including but not limited to the following book references, as well as other sources noted throughout the course materials:
    • Wickham, H., & Grolemund, G. (2016). R for data science: Import, tidy, transform, visualize, and model data. O’Reilly Media.
    • Peng, R. D., & Matsui, E. (2016). The art of data science: A guide for anyone who works with data. Skybrude Consulting, LLC.
    • James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An introduction to statistical learning. Springer. (Second edition published in 2021.)
    • Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning: Data mining, inference, and prediction. Springer Science & Business Media.