LLMs hold promise for the actuarial field

The rapid advancement of generative artificial intelligence has led to the development of large language models (LLMs). ChatGPT and other AI chatbots are examples of tools powered by LLMs. These models specialize in natural language processing, understanding and generating human language. LLMs are trained on massive amounts of text data to perform tasks such as translation, summarization, question answering, text generation and even software coding.

LLMs hold promise for many industries and professions, including the insurance industry and the actuarial field.
LLM use cases in insurance
In March, the Society of Actuaries Research Institute convened a panel of experts to discuss the use of generative AI in the insurance industry. The panel consisted of actuaries from a variety of practice areas, and they noted several LLM insurance applications such as:
- Coding assistance: Code generation and automating documentation
- Digital assistant: Email, document creation, note taking, meeting summarization
- Data summarization and categorization: Claims data, submissions, notes; reinsurance treaties; medical underwriting files, calls and meetings
- Testing and model validation assistance: Generating test cases, testing documentation, review and validation
- Other applications: Translation, research source attribution, claims integration
The panel concluded that current AI tools, such as LLMs, can boost productivity for some tasks, but the technology hasn’t evolved enough to replicate actuarial analysis and decision-making. However, the panel predicted it will become necessary for actuaries to use these tools.
Implementing LLMs isn’t without challenges. The sensitive data insurance companies manage makes data privacy and security critical. Also, compliance with regulations and ethical standards is needed to build trust with customers and stakeholders. As a result, incorporating LLMs into current systems demands thorough planning and teamwork among various departments within the organization.
Benchmarking and comparing models
After identifying tasks that might be suited for LLMs to complete or assist with, consider which type of model best meets the needs of each task. There are four basic variants:
- Foundational models: Have not been tuned for specific tasks
- Instruct models: More fine-tuned, meant for task-oriented applications
- Code models: Specialize in understanding and generating code
- Multimodal models: Understand and generate text, images and audio
Opting for the largest and highest-performing LLM may not always be necessary or cost-effective. Other considerations include latency requirements, budgets, scalability, and ethical and bias issues. Experimentation and evaluating the results are helpful in choosing the appropriate LLM.
Finding the right LLM for a specific task depends on these considerations:
- Model size and computational requirements
| Need | Requirement | Size |
| --- | --- | --- |
| Simple tasks, quick responses | Less powerful hardware | Smaller model |
| Complex reasoning | More computational resources | Larger model |
- Task-specific performance
- Context window size: The amount of text the model can take into account in a single interaction
- Cost vs. performance
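As a rough illustration, the size-versus-need trade-off in the table above can be sketched as a simple selection helper. The tier names and decision rules below are hypothetical, not vendor recommendations:

```python
# Illustrative only: maps task needs to a model tier, mirroring the
# size-versus-requirement trade-off in the table above.
# Tier names and rules are hypothetical, not vendor recommendations.

def recommend_model_tier(needs_complex_reasoning: bool,
                         latency_sensitive: bool) -> str:
    """Pick a rough model tier from two simple task attributes."""
    if needs_complex_reasoning:
        # Complex reasoning generally favors a larger model,
        # accepting higher cost and compute requirements.
        return "large"
    if latency_sensitive:
        # Simple tasks with quick-response needs suit a smaller model
        # that runs on less powerful hardware.
        return "small"
    # Middle ground when neither constraint dominates.
    return "medium"

print(recommend_model_tier(needs_complex_reasoning=True, latency_sensitive=False))  # large
print(recommend_model_tier(needs_complex_reasoning=False, latency_sensitive=True))  # small
```

In practice, the decision also weighs budget, context window size and measured task performance, so a real selection process would score candidate models against a benchmark rather than rely on rules of thumb alone.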
LLM benchmarks are assessment tools that compare models’ strengths and limitations. Benchmarks fall into several categories; the table below lists some of these categories, representative benchmarks within each, and a description of how each works:
| Benchmark category | Benchmark product | Description |
| --- | --- | --- |
| Knowledge and recall | Massive Multitask Language Understanding (MMLU) | Uses about 16,000 multiple-choice questions across a range of topics, from mathematics to law. |
| Knowledge and recall | Google-Proof Question and Answering (GPQA) | 448 multiple-choice questions written by experts in biology, physics and chemistry. Tests a model’s expert-level knowledge. |
| Mathematics | Mathematics Aptitude Test of Heuristics (MATH) | 12,500 problems from mathematics competitions, covering a range of difficulty levels and math topics. Requires LLMs to demonstrate their reasoning. |
| Coding | HumanEval | Assesses an LLM’s code-writing capabilities. Consists of 164 programming problems that the LLM is required to synthesize. |
| Reading comprehension | Discrete Reasoning Over the Content of Paragraphs (DROP) | A Q&A dataset that assesses an LLM’s ability to understand and extract information from inputs. |
The best way to evaluate LLMs is to create a benchmark that is tailored to a specific task. This method not only gives a more accurate measure of performance for that task but also supports development and enables ongoing performance tracking.
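A task-specific benchmark can be as simple as a scored set of prompt/expected-answer pairs. The sketch below illustrates the idea with a stubbed model function standing in for a real LLM call; all names and test cases are hypothetical:

```python
# Minimal sketch of a task-specific benchmark: score a model's answers
# against expected answers for a fixed set of prompts. The model function
# here is a stub; in practice it would call a real LLM.

BENCHMARK = [
    {"prompt": "What does IBNR stand for?", "expected": "incurred but not reported"},
    {"prompt": "What does ALM stand for?", "expected": "asset liability management"},
]

def stub_model(prompt: str) -> str:
    """Placeholder for a real LLM call; returns canned answers."""
    answers = {
        "What does IBNR stand for?": "Incurred But Not Reported",
        "What does ALM stand for?": "asset and liability matching",
    }
    return answers.get(prompt, "")

def run_benchmark(model) -> float:
    """Return the fraction of cases where the expected answer appears."""
    hits = 0
    for case in BENCHMARK:
        answer = model(case["prompt"]).lower()
        if case["expected"] in answer:
            hits += 1
    return hits / len(BENCHMARK)

score = run_benchmark(stub_model)
print(f"accuracy: {score:.2f}")  # 0.50 with the stub above
```

Running the same benchmark after each model or prompt change supports the ongoing performance tracking described above.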
Deploying an LLM
The easiest way to use an LLM is through an application programming interface (API) from a major provider, such as OpenAI, the developer of ChatGPT. While hosting an LLM independently offers more control, using an API is simpler, faster and more cost-effective. It is important to ensure that the chosen provider meets security and privacy standards.
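As a sketch of the API route, a request to an OpenAI-style chat-completions endpoint is a small JSON payload. The example below only builds and inspects the payload rather than sending it; the endpoint URL and model name are illustrative, and a real call would also require an API key from the provider:

```python
import json

# Illustrative only: builds a chat-completion request payload in the
# OpenAI-style API format without sending it. A real call would POST
# this body to the provider's endpoint with an Authorization header.

API_URL = "https://api.example.com/v1/chat/completions"  # hypothetical endpoint

def build_request(user_message: str, model: str = "example-model") -> dict:
    """Assemble the JSON body for a single-turn chat completion."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are an assistant for actuarial work."},
            {"role": "user", "content": user_message},
        ],
        "temperature": 0.2,  # low temperature for more deterministic answers
    }

payload = build_request("Summarize the key terms of this reinsurance treaty.")
print(json.dumps(payload, indent=2))
```

Keeping the payload construction separate from the network call also makes it easier to log requests and swap providers, which helps with the security and privacy review mentioned above.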
Launching an open LLM follows a similar process to other software deployments, although the details can differ depending on the specific LLM being used. There are a variety of deployment methods, from software for beginners to more robust solutions that are appropriate for production environments. As far as where to locate the LLM, the cloud offers a simpler solution compared to building an independent server.
Because deploying LLMs falls outside typical actuarial training and expertise, it is recommended to seek assistance from cloud engineers and software developers.
Assessing risk and maintaining governance
Actuaries are experts in risk management and governance and have extensive knowledge about technology and data. Their expertise and professional standards make it crucial that they have key roles in the responsible and ethical use of AI and LLMs.
Risk and ethics considerations are essential in choosing LLMs for responsible actuarial use. For example, it is important to choose a provider that shares the organization’s view of ethical AI practices and whose AI governance structure the organization is comfortable with.
Other provider considerations include:
- Privacy and protection: Ensuring models and their providers meet privacy and data protection requirements.
- Risk and compliance: Regularly reviewing LLM output to ensure it meets compliance requirements.
- Technology and reliability: Ensuring the model has the necessary capabilities, performs consistently and offers sufficient technical support.
- Bias, fairness and discrimination: Confirming the LLM addresses these risks.
- Transparency and explainability: Documenting model specifications and how it is used, logging outputs, and detailing the development process.
- Accountability and responsibility: Establishing clear lines of accountability and responsibility to oversee decision-making with LLM help.
Below are two important resources that provide a high-level overview of AI ethics:
- UNESCO’s Recommendation on the Ethics of Artificial Intelligence
- The National Association of Insurance Commissioners Principles on Artificial Intelligence, particularly relevant to the financial services and insurance sectors.
SOA resources to help actuaries leverage AI
The SOA Research Institute published a detailed guide on deploying LLMs for actuarial use, Operationalizing LLMs: A Guide for Actuaries, which provides more details and helpful tips. Additionally, SOA’s AI Research landing page provides a library of reports and resources, including the monthly Actuarial Intelligence Bulletin, which informs readers about advancements in actuarial technology and new AI research reports.
© Entire contents copyright 2025 by InsuranceNewsNet.com Inc. All rights reserved. No part of this article may be reprinted without the expressed written consent from InsuranceNewsNet.com.