Approximation of Aggregate Losses Using Simulation

Problem statement: The modeling of aggregate losses is one of the main objectives in actuarial theory and practice, especially in the process of making important business decisions regarding various aspects of insurance contracts. The aggregate losses over a fixed time period are often modeled by combining the distributions of loss frequency and severity, and the distribution resulting from this approach is called a compound distribution. However, in many cases, realistic probability distributions for loss frequency and severity cannot be combined mathematically to derive the compound distribution of aggregate losses. Approach: This study aimed to approximate the aggregate loss distribution using a simulation approach. In particular, the approximation of aggregate losses was based on a compound Poisson-Pareto distribution. The effects of deductible and policy limit on the individual loss as well as the aggregate losses were also investigated. Results: The approximation of the compound Poisson-Pareto distribution via simulation agreed with the theoretical mean and variance of each of the loss frequency, loss severity and aggregate losses. Conclusion: This study approximated the compound distribution of aggregate losses using a simulation approach. The investigation of retained losses and insurance claims allows an insured or a company to select an insurance contract that fulfills its requirements. In particular, if a company wants additional risk reduction, it can compare alternative policies by considering whether the additional expected total cost, which can be estimated via simulation, is worthwhile.


INTRODUCTION
Let $X_1, X_2, \ldots, X_N$ denote the amounts of loss of an insurance portfolio that recorded N losses over a fixed time period. If N is a random variable independent of $X_i$, i = 1,2,…,N, which are independent and identically distributed (i.i.d.), then the aggregate losses are calculated as $S = \sum_{i=1}^{N} X_i$. Because the exact cumulative distribution function (c.d.f.) of S, $F_S(s)$, is numerically difficult to compute, several approximation methods have been suggested and tested, namely the Fast Fourier Transform, inversion method, recursive method, Heckman-Meyers method and Panjer method (Heckman and Meyers, 1983; Von Chossy and Rappl, 1983; Pentikainen, 1977; Jensen, 1991). All of these approaches are based on the assumption that the distributions of loss frequency and severity are available separately. In other cases, due to incomplete information on the separate frequency and severity distributions, only the aggregate loss distribution is available for further statistical estimation and inference. Nevertheless, several studies on aggregate loss distributions have been carried out. Examples can be found in Dropkin (1964) and Bickerstaff (1972), who showed that the Lognormal distribution closely approximates certain types of homogeneous loss data; Pentikainen (1977), who improved the results of the Normal approximation by suggesting the Normal Power method; Seal (1977), who compared the Normal Power method with the Gamma approximation and concluded that the Gamma provides a generally better approximation; Venter (1983), who suggested transformed Gamma and transformed Beta distributions for approximating aggregate losses; Chaubey et al. (1998), who proposed the Inverse Gaussian distribution for approximation of aggregate losses; and Papush et al. (2001), who compared the Gamma distribution with the Normal and Lognormal distributions and concluded that the Gamma provides a better fit for aggregate losses.
Another approach that can be used for approximating aggregate losses is the method of simulation. The main advantage of simulation is that it can be performed for several mixtures of loss frequency and severity distributions, thus producing approximated results for the compound distribution of aggregate losses. In addition, the effects of deductible and policy limit on individual loss as well as aggregate losses can also be investigated and the simulation programming can be implemented in a fairly straightforward manner.

Collective risk model:
In a collective risk model, the c.d.f. of aggregate losses, S, can be calculated numerically as:

$$F_S(x) = \Pr(S \le x) = \sum_{n=0}^{\infty} p_n F_X^{*n}(x) \tag{1}$$

where $F_X(x) = \Pr(X \le x)$ denotes the common c.d.f. of each of $X_1, X_2, \ldots, X_N$, $p_n = \Pr(N = n)$ the probability mass function (p.m.f.) of N and $F_X^{*n}(x) = \Pr(X_1 + \cdots + X_n \le x)$ the n-fold convolution of $F_X(\cdot)$. The distribution of S resulting from this approach is called a compound distribution.
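If X is discrete on 0,1,2,..., the k-th convolution of $F_X(\cdot)$ is:

$$F_X^{*k}(x) = \begin{cases} I(x \ge 0), & k = 0 \\ F_X(x), & k = 1 \\ \sum_{y=0}^{x} F_X^{*(k-1)}(x-y)\, f_X(y), & k = 2, 3, \ldots \end{cases} \tag{2}$$

where $f_X(\cdot)$ is the p.m.f. of X and $I(\cdot)$ is the indicator function. Based on Eq. 1, a direct approach for calculating the c.d.f. of S requires the calculation of the n-fold convolutions of $F_X(\cdot)$, so the computation of the aggregate loss distribution can be rather complicated. In many cases, realistic probability distributions for loss frequency and severity cannot be combined mathematically to derive the distribution of aggregate losses.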
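To illustrate why the direct approach becomes laborious, the following is a minimal Python sketch of Eq. 1 for a discrete severity. It is not part of the original study: the function name `compound_pmf`, the truncation of the frequency distribution at 60 losses and the three-point severity distribution are all assumptions chosen purely for illustration.

```python
import numpy as np
from math import exp, factorial


def compound_pmf(freq_pmf, sev_pmf, s_max):
    """Return Pr(S = s) for s = 0..s_max by direct convolution.

    freq_pmf[n] = Pr(N = n), truncated at len(freq_pmf) - 1 losses.
    sev_pmf[x]  = Pr(X = x) for a severity on 0, 1, 2, ...
    """
    sev = np.asarray(sev_pmf, dtype=float)
    conv = np.zeros(s_max + 1)
    conv[0] = 1.0  # 0-fold convolution: S = 0 with certainty
    pmf_s = np.zeros(s_max + 1)
    for p_n in freq_pmf:
        pmf_s += p_n * conv
        # Next convolution power, kept only on 0..s_max.
        conv = np.convolve(conv, sev)[: s_max + 1]
    return pmf_s


# Hypothetical inputs: Poisson frequency with lambda = 2 (truncated at n = 60)
# and a severity taking the values 1, 2, 3 with probabilities 0.5, 0.3, 0.2.
lam = 2.0
freq = [exp(-lam) * lam**n / factorial(n) for n in range(61)]
sev = [0.0, 0.5, 0.3, 0.2]
pmf_s = compound_pmf(freq, sev, s_max=60)
cdf_s = np.cumsum(pmf_s)
print(f"Pr(S = 0) = {pmf_s[0]:.4f}, Pr(S <= 5) = {cdf_s[5]:.4f}")
```

Each additional term of the sum needs one more convolution, and a continuous severity such as the Pareto would first have to be discretized, which is the practical difficulty that motivates simulation.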
Simulation:
Based on the actuarial literature, several methods can be applied to approximate the aggregate loss distribution. For example, the Normal and Gamma approximations are fairly easy to use since they are pre-programmed in several mathematical and statistical software packages. Other distributions are less convenient to compute but may produce a more accurate approximation; one such instance is the Inverse Gaussian and Gamma mixture suggested by Chaubey et al. (1998).
This study aims to approximate the aggregate loss distribution using a simulation approach. The following steps summarize the algorithm:
• The distributions for loss frequency and severity are chosen, if possible, based on the analysis of historical data
• Accordingly, the number of losses, N, is generated from a Poisson, Binomial or Negative Binomial distribution
• For i = 1 to i = N, the individual loss amount, $X_i$, is generated from a Gamma, Pareto, Lognormal or other positively skewed distribution
• The aggregate losses, S, are obtained as the sum of all $X_i$
• The same steps are then repeated to provide an estimate of the probability distribution of aggregate losses

In addition to the approximation of aggregate losses, the effects of deductible and policy limit on the individual loss amount can also be investigated by simulation. Let X denote the random variable for the individual amount of loss. When an insurer introduces a deductible, say at the value of d, the loss retained by the insured can be represented by the random variable Y:

$$Y = \begin{cases} X, & X \le d \\ d, & X > d \end{cases} \tag{3}$$

whereas the loss paid as a claim by the insurer can be represented by the random variable Z:

$$Z = \begin{cases} 0, & X \le d \\ X - d, & X > d \end{cases} \tag{4}$$

so that X = Y+Z. When an insurer introduces a policy limit in its coverage, say at the value of u, the loss retained by the insured can be represented by the random variable Y:

$$Y = \begin{cases} 0, & X \le u \\ X - u, & X > u \end{cases} \tag{5}$$

whereas the loss paid as a claim by the insurer can be represented by the random variable Z:

$$Z = \begin{cases} X, & X \le u \\ u, & X > u \end{cases} \tag{6}$$

Therefore, if an insurance contract contains both a deductible, d, and a policy limit, u, the loss retained by the insured can be represented by the random variable Y:

$$Y = \begin{cases} X, & X \le d \\ d, & d < X \le d + u \\ X - u, & X > d + u \end{cases} \tag{7}$$

whereas the loss paid as a claim by the insurer can be represented by the random variable Z:

$$Z = \begin{cases} 0, & X \le d \\ X - d, & d < X \le d + u \\ u, & X > d + u \end{cases} \tag{8}$$

The deductible and policy limit need not be applied only to individual losses; they can also be applied to aggregate losses.
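The five simulation steps listed above can be sketched in Python as follows. This is a minimal illustration rather than the program used in the study: the function name `simulate_aggregate_losses` and the seed are assumptions, the Poisson and Pareto choices correspond to one of the combinations named in the steps, and the Pareto is taken in its two-parameter (Lomax) form with mean β/(α−1).

```python
import numpy as np


def simulate_aggregate_losses(lam, alpha, beta, n_sims=1000, seed=None):
    """Simulate n_sims periods of a compound Poisson-Pareto model.

    Returns (counts, totals): the number of losses N and the aggregate
    losses S for each simulated period.
    """
    rng = np.random.default_rng(seed)
    # Step 2: number of losses N in each period.
    counts = rng.poisson(lam, size=n_sims)
    # Step 3: individual losses X_i. NumPy's pareto() draws a Lomax
    # (Pareto type II) variate with unit scale, so multiply by beta.
    severities = beta * rng.pareto(alpha, size=int(counts.sum()))
    # Step 4: add up the losses belonging to each period.
    period_index = np.repeat(np.arange(n_sims), counts)
    totals = np.bincount(period_index, weights=severities, minlength=n_sims)
    return counts, totals


# Step 5: repeating the draw n_sims times gives an empirical distribution of S.
counts, totals = simulate_aggregate_losses(
    lam=20, alpha=10, beta=135_000, n_sims=1000, seed=42
)
print(f"mean N = {counts.mean():.2f}")
print(f"mean S = {totals.mean():,.0f}, sd S = {totals.std(ddof=1):,.0f}")
```

Drawing all severities in one call and summing them by period avoids an explicit loop over simulations; a plain loop that generates N and then N losses per period would give the same distribution.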
The main advantage of the simulation approach is that it allows a deductible and policy limit to be implemented not only on individual losses but also on aggregate losses. Let S denote the random variable for aggregate losses. If an insurance contract contains both an aggregate deductible, $d^*$, and an aggregate policy limit, $u^*$, the aggregate losses retained by the insured can be defined by the random variable V:

$$V = \begin{cases} S, & S \le d^* \\ d^*, & d^* < S \le d^* + u^* \\ S - u^*, & S > d^* + u^* \end{cases} \tag{9}$$

whereas the aggregate losses paid as claims by the insurer can be represented by the random variable W:

$$W = \begin{cases} 0, & S \le d^* \\ S - d^*, & d^* < S \le d^* + u^* \\ u^*, & S > d^* + u^* \end{cases} \tag{10}$$

so that S = V+W.

The following steps can be used for approximating the distributions of individual loss and aggregate losses for an insurance contract containing a deductible and policy limit, and also for a contract containing an aggregate deductible and aggregate policy limit:
• The distributions for loss frequency and severity are chosen, if possible, based on the analysis of historical data
• Accordingly, the number of losses, N, is generated from a Poisson, Binomial or Negative Binomial distribution
• For i = 1 to i = N, the individual loss amount, $X_i$, is generated from a Gamma, Pareto, Lognormal or other positively skewed distribution
• The aggregate losses, S, are obtained as the sum of all $X_i$
• For an insurance contract containing a deductible and policy limit, the conditions specified by Eq. 7 and 8 are applied to all $X_i$, i = 1,…,N, producing the retained loss, $Y_i$, and the loss paid as a claim, $Z_i$, i = 1,…,N
• If a deductible and policy limit on the basis of aggregate losses are included in the insurance contract, the conditions specified by Eq. 9 and 10 are applied to the aggregate losses, S, producing the aggregate retained losses, V, and the aggregate losses paid as claims, W
• The same steps are then repeated to provide the approximated distributions of loss, X, retained loss, Y, claim, Z and aggregate losses, S, together with V and W where aggregate terms apply
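Because the per-loss conditions (Eq. 7 and 8) and the aggregate conditions (Eq. 9 and 10) share the same layered form, one helper can serve both. The sketch below is an illustration only: the name `split_loss` and the sample loss amounts are hypothetical, and the split follows the description that the insured retains the first d of a loss while the insurer pays the excess up to u.

```python
import numpy as np


def split_loss(loss, deductible=0.0, limit=np.inf):
    """Split losses into (retained, paid) for a given deductible and limit.

    paid     = amount above the deductible, capped at the limit (Z or W)
    retained = whatever the insurer does not pay (Y or V)
    """
    loss = np.asarray(loss, dtype=float)
    paid = np.clip(loss - deductible, 0.0, limit)
    retained = loss - paid
    return retained, paid


# Per-loss basis: d = 10,000 and u = 30,000 applied to each X_i.
x = np.array([5_000.0, 25_000.0, 50_000.0])  # hypothetical losses
y, z = split_loss(x, deductible=10_000, limit=30_000)
print(y, z)  # retained and paid amounts for each loss

# Aggregate basis: d* = 40,000 and u* = 60,000 applied to S = sum of X_i.
v, w = split_loss(x.sum(), deductible=40_000, limit=60_000)
print(v, w)
```

Setting `deductible=0` gives the limit-only case and leaving `limit` at its default gives the deductible-only case, so retained plus paid always equals the original loss.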

RESULTS
This section presents several results from the approximation of the aggregate loss distribution via simulation, based on a compound Poisson-Pareto distribution, which is a popular choice for modeling aggregate losses because of its desirable properties. Detailed discussions of compound models and their applications in actuarial and insurance areas can be found in Klugman et al. (2006) and Panjer (1981).
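In this example, the loss frequencies are generated from a Poisson distribution whereas the loss severities are generated from a Pareto distribution. The p.m.f., mean and variance of the Poisson distribution with parameter λ are:

$$\Pr(N = n) = \frac{e^{-\lambda}\lambda^{n}}{n!},\ n = 0, 1, 2, \ldots, \qquad E(N) = \lambda, \qquad \mathrm{Var}(N) = \lambda \tag{11}$$

whereas the probability density function (p.d.f.), mean and k-th moment about zero of the Pareto distribution are:

$$f_X(x) = \frac{\alpha\beta^{\alpha}}{(x+\beta)^{\alpha+1}},\ x > 0, \qquad E(X) = \frac{\beta}{\alpha-1}, \qquad E(X^{k}) = \frac{\beta^{k}\, k!}{(\alpha-1)(\alpha-2)\cdots(\alpha-k)} \tag{12}$$

where α is the shape parameter and β is the scale parameter. The parameter α determines the shape, with small values corresponding to a heavy right tail. The k-th moment of the distribution exists only if α>k.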
The moments of the compound distribution of S can be obtained in terms of the moments of N and X. In particular, the mean and variance of S are:

$$E(S) = E(N)\,E(X), \qquad \mathrm{Var}(S) = E(N)\,\mathrm{Var}(X) + \mathrm{Var}(N)\,[E(X)]^{2} \tag{13}$$

The frequency, severity and aggregate distributions based on 1,000 simulations are illustrated in Fig. 1-3. The simulated loss and aggregate losses are assumed to be in Ringgit Malaysia (RM). Figure 1 shows the frequency of loss assuming a Poisson distribution with an expected value of 20, i.e., λ = 20. Note that the distribution is bell shaped for an expected number of losses this large. The distribution would be more skewed for a lower expected number of losses. Figure 2 shows the amount of loss, or loss severity, from a Pareto distribution with an expected severity of RM15,000 and a standard deviation of RM16,771, i.e., α = 10 and β = 135,000. Note that most losses are less than RM10,000, but some are much larger. Figure 3 shows the aggregate losses from a compound Poisson-Pareto distribution with λ = 20, α = 10 and β = 135,000. Note that the aggregate loss distribution is also positively skewed, but not as skewed as the severity distribution, and most aggregate losses are around RM200,000-RM350,000.

Tables 1-3 show summary statistics of simulated aggregate losses based on 1,000 simulations of the compound Poisson-Pareto distribution. In Table 1, α and β are fixed and λ is increased; in Table 2, λ and β are fixed and α is increased; and in Table 3, λ and α are fixed and β is increased. In general, the results show that the mean and standard deviation of aggregate losses increase when λ or β increases, and decrease when α increases. Thus, the simulated results of the compound Poisson-Pareto distribution agree with the mean and variance of each of the loss frequency, N, loss severity, X and aggregate losses, S, shown in Eq. 11-13.
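As a worked check on the parameter values quoted above, the theoretical moments can be evaluated directly. The sketch below assumes the Pareto form whose mean is β/(α−1) and whose second moment about zero is 2β²/((α−1)(α−2)); this is the form that reproduces the reported severity mean of RM15,000 and standard deviation of RM16,771 for α = 10 and β = 135,000. The function names are hypothetical.

```python
from math import sqrt


def pareto_moments(alpha, beta):
    """Mean and variance of the Pareto severity (requires alpha > 2)."""
    mean = beta / (alpha - 1)
    second_moment = 2 * beta**2 / ((alpha - 1) * (alpha - 2))
    return mean, second_moment - mean**2


def compound_poisson_moments(lam, alpha, beta):
    """Mean and variance of S when N is Poisson, so E(N) = Var(N) = lam."""
    mean_x, var_x = pareto_moments(alpha, beta)
    mean_s = lam * mean_x
    var_s = lam * var_x + lam * mean_x**2  # E(N)Var(X) + Var(N)E(X)^2
    return mean_s, var_s


mean_x, var_x = pareto_moments(alpha=10, beta=135_000)
mean_s, var_s = compound_poisson_moments(lam=20, alpha=10, beta=135_000)
print(f"E(X) = {mean_x:,.0f}, sd(X) = {sqrt(var_x):,.0f}")
print(f"E(S) = {mean_s:,.0f}, sd(S) = {sqrt(var_s):,.0f}")
```

For λ = 20 this gives an expected aggregate loss of RM300,000, which sits inside the RM200,000-RM350,000 range where most simulated aggregate losses are reported to fall.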
Table 4 summarizes the results of 1,000 simulations for three alternative policies, assuming that the aggregate losses follow a compound Poisson-Pareto distribution with parameters λ = 30, α = 10 and β = 135,000. As an example, the first policy provides RM30,000 of coverage, or limit, per loss above a RM10,000 retention, or deductible, for a premium of RM9,152. In other words, the insured retains the first RM10,000 of each loss, whereas the insurer pays any amount of loss above RM10,000 up to the limit of RM30,000 (a total loss of RM40,000). The second policy is similar to the first, but with an increased deductible. The third policy provides RM60,000 of aggregate coverage, or limit, above a RM40,000 aggregate retention, or deductible, for a premium of RM4,179. In other words, the insured retains the first RM40,000 of aggregate losses, whereas the insurer pays any amount of aggregate losses above RM40,000 up to the limit of RM60,000 (total aggregate losses of RM100,000).
The premium is estimated as the expected value of claim costs obtained from simulation, plus a loading charge consisting of a fixed expense of RM1,000 and a variable expense of 15% of the expected claim cost, i.e., premium = 1.15E(Z)+1,000, where Z is the random variable for the claim, or loss covered by the insurer. The simulated retained losses for policies 1 and 3 are illustrated in Fig. 4 and 5. Note that the retained losses under a deductible and limit per individual loss, shown in Fig. 4, are more skewed than the retained losses under a deductible and limit on an aggregate basis, shown in Fig. 5. In particular, most retained losses are less than RM10,000 for policy 1, whereas for policy 3 most retained losses are less than RM20,000.
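The premium rule stated above is simple enough to express as a short Python sketch. The function names and the three sample claim amounts are hypothetical; only the 15% variable loading, the RM1,000 fixed expense and the RM9,152 premium of policy 1 come from the text.

```python
import numpy as np


def premium_from_claims(claims, variable_loading=0.15, fixed_expense=1000.0):
    """Premium = (1 + loading) * mean simulated claim + fixed expense."""
    expected_claim = float(np.mean(claims))
    return (1.0 + variable_loading) * expected_claim + fixed_expense


def implied_expected_claim(premium, variable_loading=0.15, fixed_expense=1000.0):
    """Invert the premium rule to recover the expected claim cost E(Z)."""
    return (premium - fixed_expense) / (1.0 + variable_loading)


# Hypothetical simulated claims Z, used only to show the calculation.
example_claims = [0.0, 2_000.0, 4_000.0]
print(f"premium = RM{premium_from_claims(example_claims):,.2f}")

# Expected claim cost implied by the RM9,152 premium quoted for policy 1.
print(f"implied E(Z) = RM{implied_expected_claim(9_152):,.2f}")
```

Inverting the rule is a convenient way to read an expected claim cost off any quoted premium, provided the same loading assumptions apply.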
Using the simulated distributions, Table 5 shows the mean of retained loss, E(Y), the standard deviation of retained loss, $\sqrt{\mathrm{Var}(Y)}$, and the maximum probable value of retained loss at the 95% level for each insurance alternative. The maximum probable value at the ninety-five percent level is equivalent to the ninety-fifth percentile of the distribution. In addition, the probability that the individual loss exceeds the policy limit, Pr(X>d+u), is also provided. The mean total cost, which is defined as the mean of retained loss plus the insurance premium, i.e., E(Y)+premium, is also displayed. The fourth policy has no deductible and no limit, and can be used as a proxy for losses without any insurance coverage. From the insured's perspective, based on the maximum probable value of retained loss at the ninety-five percent level and the standard deviation of retained loss, the least risky strategy is policy 1 (RM10,000 retention per loss), followed by policy 2 (RM13,000 retention per loss), policy 3 (RM40,000 retention on aggregate losses) and no insurance coverage. However, the ranking by cost, which is based on the premium, is reversed: the largest saving on premium is provided by policy 3, followed by policy 2 and then policy 1.
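The summary measures described for Table 5 can be computed from simulated values along the following lines. This is a sketch under assumptions: the function name `retained_loss_summary`, the dictionary keys and the four sample losses are hypothetical, and the percentile uses NumPy's default linear interpolation, which may differ from the convention used in the study.

```python
import numpy as np


def retained_loss_summary(losses, retained, premium, deductible, limit):
    """Summarize simulated retained losses for one insurance alternative."""
    losses = np.asarray(losses, dtype=float)
    retained = np.asarray(retained, dtype=float)
    mean_retained = retained.mean()
    return {
        "mean_retained": mean_retained,                    # E(Y)
        "sd_retained": retained.std(ddof=1),               # sqrt(Var(Y))
        "mpv_95": np.percentile(retained, 95),             # 95th percentile
        "prob_exceed_limit": np.mean(losses > deductible + limit),
        "mean_total_cost": mean_retained + premium,        # E(Y) + premium
    }


# Hypothetical losses and their retained parts under d = 10,000, u = 30,000.
x = [5_000.0, 25_000.0, 50_000.0, 8_000.0]
y = [5_000.0, 10_000.0, 20_000.0, 8_000.0]
summary = retained_loss_summary(x, y, premium=1_000.0,
                                deductible=10_000.0, limit=30_000.0)
for name, value in summary.items():
    print(f"{name}: {value:,.2f}")
```

Comparing these dictionaries across alternatives gives the two rankings discussed in the text, one by risk (standard deviation and 95th percentile) and one by cost.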
Based on this information, an insured, which may be an individual or a company, can make an informed decision on which strategy to pursue. If the insured is considering the additional risk reduction offered by the RM10,000 retention per loss policy compared to the RM40,000 retention on aggregate losses policy, the insured should consider whether the additional expected total cost of RM17,171 - RM16,523 = RM648 is worth the additional risk reduction.

CONCLUSION
This study approximates the compound distribution of aggregate losses using a simulation approach. In particular, the approximation of aggregate losses is based on a compound Poisson-Pareto distribution, which is a popular choice for modeling aggregate losses because of its desirable properties. Based on the results, the approximation of the compound Poisson-Pareto distribution via simulation agrees with the theoretical mean and variance of each of the loss frequency, N, loss severity, X and aggregate losses, S. In addition to the approximation of aggregate losses, the effects of a deductible and policy limit per loss and of a deductible and policy limit on aggregate losses are investigated. The main advantage of the simulation approach is that it allows the distributions of individual loss, aggregate losses, retained loss, aggregate retained losses, individual claim and aggregate claims to be approximated with fairly straightforward simulation programming. The investigation of retained losses and insurance claims allows an insured or a company to select an insurance contract that fulfills its requirements. In particular, if a company wants additional risk reduction, it can compare alternative policies by considering whether the additional expected total cost, which can be estimated via simulation, is worthwhile. It is also worth noting that different simulations can be run to examine the sensitivity of the results to different assumptions concerning the type and parameters of the assumed frequency and severity distributions. Finally, the simulation approach can be applied to any type of loss or disaster and need not be limited to the actuarial and insurance areas.