
Building an Oasis from A to Z

IPE has been working for more than a year now to develop a fully operational model under OASIS LMF. The model we designed is for the HAIL peril in France, and we are already expanding it to Europe.

The following post is designed to cover technical matters around the implementation of a model under OASIS LMF.

A CAT model is usually summarized in four main blocks:

  • Hazard: a dictionary of events for the given peril, with an intensity, either deterministic or stochastic, at a given location (defined further below)
  • Exposure: a set of insured contracts with details on sum insured, location, occupancy, construction, etc.
  • Vulnerability: distributions of losses which derive a Ground Up Loss from a given Exposure and intensity. Different sets of vulnerability curves are usually used depending on Occupancy, Construction and so-called secondary modifiers.
  • Financial: a module designed to compute the Insured Loss from the Ground Up Loss given the insurance contract terms, mainly deductible, limit and inuring reinsurance, either facultative or treaty.
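To make the chain between these blocks concrete, here is a minimal toy sketch in Python. Everything in it, function names, damage ratios and contract terms, is invented for illustration; it is not OASIS LMF code.

```python
# Toy illustration of how the four blocks chain together.
# All names and numbers are hypothetical, not OASIS LMF code.

def ground_up_loss(tiv, intensity_bin, vulnerability):
    """Vulnerability block: expected damage ratio per intensity bin times TIV."""
    return tiv * vulnerability[intensity_bin]

def insured_loss(gul, deductible, limit):
    """Financial block: apply a simple deductible and limit."""
    return min(max(gul - deductible, 0.0), limit)

# Hazard block: one event produces intensity bin 3 at this location
vulnerability = {1: 0.00, 2: 0.02, 3: 0.10, 4: 0.35}  # damage ratios per bin
gul = ground_up_loss(tiv=200_000, intensity_bin=3, vulnerability=vulnerability)
loss = insured_loss(gul, deductible=1_000, limit=50_000)
print(gul, loss)  # 20000.0 19000.0
```

In a real model, the damage ratio is a full distribution rather than a single expected value, and the financial terms cascade through site, policy and reinsurance layers.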

The reason we chose OASIS LMF is that it normalizes/implements the following modules:

  • Exposure is normalized following the OED (Open Exposure Data) standard. The input data specifications are therefore predefined, with their pros and cons, but at least they are not model-specific. Though not trivial, AIR and RMS exposure formats can be translated directly into OED.
  • The Financial module is pre-coded, so the whole module does not need to be recoded and optimized.
  • Hazard and Vulnerability use pre-defined formats, but should you go custom, some tweaks are allowed through the pre-analysis and key-server scripts.

Hence developing a model under OASIS LMF allows you to focus on the core modules of a CAT model: Hazard and Vulnerability.

The main structure of a CAT model under OASIS LMF is as follows:

The blocks are summarized as follows:

  • Hazard:
    • Events: a dictionary of events with their ids, which can be ordered using an occurrence file (meaning the sequence is pre-ordered)
    • Footprint: the biggest file, containing all events with their probabilities of intensity at each "geographical level".
    • Intensity: intensities are binned into classes for the vulnerability curves; for example, all exposures suffering hail of a diameter of 3 to 3.5 will have their losses assessed using the same vulnerability curve.
  • Vulnerability:
    • Vulnerability_dict: a mapper from exposures to vulnerability curves
    • Vulnerability: the details of the vulnerability functions
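As a rough illustration of how the footprint and vulnerability tables combine, here is a miniature sketch. The ids, bins and probabilities are invented, and plain dicts stand in for the Oasis binary file formats:

```python
# Hypothetical miniature of the footprint/vulnerability contents
# (plain Python stand-ins, not the Oasis binary formats).

# footprint: P(intensity bin | event, area peril)
footprint = [
    {"event_id": 1, "areaperil_id": 101, "intensity_bin": 2, "prob": 0.7},
    {"event_id": 1, "areaperil_id": 101, "intensity_bin": 3, "prob": 0.3},
]
# vulnerability: damage ratio distribution per (curve, intensity bin)
vulnerability = [
    {"vuln_id": 5, "intensity_bin": 2, "damage_ratio": 0.02, "prob": 1.0},
    {"vuln_id": 5, "intensity_bin": 3, "damage_ratio": 0.10, "prob": 1.0},
]

def expected_damage_ratio(event_id, areaperil_id, vuln_id):
    """Combine hazard and vulnerability into an expected damage ratio."""
    edr = 0.0
    for f in footprint:
        if f["event_id"] == event_id and f["areaperil_id"] == areaperil_id:
            for v in vulnerability:
                if v["vuln_id"] == vuln_id and v["intensity_bin"] == f["intensity_bin"]:
                    edr += f["prob"] * v["prob"] * v["damage_ratio"]
    return edr

print(expected_damage_ratio(1, 101, 5))  # 0.7*0.02 + 0.3*0.10 ≈ 0.044
```

The Vulnerability_dict is what resolves `vuln_id` from the exposure attributes; the footprint is large precisely because it enumerates this table for every event and every area peril cell.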

Easy, shall we say! But let's dive deeper, starting with the geocoding troubles!

The Geocoding troubles

 

Like in any other CAT model, exposure is poorly qualified in terms of geocoding, meaning any kind of localization is acceptable, whether lat/lon, zip code, administrative levels or country level.

 

The Footprint you have is basically designed at one or more, but not all, levels… That's when the geocoding level kicks in, as portfolios come with mixed resolutions and you have to support them all. OASIS does not embed any geocoding system, so you have to encode it all yourself.

Our model was encoded at the most granular level, meaning lat/lon and refined areas called INSEE codes, but exposure coming at city or admin levels (Region, Departement, Canton, Commune for France) could not be matched to our model footprint resolution.

We therefore integrated aggregation processes of the hazard to Commune, City, Departement and Region levels, which required assumptions based on external information such as surface area, population density and number of houses. The results were clearly not acceptable, because the aggregated exposures were not disaggregated by OASIS as we thought they would be. In the end, aggregated exposures were all correlated, leading to lower losses at low return periods and higher losses at high return periods. The aggregation process worked only for detailed single risks.

We hence had to implement a large pre-analysis script, run before the model run, to decompose aggregated exposures. That was clearly not an easy journey, as each exposure coded at country level had to be drilled down to a probabilistic localization of region, then department, then canton, then city, then commune code, to handle any kind of encoding.

Purely technical, you might say, just a mapping problem using assumptions to decompose exposure… But factor in the exponential decomposition, where a single location at country level is decomposed into 36,000 probable locations. Miscoded exposure data led to billions of probable locations, and OASIS would run out of memory before completion.

To work around it, we had to use heavy servers to decompose each and every aggregated exposure at each given level down to detailed ones, and run an optimization to choose the best locations, limited to 100 with at least one subregion represented, to minimize the output error. This georesolution problem and optimization took us almost 3 months to figure out and implement.
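The selection step might look something like the following sketch, in which a country-level TIV is spread over candidate communes by a weight (say, number of houses), keeping only the top-K locations while ensuring every region stays represented. The commune ids, regions and weights are invented, and the real optimization is certainly more involved:

```python
# Hedged sketch of the disaggregation step: spread a country-level TIV over
# candidate communes by weight, keep the top-K while guaranteeing at least
# one commune per region. All names and numbers are hypothetical.

def disaggregate(tiv, communes, k=100):
    """communes: list of (commune_id, region, weight).
    Returns a list of (commune_id, tiv_share)."""
    # always keep the best-weighted commune of each region
    best_per_region = {}
    for cid, region, w in communes:
        if region not in best_per_region or w > best_per_region[region][2]:
            best_per_region[region] = (cid, region, w)
    kept = set(c[0] for c in best_per_region.values())
    # fill the remaining budget with the highest-weight communes overall
    for cid, _, _ in sorted(communes, key=lambda c: -c[2]):
        if len(kept) >= k:
            break
        kept.add(cid)
    # renormalize the TIV over the kept communes
    selected = [c for c in communes if c[0] in kept]
    total = sum(w for _, _, w in selected)
    return [(cid, tiv * w / total) for cid, _, w in selected]

communes = [("A1", "North", 50), ("A2", "North", 30), ("B1", "South", 20)]
shares = disaggregate(1_000_000, communes, k=2)
```

The renormalization is what keeps the total sum insured intact; the per-region guarantee is what bounds the error introduced by dropping the long tail of low-weight communes.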

The vulnerability problems

 

Vulnerability is pretty easy: for an exposure and an intensity class, you have a distribution of Ground Up Losses. But wait, a residential flat does not have the same vulnerability as a residential home. OK, there are basically 195 OED Occupancy codes, so as many vulnerability curves.

Wait, a residential home with a RoofCover of thatch, clay, zinc, felt, wood or aluminium does not react the same way. There are 28 different RoofCover codes, so that's 195 × 28 = 5,460 different vulnerability curves.

Let's inject the construction type: that's 195 × 28 × 210 = 1,146,600 vulnerability curves.

In our model we had to inject geographical specificities, which exploded the number of vulnerability curves to hundreds of millions. To solve this, we had to greatly simplify the problem and override the OASIS LMF process at every level:

  • Mapping of all occupancies to their general occupancies (RESIDENTIAL, AGRICULTURE, INDUSTRIAL, COMMERCIAL)
  • Overriding of specific occupancies which have their own vulnerability curves
  • Overriding of vulnerabilities which are construction-driven rather than occupancy-driven (mostly for offshore and motor construction types, which don't have a specific occupancy)
  • Overriding vulnerability curves defined by a combination of construction class × occupancy class
  • Overriding vulnerability curves which have secondary modifiers, mainly RoofCover in our case
  • Overriding vulnerability curves with a defined location for a set of generic occupancies
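The override cascade above can be sketched as a most-specific-first lookup with a fallback to the general occupancy curve. The occupancy codes, override keys and curve ids below are entirely made up for illustration:

```python
# Hedged sketch of the override cascade: the most specific matching rule
# wins; otherwise fall back to the general occupancy curve.
# All codes and curve ids are invented.

GENERAL = {"1050": "RESIDENTIAL", "2100": "COMMERCIAL"}  # occupancy -> group

def pick_curve(occupancy, construction=None, roofcover=None, area=None,
               *, overrides, general_curves):
    """Return a vulnerability curve id using most-specific-first matching."""
    candidates = [
        ("loc", GENERAL.get(occupancy), area),   # location-specific override
        ("roof", occupancy, roofcover),          # secondary modifier
        ("occ_constr", occupancy, construction), # occupancy x construction
        ("constr", construction),                # construction-driven
        ("occ", occupancy),                      # specific occupancy
    ]
    for key in candidates:
        if None not in key and key in overrides:
            return overrides[key]
    return general_curves[GENERAL[occupancy]]    # general occupancy fallback

overrides = {("roof", "1050", "THATCH"): "RES_THATCH"}
general_curves = {"RESIDENTIAL": "RES_DEFAULT", "COMMERCIAL": "COM_DEFAULT"}

print(pick_curve("1050", roofcover="THATCH",
                 overrides=overrides, general_curves=general_curves))  # RES_THATCH
print(pick_curve("2100",
                 overrides=overrides, general_curves=general_curves))  # COM_DEFAULT
```

The point of the cascade is that only the combinations that genuinely need a dedicated curve are enumerated; everything else collapses onto a handful of general curves.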

 

Hence we reduced the number of vulnerability curves to a few hundred.

Time-Dependent Vulnerabilities

 

The main difficulty we are now facing is the adaptation to CROPS, which differ greatly from property: in property business the insured value is assumed stable over the year, while in CROPS the loss is highly dependent on timing. Considering winter crops, a hail event in June will be highly devastating, while a hail event in August will have no effect on them (harvest being done) but would destroy spring crops…

Vulnerability becomes a time-dependent problem. Not having solved all of it yet, our prototype was to duplicate each exposure 12 times with a month allocation, create monthly vulnerability curves based on crop type, and post-process the detailed results to select the loss of the given month using the occurrence preselection.

The solution is not ideal at all, as it multiplies the number of exposures to be treated and run by 12, plus we have to run a post-process to select the adequate loss given the month of occurrence. This is because OASIS allows mapping an exposure to vulnerability curves, but not an exposure × event mapping or time dependency.
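The prototype workaround can be sketched as follows. The crop name, monthly factors and losses are invented for illustration; only the mechanics (duplicate 12×, tag by month, filter by the event's month of occurrence) reflect the approach described above:

```python
# Hedged sketch of the 12x duplication prototype. Each exposure is copied
# once per month with a monthly damage factor per crop type; after the run,
# only the copy matching the event's month of occurrence is kept.
# Factors and losses are hypothetical.

MONTHLY_FACTOR = {  # invented seasonality for winter crops
    "WINTER_CROP": {6: 1.0, 7: 0.6, 8: 0.0},  # June devastating, August nil
}

def duplicate_exposure(loc_id, crop):
    """One copy of the exposure per month, tagged with the month."""
    return [(loc_id, crop, month) for month in range(1, 13)]

def select_monthly_loss(losses, event_month):
    """losses: {(loc_id, crop, month): loss}. Keep only the event's month."""
    return {k: v for k, v in losses.items() if k[2] == event_month}

copies = duplicate_exposure("L1", "WINTER_CROP")
# pretend each copy was run through its monthly vulnerability curve
losses = {(lid, crop, m): 10_000 * MONTHLY_FACTOR[crop].get(m, 0.2)
          for lid, crop, m in copies}
june = select_monthly_loss(losses, event_month=6)
august = select_monthly_loss(losses, event_month=8)
```

A native exposure × event (or exposure × time) vulnerability mapping in the engine would make both the 12× duplication and the filtering step unnecessary.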

Demand surge modeling

 

Demand surge is basically another source of inflation for major events… After a major event there is a shortage of construction materials; for example, hail leads to a shortage of tiles. The price of tiles then inflates far beyond normal because of the market shortage. The impact is quite non-linear, as there are inflection points with binary effects. This cannot be integrated into OASIS models, so we also had to post-process losses. But given the high load of portfolios, we couldn't keep track of each and every loss split by occupancy, so we had to simplify the vector, using return period thresholds to apply loading factors. The modeling not being ideal, we have not delivered it to production yet.
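The simplified post-processing amounts to a step function of loading factors over loss thresholds, something like the sketch below. The threshold levels and factors are invented for illustration:

```python
# Hedged sketch of the simplified demand-surge post-processing: a loading
# factor applied once a loss crosses return-period thresholds (step/binary
# effect). Thresholds and factors are hypothetical.

def demand_surge(loss, thresholds):
    """thresholds: list of (loss_threshold, factor), ascending.
    The highest crossed threshold's factor applies."""
    factor = 1.0
    for threshold, f in thresholds:
        if loss >= threshold:
            factor = f
    return loss * factor

# e.g. losses above the 1-in-50 level loaded 10%, above 1-in-200 loaded 30%
thresholds = [(50_000_000, 1.10), (200_000_000, 1.30)]
print(demand_surge(10_000_000, thresholds))   # below all thresholds: unchanged
print(demand_surge(250_000_000, thresholds))  # loaded by 30%
```

A per-occupancy version would simply carry one threshold table per occupancy group, at the cost of tracking the occupancy split through the run, which is exactly what the portfolio load made impractical.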

Technical Troubles

 

Currently OASIS runs either on a server or in a Docker image. The processing is multiprocessed, and we had some back and forth with OASIS to improve some computation blocks. There is still a hard wall where you won't be able to run big portfolios, depending on your configuration: OASIS is Python-scripted and loads both the account and location files of the OED, so should your server not be able to load it all, you end up with a crash, and OASIS cannot chunk them. Currently we run a 32 GB session with 8 processors; we identified the breaking point at an 8 GB location file size. Above that, OASIS will crash, not being able to load the exposure.

In order to increase this OASIS limit, we designed a script in R (which consumes less memory than Python) to load the account and location files. We then chunk the accounts and locations into sub-OEDs, run OASIS in sequence and re-aggregate the results to recompute the EP curves.

The chunking is not ideal, as we only have the average mean loss of events rather than the sampled losses, and it takes a long time to chunk, run and post-process, but it at least allows us to run some big portfolios. We can't get through huge portfolios at this stage.
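The re-aggregation step works because per-event mean losses are additive across chunks, so the combined event loss table can be rebuilt and the EP curve recomputed from it. A minimal sketch, with invented event ids, losses and annual rates:

```python
# Hedged sketch of the chunked workflow: sum per-event mean losses across
# chunks, then rebuild the exceedance-probability (EP) curve from the
# merged event loss table. All ids, losses and rates are hypothetical.

def merge_event_losses(chunks):
    """chunks: list of {event_id: mean_loss}. Mean losses add across chunks."""
    merged = {}
    for chunk in chunks:
        for event_id, loss in chunk.items():
            merged[event_id] = merged.get(event_id, 0.0) + loss
    return merged

def ep_curve(event_losses, rates):
    """Cumulative annual exceedance frequency at each event loss level."""
    points = sorted(event_losses.items(), key=lambda kv: -kv[1])
    curve, cum = [], 0.0
    for event_id, loss in points:
        cum += rates[event_id]
        curve.append((loss, cum))  # (loss level, exceedance frequency)
    return curve

chunk_a = {1: 100.0, 2: 50.0}
chunk_b = {1: 30.0, 3: 10.0}
merged = merge_event_losses([chunk_a, chunk_b])  # {1: 130.0, 2: 50.0, 3: 10.0}
curve = ep_curve(merged, rates={1: 0.01, 2: 0.02, 3: 0.05})
```

This is exactly why only the mean loss survives the chunking: sampled losses would need to be merged sample-by-sample across chunks with consistent random seeds, which the sequential runs do not provide.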

 

Our OASIS wishlist

 

We still think OASIS was the right solution, even though there are some missing blocks. We hope OASIS will next develop:

  • A standard for geocoding, even if relying on the GADM open standard
  • Occupancy codes for cars and offshore, to further simplify model implementation for vulnerability mapping
  • Secondary modifiers applied as factors to Ground Up Losses
  • Time-dependent vulnerability curves
  • Easy demand surge implementation, which is mainly factors applied above a threshold for some occupancies
  • Auto-chunking of the OED depending on server memory, to optimize computation time and reduce the rate of crashes

Conclusions

 

Developing a model has been quite a journey; it took us more than a year, roughly 6 months to develop the model per se and 6 months to implement it in OASIS LMF. Going forward, an OASIS LMF implementation will be more straightforward, knowing the limitations and how to deal with them.

Should I summarize it, the development and implementation have been a very hard journey, far beyond what we originally anticipated, and sometimes I really doubted we would be able to make it. That's when OASIS LMF support gave us a hell of a lot of help to deliver.

I would like to give special thanks to:

  • Johanna Carter for all the help on the OED spec
  • Ben Hayes for all the support and coordination
  • Stephane Struzik for all the technical support/debug and optimization he has been able to deliver in a quick run
  • Matt Donavan for his engagement
  • Dickie Whitaker for everything he has done, his sense of humour and support to get things moving

With all that being said, we keep moving, trying to expand our model to broader coverage. We'll also try to help organize an OASIS event next year in Paris, stay tuned!

Should you be interested or have any questions about implementation, please ask me, I'm here to help.