  • Gianni Perez

Auto Insurance Telematics & Data Analytics


Auto insurance companies everywhere are enthralled by a sweeping new wave of big data analytics based on the systematic observation of meaningful patterns in their customers’ driving habits. Collectively, this discipline is known as Telematics: a term commonly associated with a multi-faceted approach to collecting and processing asset telemetry in order to identify areas of business improvement and growth opportunities.


This concept, borrowed from earlier attempts at reducing the total number of claims, hinges on the increased availability of sensor hardware to deliver an entire set of enhanced predictive models. For years now, telematics has also been proposed for areas ranging from collision avoidance (otherwise referred to as near-miss telematics) to in-vehicle mayday systems, while a handful of more pedestrian use cases focus on fuel efficiency metrics based on accelerometer data. Telematics is also the technological bedrock on which financial incentives such as pay-as-you-drive insurance policies are built.


With so many indicators signaling vehicle usage and driver behavior, data contextualization is key. Part of what makes Gigasheet unique in this space is its ability to handle large volumes of information quickly and seamlessly, which is precisely what is required when trying to discover hidden patterns in the data and, in turn, improve the analysis.

Auto Insurance Telematics and Data Analytics Using Gigasheet

With this principle in mind, we’ll explore the value of cost-estimating efforts like UBI (short for Usage-Based Insurance) using a proprietary dataset courtesy of a Canadian-based insurance agency. Its portfolio of 70,000 policies, generated between 2013 and 2016, served as the groundwork for a University of Connecticut paper on Synthetic Dataset Generation anchored in real-world telematics, with some noted deviations.


Ready ... dataset ... go!

The variables we are mainly interested in, such as “Brake.xxmiles” or “Accel.xxmiles” (discussed below), can serve as accurate covariates for driver profiling. With this information in hand, we can start answering relevant questions using our telematics data in true insurance agency fashion; but first things first.


In Gigasheet, loading a few hundred records, or even millions for that matter, in any given format (e.g., CSV or ZIP) normally takes mere seconds:

Uploading files to Gigasheet is easy

And here is the uploaded file in our Gigasheet library:

File in Gigasheet library ready to be explored

To get a better view of the policyholder population (100,000 in total), we can examine some traditional attributes, such as gender. This is easily accomplished using Gigasheet’s grouping capabilities:

Use Groups in Gigasheet to summarize data

The grouped data:

Telematics data grouped by sex

The same goes for “Car use”. Let’s see how many categories we have: again, we’ll use the “Group by” property, this time showing the median car age and median driver age for each group. The results are as follows:

Telematics data grouped by Car Use with Median aggregations
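For readers who like to see the logic spelled out, the grouping above can be sketched in a few lines of Python using only the standard library. The column names (`Car.use`, `Car.age`, `Insured.age`) and the sample rows are assumptions made up for illustration; they are not taken from the dataset, and this is not how Gigasheet works internally.

```python
from collections import defaultdict
from statistics import median

# Hypothetical sample rows; column names and values are illustrative only.
rows = [
    {"Car.use": "Commute", "Car.age": 4, "Insured.age": 35},
    {"Car.use": "Commute", "Car.age": 8, "Insured.age": 52},
    {"Car.use": "Private", "Car.age": 2, "Insured.age": 61},
    {"Car.use": "Private", "Car.age": 6, "Insured.age": 47},
    {"Car.use": "Private", "Car.age": 10, "Insured.age": 29},
    {"Car.use": "Commercial", "Car.age": 3, "Insured.age": 44},
]


def group_median(records, key, fields):
    """Group records by `key` and return the median of each field per group."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    return {
        name: {field: median(r[field] for r in members) for field in fields}
        for name, members in groups.items()
    }


summary = group_median(rows, "Car.use", ["Car.age", "Insured.age"])
for car_use, medians in summary.items():
    print(car_use, medians)
```

The number of keys in `summary` is the number of “Car use” categories, and each entry holds the two medians for that category.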

Doing aggregations in Gigasheet requires no complex formulas: simply choose the numeric column you wish to work with (the operand), then pick the operation from the resulting dropdown.

Changing the aggregation calculation of a group in Gigasheet is easy

For example, applying the “Sum” operation to “NB_CLAIM”, the total number of claims processed during the observation period, shows us which marital status category had the most insurance claims overall. Other popular functions in this selection include average, median, min, and max.

Using Gigasheet to explore aggregated data
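As a rough analogy to picking an operation from a dropdown, the sketch below maps operation names to plain Python functions and applies the chosen one per group. The `Marital` column name, the `AGGREGATIONS` table, and the sample values are hypothetical; only “NB_CLAIM” comes from the walkthrough above.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical lookup table standing in for the aggregation dropdown.
AGGREGATIONS = {
    "sum": sum,
    "average": mean,
    "median": median,
    "min": min,
    "max": max,
}

# Illustrative rows; "Marital" is an assumed column name.
rows = [
    {"Marital": "Married", "NB_CLAIM": 1},
    {"Marital": "Single", "NB_CLAIM": 0},
    {"Marital": "Married", "NB_CLAIM": 2},
    {"Marital": "Single", "NB_CLAIM": 1},
    {"Marital": "Married", "NB_CLAIM": 0},
]


def aggregate(records, group_key, value_key, operation):
    """Apply the named aggregation to `value_key` within each group."""
    func = AGGREGATIONS[operation]
    groups = defaultdict(list)
    for record in records:
        groups[record[group_key]].append(record[value_key])
    return {name: func(values) for name, values in groups.items()}


claims_by_status = aggregate(rows, "Marital", "NB_CLAIM", "sum")
top_status = max(claims_by_status, key=claims_by_status.get)
print(claims_by_status, "-> most claims:", top_status)
```

Swapping `"sum"` for `"max"` or `"average"` changes the calculation without touching the grouping logic, which is the same separation the dropdown offers.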

Finally, let’s take a look at some of the telematics information we have available, and whether we can derive any significant insights from it. For instance, as an underwriter, I’d probably be interested in understanding the relationship between car use (e.g., commercial) and claim amounts across all age groups. Such an effort could be part of a larger goal, such as consolidating pricing models.


From the chart below, we can confirm that vehicles of type “Commercial”, on average, have the largest claim amounts per observation period.

Simple charts are easy to create in Gigasheet

Car use vs. average claim amounts


There is, however, an easier way to achieve similar results: Pivot Tables—a no-nonsense, hassle-free and fundamental way of contextualizing data for quick analysis.

Creating Pivot Tables in Gigasheet is easy!

With Gigasheet, adding Pivot Tables is as intuitive as dragging and dropping the fields that need transformation onto the appropriate areas of the user interface, removing most of the guesswork and heavy lifting commonly associated with this stage.

Aggregations within a Pivot Table
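Conceptually, a pivot table is a two-way grouping: one field supplies the rows, another the columns, and an aggregation fills each cell. Here is a minimal sketch of that idea, with car use as rows, age bands as columns, and the average claim amount as the cell value. The column names (`Car.use`, `Insured.age`, `AMT_Claim`), the 20-year age bands, and all figures are assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows; names and amounts are illustrative only.
rows = [
    {"Car.use": "Commercial", "Insured.age": 34, "AMT_Claim": 5200.0},
    {"Car.use": "Commercial", "Insured.age": 58, "AMT_Claim": 4100.0},
    {"Car.use": "Private", "Insured.age": 31, "AMT_Claim": 1800.0},
    {"Car.use": "Private", "Insured.age": 36, "AMT_Claim": 2200.0},
    {"Car.use": "Commute", "Insured.age": 62, "AMT_Claim": 2600.0},
]


def age_band(age, width=20):
    """Map an age to a band label such as '20-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def pivot_mean(records, row_key, column_of, value_key):
    """Build {row label: {column label: mean value}} from flat records."""
    cells = defaultdict(list)
    for record in records:
        cells[(record[row_key], column_of(record))].append(record[value_key])
    table = defaultdict(dict)
    for (row_label, column_label), values in cells.items():
        table[row_label][column_label] = mean(values)
    return dict(table)


pivot = pivot_mean(
    rows, "Car.use", lambda r: age_band(r["Insured.age"]), "AMT_Claim"
)
for car_use, by_band in pivot.items():
    print(car_use, by_band)
```

Cells with no matching records are simply absent from the result, much like an empty cell in a pivot table.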

Another distinctive scenario may look like this: Driver X is an xx-year-old customer whose combined sum of sudden acceleration and brake events (per 1,000 miles) shows that he or she is above the nth percentile, in terms of risky behavior, compared to a sample population of drivers in the same age group.


With Gigasheet, the grouping and sorting needed to obtain this information can be performed in a matter of seconds. Perhaps surprisingly, not one but three 87-year-old drivers met the above criteria: drivers 28620, 32139, and 50019 all sit well above the 90th percentile.


Using Gigasheet to drill down to individual drivers in a Group
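The percentile check described above can be sketched as follows: add each driver's acceleration and brake events, compute the 90th-percentile cut within each age group, and keep the drivers who exceed it. The driver IDs, ages, and event counts are invented for illustration, and the "inclusive" quantile method is simply one reasonable choice; it is an assumption, not necessarily the calculation behind the screenshots.

```python
from collections import defaultdict
from statistics import quantiles

# Hypothetical rows: (driver_id, age, accel_events, brake_events),
# with both event counts expressed per 1,000 miles.
records = [
    (101, 87, 1, 2), (102, 87, 0, 1), (103, 87, 2, 2), (104, 87, 1, 1),
    (105, 87, 0, 3), (106, 87, 3, 3), (107, 87, 2, 3), (108, 87, 0, 0),
    (109, 87, 4, 4), (110, 87, 3, 4), (111, 87, 20, 25),
    (201, 45, 1, 1), (202, 45, 2, 2), (203, 45, 3, 3),
]


def risky_drivers(rows, percentile=90):
    """Return {age: [driver ids]} scoring above the percentile cut for their age."""
    by_age = defaultdict(list)
    for driver_id, age, accel, brake in rows:
        by_age[age].append((driver_id, accel + brake))

    flagged = {}
    for age, scored in by_age.items():
        scores = [score for _, score in scored]
        # n=100 yields 99 cut points; index percentile - 1 is the requested one.
        cut = quantiles(scores, n=100, method="inclusive")[percentile - 1]
        above = [driver_id for driver_id, score in scored if score > cut]
        if above:
            flagged[age] = above
    return flagged


flagged = risky_drivers(records)
print(flagged)
```

Note that in very small groups, such as the three 45-year-olds here, the top scorer will almost always clear the cut, so in practice a minimum group size would be worth enforcing before trusting the flag.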

Exporting these results to match the exact filtering and pivot conditions we’ve established is as easy as clicking “File” → “Export”. The resulting file contains only the columns and fields we’re interested in.

Using Save As to create a file with just the data you need

The exported file ready for download:

The Exported File ready for download in the Gigasheet library
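The effect of such an export, keeping only the rows that pass the filter and only the columns of interest, can be approximated with Python's `csv` module. The column names, the threshold, and the rows below are hypothetical, and the sketch writes to an in-memory buffer rather than a real file so that it stays self-contained.

```python
import csv
import io

# Hypothetical rows; names and values are illustrative only.
rows = [
    {"id": 101, "Insured.age": 87, "events_per_1000mi": 45, "Car.use": "Private"},
    {"id": 102, "Insured.age": 87, "events_per_1000mi": 3, "Car.use": "Private"},
    {"id": 203, "Insured.age": 45, "events_per_1000mi": 6, "Car.use": "Commute"},
]


def export_filtered(records, columns, keep, out):
    """Write records passing `keep` to `out`, restricted to `columns`."""
    # extrasaction="ignore" drops any keys that are not listed in `columns`.
    writer = csv.DictWriter(out, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    written = 0
    for record in records:
        if keep(record):
            writer.writerow(record)
            written += 1
    return written


buffer = io.StringIO()
written = export_filtered(
    rows,
    ["id", "Insured.age", "events_per_1000mi"],
    lambda r: r["events_per_1000mi"] > 5,
    buffer,
)
print(buffer.getvalue())
```

To produce an actual file, the buffer can be replaced with `open(path, "w", newline="")`, where `path` is whatever destination you choose.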

Uploading the resulting CSV file to a data visualization tool like Rawgraphs can help us make further use of our data, for cross-validation purposes if necessary. As seen below, our three 87-year-old customers are indeed part of an “elite” group of very risky drivers!

Using Rawgraphs to visualize our analysis

Proportion of sudden braking and acceleration events based on age


Final thoughts

Beyond simple cost optimization or asset tracking, commercial Telematics is under immense pressure as the entire transportation domain undergoes global scrutiny. With sensory-rich applications being the driving force behind any prospective solutions, the field is bound to expand in both size and complexity as we struggle to gain valuable insight from the very data models we employ.


As readers of this blog are well aware, Gigasheet delivers on the long-standing promise of letting its users analyze data proficiently, quickly, and intuitively; an assurance unrivaled by any other off-the-shelf product at the time of this writing.


Give Gigasheet a try today, and see for yourself how big data was always meant to be handled.




