Machine Learning
Fraud Detection in Car Insurance – Random Forest
University project · December 2025 – January 2026
Detecting fraudulent insurance claims using machine learning and pattern recognition.
Project Overview
This work was part of a university machine learning project in which the team explored multiple approaches to detecting fraudulent car insurance claims.
While other team members experimented with different algorithms, my focus was on implementing and optimizing a Random Forest model to identify suspicious patterns in the data.
The overall goal was to compare different machine learning techniques and evaluate their effectiveness in detecting fraud.
Data Preparation
A key part of the project was preparing the dataset for machine learning. This involved:
- Cleaning and preprocessing raw data
- Handling missing values
- Encoding categorical features
In addition, I applied feature engineering and feature selection to identify the variables most relevant for predicting fraudulent claims.
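The preparation steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the column names (`claim_amount`, `vehicle_age`, `incident_type`), the toy values, and the choice of median and most-frequent imputation are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy claims table; column names and values are illustrative only.
claims = pd.DataFrame({
    "claim_amount": [1200.0, np.nan, 560.0, 8900.0],
    "vehicle_age": [3, 7, np.nan, 1],
    "incident_type": ["collision", "theft", np.nan, "collision"],
}).astype({"incident_type": object})  # keep NaN as the missing-value marker

numeric_cols = ["claim_amount", "vehicle_age"]
categorical_cols = ["incident_type"]

preprocess = ColumnTransformer([
    # Numeric columns: fill missing values with the column median.
    ("num", SimpleImputer(strategy="median"), numeric_cols),
    # Categorical columns: fill with the most frequent value, then one-hot encode.
    ("cat", Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ]), categorical_cols),
])

# Result: two numeric columns plus one column per incident type.
X = preprocess.fit_transform(claims)
print(X.shape)
```

Wrapping the steps in a `ColumnTransformer` means the same imputation and encoding learned on the training data can later be applied unchanged to new claims.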
Model: Random Forest
The core model used in my part of the project was a Random Forest classifier.
This ensemble method combines multiple decision trees and is particularly effective for classification tasks involving complex and non-linear relationships.
- More robust against overfitting than a single decision tree
- Handles non-linear patterns
- Works well with tabular datasets
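A Random Forest classifier of this kind could be trained as in the sketch below. The data is synthetic (generated with `make_classification`), and the hyperparameters, including `n_estimators=200` and `class_weight="balanced"`, are assumptions for illustration rather than the settings used in the project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for a claims dataset (about 10% positive class).
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5,
    weights=[0.9, 0.1], random_state=42,
)

# Stratify so both splits keep the same share of the rare "fraud" class.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42,
)

# Each of the 200 trees is fit on a bootstrap sample of the training data;
# class_weight="balanced" gives the rare class more weight during training.
model = RandomForestClassifier(
    n_estimators=200, class_weight="balanced", random_state=42,
)
model.fit(X_train, y_train)

predictions = model.predict(X_test)
print(predictions[:10])
```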
Collaboration & Comparison
As part of the team project, different models were developed and compared across the group.
This allowed us to evaluate how various algorithms perform on the same dataset and better understand their strengths and weaknesses in fraud detection scenarios.
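One way to compare several models on the same dataset is cross-validation with a shared set of folds, sketched below. The algorithms the other team members used are not named here, so logistic regression and a single decision tree are hypothetical stand-ins, and the data is synthetic.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the shared claims dataset.
X, y = make_classification(
    n_samples=600, n_features=8, weights=[0.85, 0.15], random_state=1,
)

# Hypothetical candidates; only the Random Forest is confirmed by the write-up.
candidates = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=1),
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=1),
}

# Using the same stratified folds for every model keeps the comparison fair.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

mean_recall = {
    name: cross_val_score(model, X, y, cv=cv, scoring="recall").mean()
    for name, model in candidates.items()
}

for name, score in mean_recall.items():
    print(f"{name}: mean recall = {score:.3f}")
```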
Training & Insights
During model training, I analyzed which features had the strongest correlation with fraudulent behavior.
This provided valuable insights into how certain patterns and variables contribute to suspicious claims.
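The exact analysis method is not described above. One common option with a Random Forest is to inspect its impurity-based feature importances, as in this sketch. The data is synthetic and the feature names (`feature_0`, `feature_1`, and so on) are placeholders.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data with 3 informative features out of 6; names are placeholders.
X, y = make_classification(
    n_samples=500, n_features=6, n_informative=3, n_redundant=0, random_state=0,
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances are normalized to sum to 1; higher means the
# feature contributed more to the splits across all trees.
importances = pd.Series(
    model.feature_importances_, index=feature_names,
).sort_values(ascending=False)

print(importances)
```

Impurity-based importances can favor features with many distinct values, so they are best read as a first indication rather than proof of a relationship.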
Evaluation & Optimization
The model was evaluated and optimized to achieve a strong balance between:
- High detection rate (recall)
- Low misclassification rate
In fraud detection, minimizing false negatives is especially critical, as undetected fraud can lead to significant financial loss.
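The trade-off between recall and misclassifications can be made concrete by varying the decision threshold, as sketched below. The data is synthetic, the helper `evaluate` is hypothetical, and the thresholds 0.5 and 0.3 are arbitrary example values, not the ones chosen in the project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for a claims dataset.
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5,
    weights=[0.9, 0.1], random_state=7,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=7,
)

model = RandomForestClassifier(n_estimators=100, random_state=7)
model.fit(X_train, y_train)

# Predicted probability of the positive ("fraud") class for each test claim.
fraud_probability = model.predict_proba(X_test)[:, 1]

def evaluate(threshold):
    """Hypothetical helper: metrics when flagging claims at or above threshold."""
    predicted = (fraud_probability >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_test, predicted, labels=[0, 1]).ravel()
    return {
        "threshold": threshold,
        "recall": recall_score(y_test, predicted, zero_division=0),
        "precision": precision_score(y_test, predicted, zero_division=0),
        "false_negatives": int(fn),
        "false_positives": int(fp),
    }

default_result = evaluate(0.5)
lowered_result = evaluate(0.3)  # lower threshold flags more claims as fraud
print(default_result)
print(lowered_result)
```

Lowering the threshold can only keep or reduce the number of false negatives, but it typically raises the number of false positives, so the threshold should reflect the relative cost of missed fraud versus unnecessary investigations.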
Key Learnings
- End-to-end machine learning workflow
- Working in a collaborative ML project
- Comparing different algorithms
- Importance of evaluation metrics beyond accuracy
Conclusion
This project provided hands-on experience with real-world machine learning challenges in the insurance domain.
It demonstrates how different machine learning approaches can be applied and compared to solve complex classification problems like fraud detection.