Uncertainty analysis of Supervised machine learning predictions applied to Lithology classification

Shashel, Alina

Master thesis

URI

https://hdl.handle.net/11250/3027643

Date

2022

Collections

Studentoppgaver (TN-IEP) [336]

Abstract

Geosteering is the technique of guiding directional drilling to remain within the pay zone. This process demands a thorough survey of the lithological properties of the surrounding geological strata. Because logging-while-drilling (LWD) tools are positioned a few meters above the bit, there is a depth lag, and thus a time delay, between what the LWD sensors report to the surface and the performance of the bit. Drill bit and drill string performance factors are therefore the earliest indicators of formation characteristics, available without this delay.

Implementing automated lithology identification would enhance the quality of the geosteering operation. This thesis investigated the extent to which various supervised machine learning (ML) classification algorithms may be utilized to recognize the lithological features of drilled formations.

ML models were trained using preprocessed real-time drilling data from the Volve field. The data included nine wells with a total of 198 928 tagged observations and the accompanying measured parameters at various depths within the wells. The ML algorithms were tested on a selected well that accounts for a minority of the samples in the dataset.

The progress in the application of ML algorithms provides an incentive for further study of model trustworthiness, including uncertainty analysis, to improve the classification algorithms used in lithology identification. Most ML algorithms may be thought of as "black box" models, meaning that the process by which variables are integrated to form predictions cannot be seen or transparently understood. Hence, the uncertainties in model performance must be quantified and limited before ML can be applied successfully to real-life classification problems.

Within the scope of this research, Feature Sensitivity and Vulnerability Analysis, as well as Dataset Shift Measurement, were applied to investigate the reliability of ML models. A novel Black Box Metamodel approach and Bayesian Neural Networks were employed to compute aleatoric and epistemic uncertainties.
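
As an illustration of how aleatoric and epistemic uncertainty can be separated, the sketch below uses the common entropy-based decomposition over several sampled predictions (for example, repeated forward passes of a Bayesian Neural Network). This is an assumption for illustration only: the abstract does not state which decomposition the thesis uses, and the function names `entropy` and `decompose_uncertainty` are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete probability vector."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def decompose_uncertainty(member_probs):
    """Split predictive uncertainty for one observation.

    member_probs: list of class-probability vectors, one per sampled model.
    Returns (total, aleatoric, epistemic), assuming the entropy-based split:
      total     = entropy of the averaged prediction
      aleatoric = average entropy of the individual predictions
      epistemic = total - aleatoric (disagreement between sampled models)
    """
    n = len(member_probs)
    k = len(member_probs[0])
    mean_probs = [sum(m[c] for m in member_probs) / n for c in range(k)]
    total = entropy(mean_probs)
    aleatoric = sum(entropy(m) for m in member_probs) / n
    return total, aleatoric, total - aleatoric


# Two sampled models that agree the classes are equally likely:
# all uncertainty is aleatoric (noise in the data).
agree = [[0.5, 0.5], [0.5, 0.5]]

# Two sampled models that are each certain but contradict each other:
# all uncertainty is epistemic (lack of knowledge in the model).
disagree = [[1.0, 0.0], [0.0, 1.0]]

print(decompose_uncertainty(agree))
print(decompose_uncertainty(disagree))
```

Under this decomposition, epistemic uncertainty is zero whenever all sampled models return the same probabilities, regardless of how uncertain each individual prediction is.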

Of the seven ML classification algorithms tested, Random Forest and Adaptive Boosting demonstrated the most accurate results and were chosen for comparative reliability analysis.

In classification tasks, it is more crucial to estimate the probability that an observation belongs to a specific class than to obtain the predicted class alone. Consequently, Probability Calibration techniques improved the quality of the quantified uncertainties. Calculating and comparing the difference between the confidence and accuracy results obtained after Probability Calibration showed that the Adaptive Boosting algorithm, despite its better scoring results, is less confident and ambiguous regarding epistemic uncertainty than the Random Forest algorithm.
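
The difference between confidence and accuracy mentioned above can be sketched as follows. This is a minimal illustration, assuming that confidence means the mean of the highest predicted class probability per observation. The function name `confidence_accuracy_gap` and the toy data are hypothetical and are not taken from the thesis.

```python
def confidence_accuracy_gap(probs, labels):
    """Compare mean confidence with accuracy for a set of predictions.

    probs:  list of class-probability vectors, one per observation
    labels: list of true class indices
    Returns (gap, mean_confidence, accuracy), where gap = confidence - accuracy.
    A positive gap suggests over-confidence, a negative gap under-confidence.
    """
    confidences = [max(p) for p in probs]
    predictions = [p.index(max(p)) for p in probs]
    correct = sum(1 for pred, y in zip(predictions, labels) if pred == y)
    accuracy = correct / len(labels)
    mean_confidence = sum(confidences) / len(confidences)
    return mean_confidence - accuracy, mean_confidence, accuracy


# Toy example: the model is 90% confident on every observation
# but only half of its predictions are correct.
toy_probs = [[0.9, 0.1], [0.9, 0.1], [0.9, 0.1], [0.1, 0.9]]
toy_labels = [0, 1, 1, 1]

gap, conf, acc = confidence_accuracy_gap(toy_probs, toy_labels)
print(f"confidence={conf:.2f} accuracy={acc:.2f} gap={gap:.2f}")
```

A well-calibrated classifier would show a gap close to zero, so comparing this quantity before and after calibration is one simple way to check whether calibration helped.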

Publisher

uis