Information in Local Curvature: Three Papers on Adaptive Methods in Computational Statistics
Doctoral thesis
Date
2020-11
Collections
- PhD theses (TN-IMF) [18]
Original version
Information in Local Curvature: Three Papers on Adaptive Methods in Computational Statistics by Berent Ånund Strømnes Lunde
Abstract
Advanced statistical computation has become increasingly important: as models grow more flexible in order to capture complex relationships in new data and use cases, fitting them becomes more difficult. For example, if a model is complex and involves multiple sources of randomness, the probability density function used in maximum likelihood estimation typically has no closed form. In regression-type problems, on the other hand, the closed form of the assumed conditional distribution of the response is often known. However, the relationship between features and response can be complex and high-dimensional, and is generally unknown, which motivates non-parametric procedures that bring their own fitting problems.
This thesis explores techniques that use the local curvature of objective functions, and the information inherent in that curvature, to create more stable and automatic fitting procedures. In the first paper, a saddlepoint-adjusted inverse Fourier transform is proposed. The method performs accurate numerical inversion, even in the tails of the distribution. This allows practitioners to specify their models in terms of the characteristic function, which often exists in closed form. The second paper proposes an information criterion for the node splits, after greedy recursive binary splitting, in gradient boosted trees. This alleviates the need for computationally expensive cross-validation and expert opinion in tuning the hyperparameters associated with gradient tree boosting. The third paper focuses on implementing the theory of the second paper in the R package agtboost, and also builds on the information criterion to suggest an adjustment of ordinary greedy recursive binary splitting that is better suited to gradient tree boosting.
Has parts
Paper 1: Lunde, Berent Ånund Strømnes, Tore Selland Kleppe, and Hans Julius Skaug (2020). Saddlepoint-adjusted inversion of characteristic functions. Submitted for publication in Computational Statistics.
Paper 2: Lunde, Berent Ånund Strømnes, Tore Selland Kleppe, and Hans Julius Skaug (2020). An information criterion for automatic gradient tree boosting. Submitted for publication in The Journal of the Royal Statistical Society, Series B (Statistical Methodology).
Paper 3: Lunde, Berent Ånund Strømnes, and Tore Selland Kleppe (2020). agtboost: Adaptive and Automatic Gradient Tree Boosting Computations. To be submitted for publication in Journal of Statistical Software.