Predictions Based on the Clustering of Heterogeneous Functions via Shape and Subject-Specific Covariates

Bibliographic Details
Published in: arXiv.org
Main Authors: Page, Garritt L; Quintana, Fernando A
Format: Paper; Journal Article
Language: English
Published: Ithaca: Cornell University Library, arXiv.org, 11.05.2015
ISSN: 2331-8422
DOI: 10.48550/arxiv.1505.02589

Summary: We consider a study of players employed by teams that are members of the National Basketball Association, where the units of observation are functional curves that are realizations of production measurements taken over the course of a player's career. The observed functional output displays a large amount of between-player heterogeneity, in the sense that some individuals produce curves that are fairly smooth while others are (much) more erratic. We argue that this variability in curve shape is a feature that can be exploited to guide decision making, learn about the processes under study, and improve prediction. In this paper we develop a methodology that takes advantage of this feature when clustering functional curves. Individual curves are flexibly modeled using Bayesian penalized B-splines, while a hierarchical structure allows the clustering to be guided by the smoothness of individual curves. In a sense, the hierarchical structure balances the desire to fit individual curves well against the need to produce meaningful clusters that are used to guide prediction. We incorporate available covariate information to guide the clustering of curves non-parametrically through a product partition model prior for a random partition of individuals. Clustering based on curve smoothness and subject-specific covariate information is particularly important in carrying out the two types of prediction that are of interest: completing a partially observed curve from an active player, and predicting the entire career curve for a player yet to play in the National Basketball Association.
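
The summary's point that a penalized B-spline can yield either a smooth or an erratic curve may be easier to see in a small numerical sketch. The code below is not the authors' model: it is a frequentist penalized least-squares fit with a second-order difference penalty, which plays a role loosely analogous to the smoothing prior in a Bayesian penalized B-spline. The function names (`bspline_basis`, `fit_pspline`, `roughness`), the uniform knot layout, the penalty values, and the simulated data are all assumptions made for illustration, and no basketball data are used.

```python
import numpy as np


def bspline_basis(x, n_segments=10, degree=3, lo=0.0, hi=1.0):
    """B-spline basis on uniformly spaced knots (Cox-de Boor recursion)."""
    x = np.asarray(x, dtype=float)[:, None]
    h = (hi - lo) / n_segments
    # Knots extend `degree` steps beyond both ends of [lo, hi].
    knots = lo + h * np.arange(-degree, n_segments + degree + 1)
    # Degree-0 basis: indicators of the half-open knot intervals.
    B = ((knots[:-1] <= x) & (x < knots[1:])).astype(float)
    for d in range(1, degree + 1):
        # With uniform knots every denominator in the recursion equals d * h.
        left = (x - knots[: -d - 1]) / (d * h) * B[:, :-1]
        right = (knots[d + 1:] - x) / (d * h) * B[:, 1:]
        B = left + right
    return B  # shape: (len(x), n_segments + degree)


def fit_pspline(x, y, lam, n_segments=10, degree=3):
    """Minimize ||y - B beta||^2 + lam * ||D beta||^2 with D = second differences."""
    B = bspline_basis(x, n_segments, degree, x.min(), x.max())
    D = np.diff(np.eye(B.shape[1]), n=2, axis=0)
    beta = np.linalg.solve(B.T @ B + lam * D.T @ D, B.T @ y)
    return B @ beta, beta


def roughness(beta):
    """Sum of squared second differences of the spline coefficients."""
    return float(np.sum(np.diff(beta, n=2) ** 2))


# Simulated curve (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 60)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.size)

_, beta_wiggly = fit_pspline(x, y, lam=0.01)   # weak penalty: more erratic fit
_, beta_smooth = fit_pspline(x, y, lam=100.0)  # strong penalty: smoother fit
print(roughness(beta_wiggly), roughness(beta_smooth))
```

In this sketch the single penalty weight `lam` controls how erratic the fitted curve is. In a hierarchical model of the kind the summary describes, a curve-specific smoothing parameter could play the analogous role, which is one way smoothness might inform the clustering.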