A Predictive Model for Coach Firings in NCAA Division I Men’s Basketball

Student: Eli M. Samuelson
Major: Mathematics
Advisors: Dr. Drew Pasteur, Dr. Marian Frazier

Men’s major college basketball has a high turnover rate among coaches, due to retirement, new job prospects, or firings. Retirement is usually the result of old age, and leaving for a new job comes from the offer of more money or more prestige, but what causes a coach to get fired? This question is so common in sports that it has become known as the “hot seat” question. The “hot seat” question has been examined in many of the major league sports, such as MLB and the NFL. We took the question and applied it to NCAA Division I men’s basketball.

Eli will be online to field comments on May 8:
10am-noon EDT (Asia: late evening, PST 6am-8am, Africa/Europe: late afternoon)

48 thoughts on “A Predictive Model for Coach Firings in NCAA Division I Men’s Basketball”

  1. Hi Eli: Congratulations and good work! What made you decide to choose this topic and why is it interesting to you? Did you use any modeling software when working with your data sets?

    1. I originally wanted to do sports analytics, and my advisor Dr. Pasteur had some ideas; I thought this one was the most interesting!
      As far as modeling software goes, R is what worked best for me. I used Python some to organize my data, but overall R was mainly what I used!

      1. In R I used the random forest and logistic regression functions. For random forest, that is the “randomForest” function from library(randomForest). For logistic regression, it’s the “glm” function.

  2. Awesome project, Eli! Did any additional variables come to mind to include in future analyses?

    1. The next variables I would have added to the model are coach salary and related variables. One of the big things about coach contracts is the buyout clause, where a school has to pay a coach a certain amount of money if the coach is fired. To me, this could make a difference in whether a coach is fired or not. So overall, salary would be the future variable to analyze!

  3. Eli, what caused you to use a logistic regression as opposed to other methods? Also, are there any other variables that you considered putting in your equation but decided not to?

    1. Logistic regression is a common method to use in this field. One of the other sources I examined, “The analytics of getting sacked: Coach Firings in the National Football League,” uses logistic regression for the same question applied to the NFL.
      Some variables, like the number of times making the Final Four, had really no effect on the model, and some variables, like salary, I was not able to fully examine.

  4. Well done, Eli Samuelson. You picked a great topic. Who doesn’t like March Madness? Of course, this year was unlike anything we have ever seen. I have a question for you: “In light of the COVID-19 pandemic, what do you think college athletic directors and presidents will do with a marginal coach who they are not sure they want to keep?” The landscape of sports is so uncertain right now. Thanks. Bob Klumpp

    1. Overall, I think this will be a grace period for many coaches. Both models that I found include NCAA tournament variables, so without the tournament I cannot predict anything. Athletic directors and presidents may be thinking the same way.
      For example, if a coach had a bad season and was on the hot seat, but would have won the conference tournament and made the NCAA tournament, then the coach might not have been fired. There are just so many unknown variables that I would think coaches would have a grace period.

      1. Below, my advisor, Dr. Pasteur, has made an interesting comment on the situation!

  5. Hi Eli, congratulations on completing your I.S. and best of luck to you on your future endeavors! I found it very interesting in your logistic regression that Sweet 16 appearance was a significant variable. Was there anything in your research that surprised you? Also, were there any other variables you might consider for future work? I know you mention salaries and NBA connection, but curious if there were any other variables you might consider.

    1. I think the thing that surprised me most was the fact that more NCAA tournament variables were significant. In the news, win percentage and tournament berths are the focus as coach accolades, but they were not as significant in my model.
      Conference tournament variables are another direction I would go. Winning or doing well in a conference tournament could seemingly be a reason to keep a coach.

    2. Colin, it’s great to see you participating today … I can’t believe that it has been five years since you were doing I.S.!

  6. Woohoo Eli! This is some really interesting stuff. What was the biggest challenge/obstacle you encountered while doing your IS?

    1. As I feel most of my fellow seniors would say, data collection really is challenging and exhausting at times. Then having to organize that data can feel even longer, but it is excellent practice for future endeavors!

      1. It seems that this gives a sense of what a career in data science might look like. I recall one of our alums in the field noting that 20% of their time was spent on the actual analysis, and 80% on data wrangling, cleaning, & organization, along with related administrative tasks.

  7. Congratulations on your IS! You mentioned including a larger data set. What would you include in this? Do you think your results would be similar in smaller D1 conferences or even D2 and D3?

    1. I think I would start with a wider DI conference pool and examine more coaches that way.
      Your question about similar results in DII or DIII is intriguing! I would think maybe so, as the variables would be the same, but there may be minor differences in which variables are significant. As I was talking to my family about this, a similar question came up about DI women’s basketball. That is what interests me the most: since I trained on DI men’s data, would the women’s DI data work the same?

  8. Hello Eli: interesting work. What was the performance metric that you used to conclude that your models are generalizable, and that logistic regression is better than random forest?

    1. For my performance metrics I used MSE (mean squared error) and AUC (area under the receiver operating characteristic, or ROC, curve). AUC uses the true positive rate and the false positive rate to tell how well the model differentiates between classes, in this case fired and retained coaches.
      In addition, in testing, the logistic regression model was better at correctly identifying fired coaches among the coaches it predicted as fired.
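
A minimal sketch of the two metrics named above, written in plain Python rather than the R used for the project. The labels and predicted probabilities are made up for illustration and are not from the study; the function names are hypothetical. AUC is computed here by its pairwise definition: the share of (fired, retained) pairs in which the fired coach received the higher predicted probability, with ties counted as half.

```python
def mean_squared_error(y_true, y_prob):
    """Average squared gap between the 0/1 outcome and the predicted probability."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true, y_prob):
    """Share of (fired, retained) pairs ranked correctly; ties count as half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up example: 1 = fired, 0 = retained, with made-up model probabilities.
fired = [1, 0, 0, 1, 0, 0]
predicted = [0.8, 0.3, 0.6, 0.4, 0.1, 0.2]

print("MSE:", round(mean_squared_error(fired, predicted), 4))
print("AUC:", auc(fired, predicted))
```

A lower MSE and a higher AUC (closer to 1) both indicate a model that separates fired from retained coaches better; an AUC of 0.5 is what random guessing would give.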

  9. Hi, Eli, really interesting analysis. What is the coach turnover rate variable – at each individual school, in the conference, or…? Not surprising (and really Darwinian) that the winning percentage seems to have the biggest effect. Did you do any testing of whether the winning percentage and Sweet 16 variables were correlated?

    1. The coach turnover rate is a variable I created. As an example, say we look at the coaches who made it to the fifth year, which in my data set was 28 coaches, and take the number who were fired, which was 2. This means that the coach turnover rate in the fifth year was 2/28, or 0.071. I did this for up to 17 years; after 17 years, no coaches were fired in my data set.
      I did not look at the correlation between win percentage and Sweet 16 variables but that would be an interesting idea!
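
The turnover-rate calculation described above can be sketched in a few lines of Python. Only the fifth-year figures (2 fired out of 28 coaches who reached that year) come from the reply; the rest of the toy data, the function name, and the convention that a coach fired in year k has exactly k years coached are assumptions made for illustration.

```python
def turnover_rate_by_year(tenures):
    """tenures: list of (years_coached, was_fired) pairs, one per coach.

    Returns {year: coaches fired in that year / coaches who reached that year}.
    Assumes a coach fired in year k is recorded with years_coached == k.
    """
    rates = {}
    max_year = max(years for years, _ in tenures)
    for year in range(1, max_year + 1):
        reached = sum(1 for years, _ in tenures if years >= year)
        fired = sum(1 for years, was_fired in tenures if years == year and was_fired)
        rates[year] = fired / reached
    return rates


# Toy data: 28 coaches reach year 5 and 2 of them are fired then (as in the reply);
# the 4 coaches fired in year 3 are invented to show an earlier year.
tenures = [(5, True)] * 2 + [(6, False)] * 26 + [(3, True)] * 4

rates = turnover_rate_by_year(tenures)
print("Year 5 turnover rate:", round(rates[5], 3))
```

Each coach in the data set would then be assigned the rate that matches their current year of tenure.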

  10. Hi Eli! This is such an awesome prediction model. Can you talk more about the Random Forest method and how you specifically applied it to your model?

    1. Random forest is what is called an “ensemble learning method” and is good for classification problems like mine. Basically, the forest creates a group of what we would call decision trees. These decision trees are constructed separately from each other, and each makes its own decision based on the variables and the data.
      With all those trees’ inputs combined, it creates a pretty good model which I could use.
      I hope this is enough! It’s a bit hard to make it concise!
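
To make the idea of many independent trees voting a little more concrete, here is a heavily simplified sketch in plain Python. It is not the R randomForest package used in the project: each "tree" here is only a one-split decision stump, trained on a bootstrap resample of the rows and one randomly chosen variable, and the forest predicts by majority vote. The two variables (win percentage and number of losing seasons) and all data values are made up for illustration.

```python
import random


def fit_stump(rows, labels, feature):
    """Pick the threshold on one feature that misclassifies the fewest rows.

    Returns (feature, threshold, label_if_above, label_if_not_above).
    """
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        for above, below in ((1, 0), (0, 1)):
            errors = sum(
                (above if row[feature] > threshold else below) != y
                for row, y in zip(rows, labels)
            )
            if best is None or errors < best[0]:
                best = (errors, (feature, threshold, above, below))
    return best[1]


def predict_stump(stump, row):
    feature, threshold, above, below = stump
    return above if row[feature] > threshold else below


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees stumps, each on its own bootstrap sample and random feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample rows with replacement
        feature = rng.randrange(len(rows[0]))           # random variable for this tree
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        forest.append(fit_stump(sample_rows, sample_labels, feature))
    return forest


def predict_forest(forest, row):
    """Majority vote across all stumps: 1 = fired, 0 = retained."""
    votes = sum(predict_stump(stump, row) for stump in forest)
    return 1 if 2 * votes > len(forest) else 0


# Made-up training data: (win percentage, number of losing seasons).
rows = [(0.35, 4), (0.40, 3), (0.42, 4), (0.38, 5),
        (0.60, 1), (0.65, 0), (0.58, 2), (0.70, 1)]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

forest = fit_forest(rows, labels)
print("Struggling coach:", predict_forest(forest, (0.20, 6)))
print("Winning coach:", predict_forest(forest, (0.90, 0)))
```

A real random forest grows much deeper trees and considers a random subset of variables at every split, but the bootstrap-then-vote structure is the same.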

  11. Eli, what a fun project! Congratulations!! Was there a particular finding that really surprised you when you ran the functions vs what your intuitive sense of the given coaching situation was? (And do you root for any particular teams or coaches in D1?)

    1. Overall, most of the variables made intuitive sense to me; just how significant they were was the biggest surprise! Years coaching and coach turnover rate were very significant in my model, more than I had really expected.
      (On, Wisconsin!)

  12. I wonder whether the massive loss of revenue associated with the pandemic (likely tens of millions of dollars, perhaps more, for elite D1 athletic departments) may reshape the salary scale of major men’s college basketball coaches. If the salaries indeed are depressed over the next couple of years, then that could create some interesting dynamics for firing decisions, depending on how individual buyout clauses are structured.

    1. I am now also interested in what will happen next year: whether we will see a spike in coach firings or whether the rate will remain generally constant, as it has been.

  13. Dear Mr Samuelson,
    Very interesting work.

    If schools did not make money from NCAA tournament success, do you think sweet 16 appearances would still be a factor?

    1. That is a very interesting question and one I had not really thought about before. Though I am not an economics major and don’t know the major theories behind this, I would say no. Overall, though, I think that this question is hard to answer because of how many indirect gains the schools make from the tournament, such as incoming student interest. Some students love to watch sports and would take into consideration whether a school has a good program that they could watch, and I think administrators know this.
      So overall it’s hard to know the economic effects of all this, but it is a very interesting question!

  14. I just wanted to leave a comment saying congratulations!! This is such an interesting study and I’m curious to see how not having March Madness this year will affect coach firings.

    1. I think there will be a grace period for coaches. Not knowing who would have won the conference tournament bids gives coaches an out, in that they were not able to show a turnaround late in the season.
      But I would say that next season we may see a larger rate of firings, because coaches were given a reprieve this season.

  15. Thank you all for your comments and questions, and thank you for coming, even those who did not have questions or comments!

  16. Congratulations to you Eli! Best wishes for your future and please stay in touch! -J. Bowen

  17. One more question, Eli. Now that you have completed your Senior Research, would you ever consider coaching at the NCAA level (Division 1, 2, or 3), in swimming or any other sport, based on what you learned? I want to wish you all the success in the future. I am so proud of you, your accomplishments in the classroom and also in the pool. You have a bright, successful future. You were the first Fighting Scot Swimmer I met at Wooster. I hope our paths cross in the future. Good Luck. Bob Klumpp

    1. I personally like examining how coaches do and the data of athletes so maybe as an assistant to a coach or management. I would love to work for larger sports organizations and give advice based on data analytics. That would really be the path I would like to take, but who knows!
      Thank you for your support and encouragement, it means a lot to me! Go Scots!

  18. It is a privilege to see your work. Well done. I am wondering anecdotally how Coaches Chris Collins (.598) and Andy Enfield (.420) survived your model’s computation of firing probability? I don’t follow this sport. Might there be an outlier variable that if considered would pull them in?

    1. Thank you! In this case I think salary might be a factor. I did look into Chris Collins, and many news sources seem to paint the picture that he was hit with unforeseen events such as player injuries. So data on players might also help there.

      1. Thank you Eli. I figured you would have looked into Coach Collins. Coaches should be watching this project and actively working the variables to keep their positions. SMILES

  19. So proud of you Eli! It has been great to learn and hear and watch as you worked from start to finish on this project. Definitely a great learning experience and lots of hard work.

    I’m sorry I didn’t get to experience IS Symposium in person as planned at the College of Wooster but so glad you presented to the family and participated in this virtual event.

    Congratulations on your IS accomplishments!

    Love,
    Mom

  20. Hi Eli,
    It’s great to see your project here, and to learn more about what you have been working on this year — congratulations on a job well done!
    🙂
