STEM Success Initiative Team

AMRE | Stem Success Initiative

October 22, 2020

Prediction of Retention and Persistence

Brendan Dufty ’22, Math
Sky Gill ’22, Statistical and Data Science
Bang Nguyen ’22, Computer Science
Ariel Xie ’21, Math and Physics

Advisors: Rob Kelvey, Jillian Morrison

With the goal of identifying predictive factors for STEM student retention and persistence at The College of Wooster, the team used data from across The College to create and analyze statistical models.

This project was made possible by financial support from the following organizations: Hamburger Endowment for Collaborative Projects and Program Development and The Stem Success Initiative.

Members of Stem Success Initiative will be online to field comments on Nov. 5 from 11am-1pm.

51 thoughts on “AMRE | Stem Success Initiative”

  1. So great you were able to focus on the SSI for the College. Your presentation style is wonderful and helped me understand what the four of you accomplished.

    1. Thank you! It was great to be working on something that directly impacts the College and we are glad you enjoyed our project!

      1. This is great! I’m a Wooster physics alum and teach college physics now. (I also grew up in Westfield Center!)

        My question is: You’ve classified underrepresented groups by race/ethnicity, and considered gender separately. Is it possible or would it make sense to take into account the degree to which gender is also an axis of underrepresentation? For example, women are far more underrepresented in physics and CS than biology or chemistry. Or are the numbers too small for this to help? Thanks!

        1. Thank you for your comments and question! (Also hello to another Westfield Center resident!)

          We did look into gender rather extensively at the beginning of our analysis and we found that overall in STEM at Wooster, men and women are about equally represented. However, we found, as you said, that some majors (like CS and physics) have far fewer women than men, while others show the opposite gender representation. We were using the SSI’s definition of underrepresented, which does not include gender, so while we did look at some differences between genders, we did not include this in the definition of underrepresented. Additionally, because we were mostly looking at STEM broadly at Wooster, there is not a huge gender gap. Future research that looks more deeply into specific majors should definitely take gender into consideration. Lastly, as you said, the numbers were quite small for many majors, so even if we wanted to include gender, we feel our sample size may not be big enough to produce accurate results.

        2. We did some data analysis surrounding gender and underrepresentation, and since we found that the SSI’s definition of UR students tended to be significantly more underrepresented than women in STEM, we primarily focused on those students. Additionally, the STEM fields as a whole have the same gender breakdown as the College does.

          However, you are absolutely right that certain disciplines have immense female underrepresentation. These majors tended to be very small, though, and most of our analysis and modeling was focused on the entire STEM field to have the largest possible data set to draw from.

          I hope this answers your question.

        3. Thank you for your comments!

          We did look at underrepresented status from a variety of aspects – gender, racial identity/ethnicity, and first-generation status. We considered each of the aspects separately because they have different relationships with persistence in STEM. As you said, women are underrepresented in Physics and CS. On the other hand, whether a student is first generation is related to their persistence in Biology. Of course, there is more to underrepresented status, such as LGBTQ+ identity and so on, but unfortunately we do not have the data available. That said, we hope to address underrepresentation from different perspectives that best explain student persistence in each STEM major.

  2. Thank you for the presentation and work on this project. I have a couple of questions:

    The primary conclusion, as I understand it, from analyzing the data in scope was that students with a higher mean grade who take more STEM classes have higher retention. While the conclusion that doing well and being consistent in STEM creates retention is meaningful, what other datasets could have been used to enrich the existing dataset to explore the reasons for students doing well and desiring consistency in STEM classes?

    As a follow-up to the first question, if you knew the professors each student had across all their STEM classes, how might you have included that in your modeling?

    As I understand it, you had data to look at a student’s journey and progression through STEM classes, starting at entry level and moving on to advanced. Did you consider the use of neural networks to model the prediction of a student progressing based on all the STEM classes in their class network?

    Thanks for the thoughts.

    1. Thank you so much for your comments.

      We spent a lot of time looking for additional datasets to add to our modelling, and what you see here are just the most significant results we were able to find. Of course, beyond that, there are always more variables and more data sets for us to investigate. Since we were only looking at courses related to the SSI, I think that having access to more higher-level classes might have shown us some interesting things.

      We actually did know which professors taught each course, and we explored that section of the data, but found that it did not help us answer the questions of retention and persistence that we were trying to answer.

      Finally, we did use networks as a visualization tool, but we did not have enough time to consider or implement neural networks as a model. Bang and Ariel’s models are based on algorithmic prediction though!

  3. Great work, and a very good, concise presentation. The conclusions, as well as the framework you have created (which can be used in the future with another generation of data), should be very helpful to the College and SSI in particular.

    1. Thank you! We are glad that you enjoyed our work and it is absolutely our hope that our findings and models will be useful in the future!

  4. Thank you so much for your comments.

    1. Thank you Dr. Pasteur. We are glad that our results and analyses can be helpful in supporting STEM students, especially those that are underrepresented.

  5. I have some questions:
    1. Did you look at the effect of STEM Success Initiative programs (specifically, say, the number of visits to the STEM Zone) on retention and persistence?
    2. Did you censor your data to exclude students who did not persist because of external factors (e.g., transferring or dropping out after first year for health or economic reasons)?
    3. I was not clear (perhaps because it went by quickly) whether you were able to censor your data to focus on incoming students who expressed an interest in a STEM major. Did you?
    Thank you so much!

    1. Thank you for your questions!

      1. We did look at the number of visits to the STEM Zone. The “Total Help Visits” variable you see in the logistic model results is the number of STEM Zone visits plus the number of visits to the Math Center. We found that the number of help visits and chances of retention and/or persistence have a positive relationship (as one increases, so does the other).

      2. We did not include these students in the retention analysis because the definition of retention we used is “starting in STEM in your first year and graduating with a STEM degree.” Students who dropped out or transferred would therefore not be included, since they did not graduate from Wooster.

      3. We were able to do some analysis using data from admissions, which included whether a student was interested in STEM. This data from admissions was only from the class of 2018 onward, so as more data is collected in the future, models using this data will become more accurate.

    2. Thank you so much for your comments. Here are some answers to your questions.

      1. Yes, we did look at the visits to the STEM Zone. However, our data only applies to the Chemistry department and it cannot be used to predict retention. We also did some analysis within the Chemistry department using the STEM Zone visits when doing the data exploration at the beginning. It turns out that the visits don’t show a clear relationship with grade. Another problem is that we have a lot of NA rows. Hence, in order to include this column, we would have to filter out a lot of students, which we don’t want to do because our data is already very limited. Eventually we decided to focus on pre- and post-SSI instead of specific programs, but I think this is definitely a great thing to be explored in the future once we have more data.
      2. Yes, we did filter out these students in our analysis.
      3. We built models to predict retention, but both models need first-year data. We tried several potential models using only high school and application data. Unfortunately, they did not predict well enough.
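
As a rough illustration of the two definitions given in the replies above (the retention definition and the “Total Help Visits” variable), here is a minimal Python sketch. It is not the team's code: the `Student` record, its field names, and the toy records are all hypothetical, and treating "graduated from Wooster" as the cohort filter is one reading of the reply. Fitting the logistic model itself is not shown.

```python
from dataclasses import dataclass


@dataclass
class Student:
    # Hypothetical fields; the team's actual data schema is not given in the source.
    started_in_stem: bool             # started in STEM in the first year
    graduated_from_wooster: bool      # False for students who transferred or dropped out
    graduated_with_stem_degree: bool
    stem_zone_visits: int
    math_center_visits: int


def total_help_visits(s: Student) -> int:
    """STEM Zone visits plus Math Center visits, as described in the reply."""
    return s.stem_zone_visits + s.math_center_visits


def in_retention_cohort(s: Student) -> bool:
    """Assumed cohort: started in STEM and graduated from Wooster."""
    return s.started_in_stem and s.graduated_from_wooster


def retained(s: Student) -> bool:
    """Retention: started in STEM in the first year and graduated with a STEM degree."""
    return in_retention_cohort(s) and s.graduated_with_stem_degree


def retention_rate(students: list[Student]) -> float:
    """Share of the cohort that was retained; raises ValueError on an empty cohort."""
    cohort = [s for s in students if in_retention_cohort(s)]
    if not cohort:
        raise ValueError("no students in the retention cohort")
    return sum(retained(s) for s in cohort) / len(cohort)


# Made-up records for illustration only; they are not the project's data.
students = [
    Student(True, True, True, 3, 2),     # in cohort, retained
    Student(True, True, False, 0, 1),    # in cohort, left STEM before graduating
    Student(True, False, False, 4, 0),   # transferred or dropped out: excluded
    Student(False, True, True, 0, 0),    # did not start in STEM: excluded
]

cohort = [s for s in students if in_retention_cohort(s)]
print(total_help_visits(students[0]), len(cohort), retention_rate(students))
```

Under these assumptions, students who transferred or dropped out never enter the denominator, which matches the replies saying they were filtered out rather than counted as not retained.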

  6. Very nice job, Team! I appreciated hearing some of you speak about working remotely this summer and would love to hear how you’re applying the skills and strategies you learned to your classes at Wooster this fall.

    1. Thank you very much, Cathy! We’ve learned to use Teams to have meetings in the most effective way depending on the occasion. For example, we would use the chat box to ask quick questions and leave short updates but would share screen to work on coding complicated problems (networks) together. This variety of experiences in Teams has really helped us adapt to the different types of virtual discussion that we have in class this Fall.

    2. Thank you for your comments! Personally, I feel like I’m a lot more comfortable and familiar working in groups and problem solving remotely because of this internship, which has been very helpful this semester. Also, as I continue with my data science classes, I have been able to apply some of the coding skills I learned!

  7. SDS Team. What you all did for the College is really important work as we try to stay relevant to changes in our culture and in the world of higher education. Thanks for your energy and focus during such a very strange AMRE summer. You have obtained some really good skills in the remote consulting that you did and don’t hesitate to bug us if you need a reference for further education or work! (Or if you want to just gab and talk about the ice cream that you didn’t get to eat.)

    1. Nothing would make me happier than gabbing with you about ice-cream we didn’t eat Dr. Ramsay!

      AMRE was an amazing opportunity and I am so happy I was able to take advantage of it.

  8. Very impressive work, Ariel, Bang, Brendan, and Sky! Your work will no doubt have a positive impact on SSI and the College as a whole– Thank you!

    1. Thank you very much, Mae! We are excited that the SSI and the College can use our results to better support students in STEM.

  9. I actually tried to write something funny, but it didn’t work because I used ‘<’ symbols. Ah well. Good job!

  10. Hello Brendan, Sky, Bang and Ariel,

    Firstly, congratulations on completing a successful AMRE project and that too remotely! What a difficult journey it must have been to work on a predictive model when the entire mode of teaching and learning was changing.

    I am an AMRE Alum from 2016 and we were a group of Social Science majors. We worked on a project predicting Student Retention in general (not just for STEM). We found many qualitative factors that predicted student retention such as sense of belongingness and personality. Although not directly related, did you use our project for reference?

    It is interesting to learn that the number of first-year STEM classes is a major indicator of STEM retention. It would be even more interesting to explore if you can control for cultural influence. Hypothetically, an under-represented student would be more likely to take fewer STEM courses in the first year and then over time, with our wonderful and supportive resources, gain confidence and begin taking more classes. (Yes, I was a Psychology double major). What are your thoughts on this?

    Great work team! And all my best wishes to you in your future at Wooster (yay you did it).

    1. So SSI will now encourage under-represented students and women to take more classes in their first year or improve their mean grade in order to retain them in STEM, correct?

      1. Sadly, we were not aware of your project so we did not reference your project.

        For your last two points, our models are not precise enough to fully answer the question about how an underrepresented student will take classes. Additionally, we have no idea how the SSI will use and interpret our models. We simply analyzed data and created predictive models; our goal was not to come up with solutions or plans of action.

        1. Spoken like a true consultant! AMRE has taught you well. Thank you for answering my questions, that makes sense. Maybe as a follow-up, this is an analysis SSI can take up to complement your findings.

          1. Just adding on to Brendan’s comment on your first question, the qualitative factors related to student retention and persistence are indeed interesting to look at and we did try including that in our analysis using data from the SSI. At the end of every semester, the SSI will send out a survey that asks students about their identities as STEM students and how connected they feel to the STEM community. However, our analysis did not find any major relationship with retention and persistence. Also, the data is only available for some of the classes in recent years. So, we hope with more data in the future, more interesting results can be found!

  11. Well done!
    I am a physics alum and high school physics teacher.
    I would be interested in the response to alum Amy Lyle’s question above.
    Thank you!

    1. We did some data analysis surrounding gender and underrepresentation, and since we found that the SSI’s definition of UR students tended to be significantly more underrepresented than women in STEM, we primarily focused on those students. Additionally, the STEM fields as a whole have the same gender breakdown as the College does.

      However, you are absolutely right that certain disciplines have immense female underrepresentation. These majors tended to be very small, though, and most of our analysis and modeling was focused on the entire STEM field to have the largest possible data set to draw from.

      I hope this answers your question.

  12. So these questions are for Bang, Sky, Brendan, or Ariel:

    Once you guys applied and were accepted to work for AMRE this past summer, were you guys randomly put into this group for this project, was it based upon similar interests etc.? Also, how did you guys decide that you wanted to do your project for the STEM Success Initiative (SSI) here at The College of Wooster?

    1. Hi Burim! Thanks for the questions!

      We did not choose the project ourselves but were assigned by the AMRE committee based on the set of skills and interests that we demonstrated in our application. I hope that answers your question!

  13. This is an interesting and important project in support of STEM success – you should be very proud of your work!

    Congrats!

Comments are closed.