Projects for 2012-13

Yield models in college admissions

Description: Accurately forecasting yield (the percentage of admitted applicants who enroll) is an important yet challenging part of a college’s admissions process. Having a reliable yield model helps a college decide on the optimal number and composition of students to admit; admitting too few students can lead to financial pressures, while admitting too many students can lead to pressures on class size and housing. The St. Olaf Office of Admissions is interested in evaluating and potentially improving its current yield models for admitted students. The probability that a particular admitted student enrolls at St. Olaf will vary by individual, but a healthy amount of data is available for each student who applies – from test scores to extracurricular activities to high school attended to financial aid and much more – that can be used to create a yield model. In addition, the effects of individual characteristics on probability of enrollment will be of interest to Admissions staff. Data on enrolled students can also be used to determine predictors of success in a student’s first semester at St. Olaf.
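As a rough illustration of how a yield model could be used, the sketch below assumes a logistic model for each admitted student's probability of enrolling and sums those probabilities to get an expected class size. The predictor names, coefficients, and student records are all made up for illustration; they are not drawn from St. Olaf data or from the current Admissions model.

```python
import math

def enroll_probability(features, coefficients, intercept):
    """Logistic model: P(enroll) = 1 / (1 + exp(-(intercept + sum(b_i * x_i))))."""
    score = intercept + sum(coefficients[name] * value
                            for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-score))

def expected_class_size(admitted, coefficients, intercept):
    """Expected number of enrollees is the sum of individual probabilities."""
    return sum(enroll_probability(student, coefficients, intercept)
               for student in admitted)

# Hypothetical coefficients and predictors (assumptions, not fitted values).
COEFFICIENTS = {"campus_visit": 1.2, "aid_offer_thousands": 0.05, "in_state": 0.4}
INTERCEPT = -1.5

# Three made-up admitted students.
admitted = [
    {"campus_visit": 1, "aid_offer_thousands": 20, "in_state": 1},
    {"campus_visit": 0, "aid_offer_thousands": 10, "in_state": 0},
    {"campus_visit": 1, "aid_offer_thousands": 0, "in_state": 0},
]

expected = expected_class_size(admitted, COEFFICIENTS, INTERCEPT)
predicted_yield = expected / len(admitted)
print(f"Expected enrollees: {expected:.2f} of {len(admitted)} admitted")
print(f"Predicted yield: {predicted_yield:.1%}")
```

In practice the coefficients would be estimated from historical admissions data, and their signs and sizes are what would describe the effect of each characteristic on the probability of enrollment.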

Domain Expert: Admissions Staff (Admissions)

The Role of Metacognitive Assignments in Introductory Biology Courses

Description: Many entering biology students comment that they thought they had studied well and understood the material, but then do not do well on exams. This mismatch between how students expect to perform and how they actually perform suggests these students have trouble self-assessing and therefore may not have the metacognitive skills that other students seem to have developed. Research shows that metacognitive skills are important predictors of academic success, that lower-scoring students tend to be less metacognitive, but that students can learn to be more metacognitive. Our metacognitive research project has involved two different introductory biology classes, each of which has been taught several times by different faculty. Each class was divided in half, with specific metacognitive assignments created for one half of the class and slightly different assignments created for the other half. Exam scores as well as responses to metacognitive surveys will be examined.

Domain Expert: Diane Angell (Biology)

Evaluating the Cause and Impact of Hospital Closings


As financial pressures in the health care industry mount, hospital closings have become more commonplace than many believe is optimal. Sometimes this amounts to an entire hospital shutting its doors; other times a hospital will simply shut down a department within the hospital that is consistently losing money. Emergency departments are particularly vulnerable.

In this project, we would investigate the cause of hospital closings by looking at the impact of these closings on surrounding hospitals. One hypothesis is that these hospitals are closing because of poor management. An alternative hypothesis is that these hospitals are treating the least profitable patients (the poorest, the sickest, and patients with illnesses that are not generally well-reimbursed by insurers). If the former is true, surrounding hospitals may benefit from hospital closings by accruing a larger market share. If the latter is true, hospital closings will place a financial burden on other hospitals in the surrounding region. The implications of this project could inform policymakers who are concerned with the stability of the health care system going forward.

The data set for this project is a comprehensive 10-year data set covering all hospitals in the state of California, including individual patient diagnosis and treatment information as well as hospital financial data (the data come from the Office of Statewide Health Planning and Development, OSHPD).

Key steps in this project:

1) Identifying relevant hospital closings in California between 2001 and 2010 and researching potential key factors leading to the hospital closing (this can be done both by mining the data and also by searching news articles near the time of the closing).

2) Identifying substitute hospitals by looking at patient zip code information.

3) Using differences-in-differences to evaluate the impact of the hospital closings on various key statistics within substitute hospitals.
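The differences-in-differences idea in step 3 can be sketched with a small calculation: compare the before-to-after change at substitute hospitals with the change over the same period at comparison hospitals that were not near a closing. The operating-margin figures below are made up for illustration, and the choice of comparison group is an assumption, not part of the project description.

```python
from statistics import mean

def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus change in the control group."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change

# Made-up operating margins (percent) before and after a nearby closing.
substitute_pre = [4.0, 5.0, 6.0]    # substitute hospitals, before
substitute_post = [3.0, 4.0, 5.0]   # substitute hospitals, after
comparison_pre = [4.5, 5.5]         # hypothetical comparison hospitals, before
comparison_post = [4.0, 5.0]        # hypothetical comparison hospitals, after

estimate = diff_in_diff(substitute_pre, substitute_post,
                        comparison_pre, comparison_post)
print(f"Difference-in-differences estimate: {estimate:+.2f} percentage points")
```

Subtracting the comparison group's change removes trends that affect all hospitals, so what remains is attributed to the closing, provided the two groups would otherwise have followed parallel trends. A full analysis would normally use a regression with hospital and year effects rather than simple group means.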

Domain Expert: Ashley Hodgson (Economics)

Presidential Battleground Strategies and Their Effects on Citizens’ Political Attitudes and Behavior

Description: This project focuses on presidential elections and, in particular, on the fact that presidential candidates tend to campaign only in those states they deem to be battlegrounds, such as Florida or Ohio. They almost completely ignore all other states. Many political commentators and even some scholars have said this is an inherently undemocratic feature of the American political process. This research tries to test this claim by investigating how voters think and feel about politics depending on whether they live in a battleground or safe state. For instance, do voters in battleground states feel better represented? Are they more interested in politics? The statistical work would involve analyzing a 2008 national survey collected by the American National Election Studies (ANES) at the University of Michigan. Because the presidential election happens this fall, we will also analyze trends during that election and use the ANES 2012 once it is published in the spring.

Domain Expert: Henriet Hendriks (Political Science)

Analyzing Baseball Strategy


Recently, a great deal of thought has gone into developing new statistics for analyzing player performance. Less thought has gone into developing methods for analyzing baseball strategy. For example, what effect does lineup order have on total runs? When is bunting a good idea? Is it better to have more high “on base” players or more “slugging” hitters?

One way to approach this problem is through simulation, specifically stochastic simulation. In this project, I want to build a computer simulation that represents a probabilistic path through a baseball game. This is much like von Neumann’s original approach to estimating neutron scattering at Los Alamos. Instead of trying to analytically determine the values of certain parameters (e.g., average runs scored), the aim is to estimate these parameters via simulation. The key in this simulation will be to develop Markov chain transition matrices for each major league batter and then use these matrices to drive the simulation. The goal is to build a system in which every major league team is represented, arbitrary lineups of players can be selected, strategies implemented, etc. The simulator should be capable of running millions of games in order to accurately estimate the parameters in question.
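A minimal sketch of this kind of stochastic simulation is shown below, written in Python for brevity rather than a compiled language. It is not the project's simulator: the batter probabilities are made up, every batter in the lineup is identical, and runners are assumed to advance exactly as many bases as the batter on a hit. Each plate appearance moves the game from one (bases, outs) state to another, which is the Markov chain step the per-batter transition matrices would encode.

```python
import random

# Hypothetical per-plate-appearance event probabilities (made up; sum to 1).
AVERAGE_BATTER = {"out": 0.68, "walk": 0.08, "single": 0.15,
                  "double": 0.05, "triple": 0.01, "homer": 0.03}
HIT_BASES = {"single": 1, "double": 2, "triple": 3, "homer": 4}

def advance(bases, event):
    """Return (new_bases, runs) given a (first, second, third) occupancy tuple."""
    first, second, third = bases
    if event == "walk":
        # On a walk only forced runners move up.
        if first and second and third:
            return (True, True, True), 1
        if first and second:
            return (True, True, True), 0
        if first:
            return (True, True, third), 0
        return (True, second, third), 0
    n = HIT_BASES[event]
    runs = 0
    new = [False, False, False]
    # Simplifying assumption: each runner advances as many bases as the batter.
    for position, occupied in enumerate(bases, start=1):
        if occupied:
            if position + n >= 4:
                runs += 1
            else:
                new[position + n - 1] = True
    if n == 4:
        runs += 1          # the batter scores on a home run
    else:
        new[n - 1] = True  # the batter takes the base reached
    return tuple(new), runs

def simulate_game(lineup, rng, innings=9):
    """Simulate one team's half-innings and return its total runs."""
    runs, batter = 0, 0
    for _ in range(innings):
        outs, bases = 0, (False, False, False)
        while outs < 3:
            probs = lineup[batter % len(lineup)]
            event = rng.choices(list(probs), weights=list(probs.values()))[0]
            batter += 1
            if event == "out":
                outs += 1
            else:
                bases, scored = advance(bases, event)
                runs += scored
    return runs

def average_runs(lineup, games, seed=0):
    """Monte Carlo estimate of average runs per game for a lineup."""
    rng = random.Random(seed)
    return sum(simulate_game(lineup, rng) for _ in range(games)) / games

lineup = [AVERAGE_BATTER] * 9
print(f"Estimated runs per game: {average_runs(lineup, games=2000):.2f}")
```

Questions about lineup order could be explored by giving each of the nine slots its own probabilities and comparing `average_runs` across permutations; strategies such as bunting would require adding further events and state transitions.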

Background in probability, statistics, and/or computation (particularly in a compiled high level language such as C++) is preferred.

Domain Expert: Matt Richey (Mathematics, Statistics and Computer Science)

The Economic Future of Liberal Arts Colleges

Description: The escalating cost of higher education, which some have described as an affordability crisis, has received increasing attention from the media, Congressional committees, and academic researchers, spurring both debate within the sector about the causes of these cost increases and a search for new models. Although the liberal arts colleges are subject to the same cost pressures, until recently these institutions have been largely silent bystanders to these conversations. Furthermore, most recent empirical work on higher education costs either analyzes costs for the higher education sector as a whole or limits the disaggregation of these data to “private bachelors institutions,” a category much broader than the liberal arts colleges.

In the spring of 2012, this appeared to change when leaders of a group of largely highly selective liberal arts colleges met to discuss “The Future of the Liberal Arts College in America and its Leadership Role in Education around the World.” These discussions, which were widely reported in the higher education press, were highlighted by an acknowledgment of the ever-growing cost of educating students at liberal arts colleges, and by questions about whether there was anything that liberal arts colleges could do to reduce these costs while maintaining their educational effectiveness and their position in the higher education market.

As important as these questions are for the highly selective liberal arts colleges, they are even more pressing for the many liberal arts colleges that educate the majority of students who attend such institutions, but which have neither the endowments nor high income enrollees that are typical of highly selective colleges. Moreover, the ongoing economic crises, including the reduction in real wages for middle class families, may have an even greater impact on less selective and less well-endowed liberal arts colleges than on their wealthier peers.

The goal of this project is to analyze the economic context of the liberal arts colleges, and to explore how the experiences of the wealthiest institutions may differ from those of the less wealthy. My own target is to prepare a presentation for the annual meeting of the Association of American Colleges and Universities to be held in January 2013. Other opportunities for presenting the work might include the Associated Colleges of the Midwest and, more locally, the Faculty Conversations organized by St. Olaf’s Center for Innovation in the Liberal Arts.

Very little work has been done in this area, with the most recent example being the work by David Breneman in the early 1990s. Data limitations made this early work very challenging, but, fortunately, economic data for higher educational institutions are now available in a very large data set.

• A first task will be to learn how to manipulate these data.
• Subsequently, we will need to construct a subset of these data consisting just of the liberal arts colleges (and this will require making some decisions about what constitutes a liberal arts college).
• We will also need to determine the best way to rank the liberal arts colleges by some measure of financial capacity, and to use this measure to divide the colleges into quintiles, so that we can investigate the different experiences of these groups of institutions.
• We will need to identify an appropriate measure of costs and to plot changes in this measure over time.
• A next step will involve identifying the factors responsible for the cost increases we observe. As part of this analysis, we will need to determine if these factors play the same role in cost increases over time.
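The ranking step in the list above might look something like the following sketch, which sorts colleges by a single measure of financial capacity and splits them into five equal-sized groups. Endowment per student is used here only as an assumed example of such a measure, and the college names and values are made up.

```python
from collections import Counter

def assign_quintiles(capacity_by_college):
    """Map each college to a quintile: 1 = lowest capacity, 5 = highest."""
    ranked = sorted(capacity_by_college, key=capacity_by_college.get)
    n = len(ranked)
    return {name: (rank * 5) // n + 1 for rank, name in enumerate(ranked)}

# Made-up endowment per student, in thousands of dollars (hypothetical).
capacity = {
    "College A": 950.0, "College B": 40.0, "College C": 310.0,
    "College D": 75.0, "College E": 520.0, "College F": 22.0,
    "College G": 130.0, "College H": 180.0, "College I": 60.0,
    "College J": 700.0,
}

quintiles = assign_quintiles(capacity)
for name in sorted(quintiles, key=capacity.get):
    print(f"{name}: quintile {quintiles[name]}")
print(Counter(quintiles.values()))
```

Once each institution has a quintile label, a cost measure can be averaged within each group by year to compare how the wealthiest and least wealthy colleges have fared over time.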

Domain Expert: David Schodt (Economics)