Student tuition fees and disadvantaged applicants

Those of you who have known me for a while will remember that I used to blog on the now defunct Dianthus Medical website. The Internet Archive has kept some of those blogposts for posterity, but sadly not all of them. As I promised when I started this blog, I will get round to putting all those posts back on the internet one of these days, but I’m afraid I haven’t got round to that just yet.

But in the meantime, I’m going to repost one of those blogposts here, as it has just become beautifully relevant again. About this time last year, UCAS (the body responsible for university admissions in the UK) published a report which claimed to show that applications to university from disadvantaged young people were increasing proportionately more than applications from the more affluent, or in other words, the gap between rich and poor was narrowing.

Sadly, the report showed no such thing. The claim was based on a schoolboy error in statistics.

Anyway, UCAS have recently published their next annual report. Again, this claims to show that the gap between rich and poor is narrowing, but doesn’t. Again, we see the same inaccurate headlines in the media that naively take the report’s conclusions at face value, and we see exactly the same schoolboy error in the way the statistics were analysed in the report.

So as what I wrote last year is still completely relevant today, here goes…

One of the most significant political events of the current Parliament has been the huge increase in student tuition fees, which means that most university students now need to pay £9000 per year for their education.

One of the arguments against this rise used by its opponents was that it would put off young people from disadvantaged backgrounds from applying to university. Supporters of the new system argued that it would not, as students can borrow the money via a student loan to be paid back over a period of decades, so no-one would have to find the money up front.

The new fees came into effect in 2012, so we should now have some empirical data that should allow us to find out who was right. So what do the statistics show? Have people from disadvantaged backgrounds been deterred from applying to university?

A report was published earlier this year by UCAS, the organisation responsible for handling applications to university. This specifically addresses the question of applications from disadvantaged areas. This shows (see page 17 of the report) that although there was a small drop in application rates from the most disadvantaged areas immediately after the new fees came into effect, from 18.0% in 2011 to 17.5% in 2012, the rates have since risen to 20.5% in 2014. And the ratio of the rate of applications from the most advantaged areas to the most disadvantaged areas fell from 3.0 in 2011 to 2.5 in 2014.

So, case closed, then? Clearly the new fees have not stopped people from disadvantaged areas applying to university?

Actually, no. It’s really not that simple. You see, there is a big statistical problem with the data.

That problem is known as regression to the mean. This is a tendency of characteristics with particularly high or low values to become more like average values over time. It’s something we know all about in clinical trials, and is one of the reasons why clinical trials need to include control groups if they are going to give reliable data. For example, in a trial of a medication for high blood pressure, you would expect patients’ blood pressure to decrease during the trial no matter what you do to them, as they had to have high blood pressure at the start of the trial or they wouldn’t have been included in it in the first place.

In the case of the university admission statistics, the specific problem is the precise way in which “disadvantaged areas” and “advantaged areas” were defined.

The advantage or disadvantage of an area was defined by the proportion of young people participating in higher education during the period 2000 to 2004. Since the “disadvantaged” areas were specifically defined as those areas that had previously had the lowest participation rates, it is pretty much inevitable that those rates would increase, no matter what the underlying trends were.

Similarly, the most advantaged areas were almost certain to see decreases in participation rates (at least relatively speaking, though this is somewhat complicated by the fact that overall participation rates have increased since 2004).

So the finding that the ratio of applications from most advantaged areas to those from least advantaged areas has decreased was exactly what we would expect from regression to the mean. I’m afraid this does not provide evidence that the new tuition fee regime has been beneficial to people from disadvantaged backgrounds. It is very hard to disentangle any real changes in participation rates from different backgrounds from the effects of regression to the mean.
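To make the mechanism concrete, here is a minimal simulation sketch of how the ratio can fall with no real change at all. Everything in it is an assumption for illustration, not UCAS data or the UCAS method: the number of areas, the cohort size, the range of underlying rates, and the choice to compare the top and bottom fifths of areas are all hypothetical. Each area has a fixed underlying application rate that never changes; only the chance variation from one cohort to the next differs between the two periods.

```python
import random


def observed_rate(rng, true_rate, cohort_size):
    """Share of one cohort that applies, given a fixed underlying rate."""
    applicants = sum(rng.random() < true_rate for _ in range(cohort_size))
    return applicants / cohort_size


def mean_over(values, indices):
    """Average of the values at the given positions."""
    return sum(values[i] for i in indices) / len(indices)


def simulate(n_areas=2000, cohort_size=100, seed=1):
    """Return (ratio in classification period, ratio in a later period)."""
    rng = random.Random(seed)
    # Hypothetical underlying rates; they stay the same in both periods.
    true_rates = [rng.uniform(0.15, 0.45) for _ in range(n_areas)]
    baseline = [observed_rate(rng, p, cohort_size) for p in true_rates]
    later = [observed_rate(rng, p, cohort_size) for p in true_rates]

    # Classify areas using the baseline period only, as the report did
    # with historical participation (the use of fifths is an assumption).
    ranked = sorted(range(n_areas), key=lambda i: baseline[i])
    fifth = n_areas // 5
    lowest, highest = ranked[:fifth], ranked[-fifth:]

    ratio_baseline = mean_over(baseline, highest) / mean_over(baseline, lowest)
    ratio_later = mean_over(later, highest) / mean_over(later, lowest)
    return ratio_baseline, ratio_later


ratio_baseline, ratio_later = simulate()
print(f"ratio in the classification period: {ratio_baseline:.2f}")
print(f"ratio in a later period:            {ratio_later:.2f}")
```

In this sketch the later ratio comes out smaller than the baseline ratio even though no area's underlying rate has moved, simply because the groups were picked on the basis of the baseline figures. How big the effect is depends entirely on the made-up parameters, so it says nothing about how much of the real change is regression to the mean.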

Unless anyone can point me to any better statistics on university applications from disadvantaged backgrounds, I think the question of whether the new tuition fee regime has helped or hindered social inequalities in higher education remains open.

14 thoughts on “Student tuition fees and disadvantaged applicants”

  1. Thanks for this interesting analysis.
    There was a time when the statistics were based on SES of the individual, as determined by parental income, etc. Postcode data is obviously much easier to obtain, but there is then the problem of deciding how to relate it to SES. Not only do we have the problem you have noted, of regression to the mean when a particular year is used to calibrate this, but also it’s just a very rough index. There’s an interesting report on medical students showing that most come from high SES backgrounds, regardless of the ‘neighbourhood’ categorisation. You can find it by googling: “Socioeconomic status of applicants to UKCAT Consortium.” This suggests we really do need some data on individualised SES before we can conclude much about relative rates of application to university.

    1. Yes, Dorothy, you make a good point there. While postcode data can sometimes be a useful proxy for SES, you’re never going to get such good results as if you actually look at the individual SES data.

      I think if they had used such data in this report, we’d be a lot closer to answering the question about whether the gap between rich and poor really is narrowing.

  2. The same UCAS End of Cycle report says that entry from FSM eligible has consistently increased (Figure 82) from 9.1% in 2006 to 11.4% in 2010 and 15.3% in 2014 so looks like real improvement.

    Though it is of course plausible some of the improvement is due to higher post-16 education participation and better A-level results rather than improved chances of university entry given results (haven’t seen any figures either way on this)

    1. Interesting point, Peter.

      I had a look at the numbers behind figure 82. The proportion of applicants has increased in both the FSM group and the non-FSM group. Although the ratio between the 2 has decreased (thus supporting the claim that the gap between rich and poor is narrowing), the drop in the ratio is not large: from 2.2 in 2011 to 2.0 in 2014.

      I think before I read too much into such a small change, I’d want to be really sure that the eligibility criteria for FSM hadn’t changed over that time period. Don’t suppose you happen to know anything about that, do you?

      1. I agree the ratios haven’t changed much but it seems clear that the proportion eligible for FSM has increased significantly from 2010-11 (i.e. pre-fee reform) making it difficult to stack up an argument that disadvantaged school leavers have been discouraged from applying (notwithstanding the hypothetical change in composition of FSM population you raise – might be something in that as we’re comparing 2007-08 GCSE cohort with 2011-12).

        Of course, there’s a possibility that progression conditional on A-level grades has decreased (though as I said I haven’t seen evidence either way) and outcomes for older students (more likely to be disadvantaged having a “second chance” at HE in their 20s) have worsened significantly (e.g. I think I’m right in saying part-time entrants have halved in England)

  3. Assuming UCAS still asks applicants for information on parental occupation, it would be possible to use the NS-SEC to assess the issue of whether those from disadvantaged backgrounds are or are not being put off by higher fees.

  4. Hmm, that’s interesting. Is it possible that an area which was disadvantaged in 2004 under the measure of “proportion of young people participating in higher education during the period 2000 to 2004” may no longer be as disadvantaged? Maybe it has experienced some kind of gentrification.

    Sorry if that’s a daft thought. My gut feeling is that there is very little change of university access from the poorest communities, and I suspect it has little to do with the way that student loans/grants are arranged.

    1. Not a daft thought at all, Joe, in fact that’s exactly the problem. If a postcode area was in the most disadvantaged group in 2000 to 2004, then if it’s going to change at all, it can only go in one direction.

  5. David makes a good point. UCAS still collect data on parental occupation, although there is a break in the series in 2008 when the question changes slightly. But, surprise, surprise, in 2009 UCAS stopped publication of application and admissions data on the basis of individual level parental NS-SEC in favour of the crappy post-code based measure. Over and over again I find this sort of thing. One arm of government bangs on about social mobility while another refuses to release into the public domain the information needed to evaluate the facts of the matter. A cynic would say that they aren’t really serious…

    1. HESA still publish data based on NS-SEC in their widening participation PIs, though data is only available up to 2012-13 entry (I think 2013-14 data is coming in March)

      It’s what the Social Mobility and Child Poverty Commission used in its report on trends in access to Russell Group institutions back in 2013 (though all that data is pre-fees reform)

      I think UCAS discontinued use of NS-SEC as there are ongoing concerns about data quality (it’s obviously a subjective variable based on respondent’s own interpretations of parental occupation rather than the objective postcode/FSM measures).

  6. Hi Adam – belatedly, here’s the promised comment to follow up our interesting exchange on Twitter.

    Let’s start with the dice rolling analogy I used. Imagine I roll a dice, get a 1, say ‘oh there’s some extra weight on one side which is biasing the dice so I’ll give it a polish’, then roll again and get a higher number.

    We can both agree that due to regression to the mean, there’s nothing significant about the polishing.

    But supposing I roll the dice a billion times, getting a 1 every single time. Then I do my magic polish and get a 4.

    This time round it’s much more plausible that the polish made a difference – because at some point the repeated result below (what we think is) the mean indicates that it isn’t actually the mean for the dice.

    After a billion rolls getting 1, we know the dice isn’t a normal average dice with a mean of 3.5 but is rather a disadvantaged, loaded dice destined to a life of 1s unless we intervene and do something.

    That seems to me the situation with the application data. Some areas have been consistently below average – not a billion times, but more than once. It’s the very repetition of their below average scores which indicates that actually they’ve not got the same average as other areas.

    Hence an improvement in their scores isn’t a ‘regression to the mean’ as that’s someone else’s mean. Rather it’s an increase in their own scores which – as long as the increase is big enough to be statistically significant (which it looks to be in this case) – is genuine change rather than random variation.

    1. The problem here is that it’s not a simple either/or situation. It’s not just pure luck or predictable characteristics. It’s very likely to be a mixture of both.

      So of those areas classified as the most disadvantaged, some will truly be the most disadvantaged, while others will not be quite so disadvantaged, but will have just by chance had low numbers of young people going to university. And the problem is that it’s very difficult to disentangle the effects.

      Let’s use your dice analogy. Imagine you throw thousands of dice. And imagine they’re not necessarily all fair dice: some may be more likely to come up 1, others may be more likely to come up 6. But it’s just a bias: it’s not complete predictability. Some of the dice biased towards 1 may come up 6 from time to time.

      Now let’s imagine you classify those dice on the basis of how often they came up 1 after you threw them all a few times. Some will come up 1 every time: maybe they were biased towards one, maybe they were just unlucky. After all, you’re throwing thousands of them, so there’s plenty of scope for luck.

      If you take those same dice and throw them again, you’ll probably get higher scores next time. I think that’s a better analogy than your die that you’re rolling a billion times. You have very many dice, and you’re only rolling them a small number of times each.

      1. That’s a good extension of the dice analogy. We agree, I think, that somewhere between rolling the dice twice and a billion times the best explanation switches from RTM to genuine change.

        The question is, how many? My rough calculation when originally tweeting was that if we assume a typical random normal distribution, then for an under-average area one year to appear under-average the next there is a 1 in 2 chance, and for a further year that becomes a 1 in 4 chance of it always being under average. So after just one initial year and then, say, five further years we’re at odds as low as 1 in 32, which is less than that favourite 5% rule of thumb.

        I freely admit that’s just a rough scoping calculation but one which gave me the feel that it’s very plausible, indeed very likely, that what’s happened is not RTM.

        As your original comments about it not being RTM were fairly absolute I guess you did some calculations to work this out (probably more rigorously than I!). What were the sums you crunched?

        1. You’re just thinking about a single area. There are a great many areas. If we take your calculations as a rough first approximation, then if you have 32 areas, then you’d expect one of them to be below average for 5 years.

          I’m not sure off the top of my head how many postcode areas there are, but I’m pretty sure it’s a lot more than 32.
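The arithmetic in that last exchange can be sketched in a few lines. This is only a rough illustration under the same simplifying assumptions used in the comments: each area has a 1 in 2 chance of being below average in any given year, and years are independent. The area counts other than 32 are hypothetical placeholders, since the real number of postcode areas is not given here.

```python
def prob_below_average_run(further_years):
    """Chance that one area is below average in every one of the given
    further years, assuming an independent 1 in 2 chance each year."""
    return 0.5 ** further_years


def expected_areas_with_run(n_areas, further_years):
    """Expected number of areas showing such a run purely by chance."""
    return n_areas * prob_below_average_run(further_years)


# 1 in 32 for a single area over five further years.
print(prob_below_average_run(5))  # 0.03125

# Hypothetical numbers of areas; only 32 comes from the discussion above.
for n_areas in (32, 1000, 10000):
    print(n_areas, expected_areas_with_run(n_areas, 5))
```

With 32 areas the expected count is 1, and it scales in proportion to the number of areas, so a 1 in 32 chance for a single area stops being remarkable once there are many areas to choose from.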
