Thoughts on Education, Technology and Development

# LAK11 Week 2 The Rise of Big Data in Education

Learning analytics is frequently hailed as the ultimate arrival of a data-driven paradigm in education. The randomized trials that have been applied so successfully in medicine, for example, could be transplanted to education. In a randomized trial, a sample is randomly divided into a group that receives the treatment and a control group that receives a placebo instead. A statistical test such as analysis of variance then indicates whether the difference between the groups is larger than chance alone would explain. Ian Ayres, a law professor at Yale Law School, shows in his book Super Crunchers that randomized trials and regression analysis already have wide appeal outside medicine. eHarmony, an online dating service, uses large data sets and regression analysis to work out which combinations of personal traits go well together, in order to present people with a better match. Internet firms such as Amazon and Pandora provide recommendations based on existing tastes or purchases. IBM’s “Smarter Planet” programme provides plenty of examples of analytics in city planning and environmental management.

The drivers are bigger and better data sets: data are increasingly collected in real time by machines, which makes them more reliable than data collected through questionnaires. Collecting and storing them is also becoming cheaper. In education, randomized trials can measure the effect of particular learning resources, such as a video or an animation, on student learning. Changes in curriculum structure could likewise be analyzed quantitatively.
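
To make the idea of a randomized trial in education concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the exam scores are simulated, the 8-point effect of the video is invented, and the function names `run_trial` and `permutation_p_value` are my own. Instead of analysis of variance it uses a simple permutation test, which asks the same question (could a gap this large arise by chance?) with less machinery.

```python
import random
import statistics


def run_trial(baseline_scores, effect, rng):
    """Randomly split students into a treatment and a control group.

    baseline_scores: simulated exam scores (hypothetical data).
    effect: points added to every treated student (an assumed true effect).
    """
    students = list(baseline_scores)
    rng.shuffle(students)                               # random assignment
    half = len(students) // 2
    treatment = [s + effect for s in students[:half]]   # e.g. watched the video
    control = students[half:]                           # did not watch it
    return treatment, control


def permutation_p_value(treatment, control, rng, rounds=2000):
    """Return the observed mean gap and a two-sided permutation p-value.

    The p-value is the share of random relabelings whose absolute mean gap
    is at least as large as the absolute observed gap.
    """
    observed = statistics.mean(treatment) - statistics.mean(control)
    pooled = treatment + control
    n = len(treatment)
    extreme = 0
    for _ in range(rounds):
        rng.shuffle(pooled)                             # pretend labels are arbitrary
        gap = statistics.mean(pooled[:n]) - statistics.mean(pooled[n:])
        if abs(gap) >= abs(observed):
            extreme += 1
    return observed, extreme / rounds


rng = random.Random(42)                                 # fixed seed for repeatability
baseline = [rng.gauss(60, 10) for _ in range(200)]      # simulated exam scores
treatment, control = run_trial(baseline, effect=8.0, rng=rng)
observed_gap, p_value = permutation_p_value(treatment, control, rng)
print(f"mean gap: {observed_gap:.2f}, p-value: {p_value:.4f}")
```

A small p-value here only says that the gap is unlikely to be a fluke of the random split. It says nothing about whether the test scores capture the learning we actually care about.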

Ayres points out that randomized trials and regression often do a better job than experts. A quantitative analysis of Supreme Court decisions proved a better predictor of outcomes than the judgments of legal experts. A regression model with three variables, developed by Orley Ashenfelter, an economics professor at Princeton, to predict the quality of Bordeaux wines, outperformed wine experts.

The main reason is that experts can’t quantify the role of each variable. Of course, experts also know that winter rainfall and temperature affect the quality of wine, but they can’t accurately assign weights to these influences. The more complex the situation, the better the model performs relative to the experts.
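
Assigning weights is exactly what a regression does. The sketch below fits an ordinary least-squares model with three predictors in plain Python. It is not the wine model: the data are simulated, the "true" weights in `TRUE_WEIGHTS` are invented for this illustration, and the helper names are my own. The point is only that the fitted weights come out close to the ones used to generate the data, something intuition alone rarely manages.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]      # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in reversed(range(n)):                      # back substitution
        known = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - known) / m[row][row]
    return x


def fit_least_squares(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    design = [[1.0] + list(r) for r in rows]            # leading 1.0 = intercept
    cols = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(cols)]
           for i in range(cols)]
    xty = [sum(d[i] * y for d, y in zip(design, targets)) for i in range(cols)]
    return solve_linear_system(xtx, xty)


# Hypothetical weights, invented for this illustration only:
# intercept first, then one weight per predictor.
TRUE_WEIGHTS = [2.0, 0.5, -1.5, 3.0]

rng = random.Random(0)                                  # fixed seed for repeatability
rows = [[rng.uniform(0, 10) for _ in range(3)] for _ in range(300)]
targets = [
    TRUE_WEIGHTS[0]
    + sum(w * x for w, x in zip(TRUE_WEIGHTS[1:], r))
    + rng.gauss(0, 1.0)                                 # noise the model cannot explain
    for r in rows
]

weights = fit_least_squares(rows, targets)
print([round(w, 2) for w in weights])
```

For real work one would reach for a statistics package rather than hand-rolled elimination. The pure-Python version is here only to keep the example self-contained.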

For all its successes, though, statistical analysis continues to face tremendous skepticism and even animosity. For one thing, Ayres notes, statistics threaten the “informational monopoly” of experts in various fields. But even to many people without a vested interest, relying on cold, hard numbers rather than human instinct seems soulless.

Learning analytics raises other critiques as well. Privacy is a major issue. Do learners have the right to access the data that are gathered about them, or even the right to refuse that data are collected about them at all? For example, students could reasonably be skeptical about data being collected on their off-task behavior. Another issue is the tendency to take into account only those elements that can be measured. This is called the McNamara fallacy, which in its original form says:

The first step is to measure whatever can be easily measured. This is ok as far as it goes. The second step is to disregard that which can't be easily measured or to give it an arbitrary quantitative value. This is artificial and misleading. The third step is to presume that what can't be measured easily really isn't important. This is blindness. The fourth step is to say that what can't be easily measured really doesn't exist. This is suicide.

Some elements in education are difficult to measure, like whether deeper learning takes place, how learners are engaged with the materials or what the longer term effects are of learning interventions.

Predictive models will also do a worse job when the field or subject is evolving quickly. In Cambodia, for example, the characteristics of student populations are changing rapidly: students are becoming wealthier and more technologically literate. Predictive models in this context would need to be updated very frequently.

Other critiques formulated on the Moodle forum, and summarized by George Siemens, seemed more emotionally inspired, and I found them less well grounded. Learning analytics doesn’t mean that all complexity is reduced to numbers, nor that ambiguity is no longer accepted. Ayres points out that models not only predict, but also indicate how precise their predictions are. If a phenomenon is very difficult to predict, a weak correlation will say so. This counters another critique, that models don’t take the uniqueness of humans into account. Models and correlations can of course be misinterpreted, and uncertainty in the data can be ignored, but this is hardly a problem specific to learning analytics. It is a misconception that expressing something in numbers automatically conceals uncertainty.
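
A small sketch shows how a model reports its own precision. It computes the Pearson correlation between simulated "study hours" and two simulated outcomes that follow the same trend but with very different amounts of unexplained variation. All data, names and numbers are assumptions made up for this example.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(1)                                  # fixed seed for repeatability
hours = [rng.uniform(0, 10) for _ in range(500)]        # hypothetical study hours

# Same underlying trend, different amounts of unexplained variation.
predictable = [5 * h + rng.gauss(0, 2) for h in hours]
noisy = [5 * h + rng.gauss(0, 60) for h in hours]

r_predictable = pearson_r(hours, predictable)
r_noisy = pearson_r(hours, noisy)
print(f"r^2, little noise: {r_predictable ** 2:.2f}")
print(f"r^2, much noise:   {r_noisy ** 2:.2f}")
```

The squared correlation is the share of variation the linear trend accounts for. When individual differences dominate, that share is small and the model says so, provided someone bothers to report it.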

This is not to say that statistics cannot be misused, intentionally or unintentionally. Viplav Baxi provides an excellent overview on the LAK11 forum:

At the very basic level, there are many arguments for or against statistical analyses and other forms of analytics (such as those generated by "intelligent" systems). The arguments address generalizability (do the analytics imply that we can take general actions and predict outcomes), appropriateness (are the analytics appropriate to generate for the domain under consideration), accuracy (did we have enough information, did we choose the right sample), interpretation (can we rely on automated analytics or do we need manual intervention or both), bias (analytics used to support an underlying set of beliefs), method (were the methods and assumptions correct), predictive power (can the analytics give us sufficient predictive power) and substantiation (are the analytics supported by other empirical evidences).

An interesting point was made by Chris Lott, who fears that learning analytics will become the new buzzword on the block and may spawn another “cottage industry of repetitive pundits”. Indeed, who will interpret all the data gathered during an online course? Teachers, tutors, administrators, external companies or… the learners themselves? Tony Hirst expresses the hope that learning analytics will be used creatively to give learners more control over their learning.

To me, two weeks of reading about learning analytics have offered a tantalizing glimpse of its potential, without making me forget some concerns. Or, to end with a quote from Bill Fitzgerald, “I would hope we could outgrow our pursuit of silver bullets”.

22 January 2011
