What You Want from Tests
When you use tests, what do you want them to measure?
[I originally wrote this around December 2019, when I was teaching courses as a PhD student at NYU, and I’m reposting it here for reference. Since I currently teach at Hampshire College, which has no tests, this is not very relevant to me at the moment.]
I used to be in favor of open-notes tests. But after seeing them in action for a while, I no longer think they’re a very good idea.
It’s true that traditional tests don’t do a good job accomplishing what they are designed for. It’s good to see people exploring different ideas about what tests can be. But an open-notes approach doesn’t fit very well with the strengths of test taking.
Tests have some natural strengths and some obvious weaknesses, and if we understand these strengths and weaknesses, we can design tests with more precision. Settling for the open-notes approach keeps tests from becoming all they can be.
The traditional argument in favor of open-notes tests is that having access to your notes is more true to life. In the real world, the argument goes, you aren’t often locked in a room without your books and forced to answer questions under a time limit. You have access to whatever resources you need, and can look things up as you go.
Einstein was famously unable to remember the speed of sound when given the Edison Test. Why memorize such facts, he remarked, when one could easily look them up in a textbook?
This perspective is entirely correct. Skill involves the use of more than just what one carries around in one’s head. An expert makes use of many tools and will refer to a variety of sources when solving a problem. In many ways, skill in a domain is skill at using the reference works of that domain. Hence the old joke that programming be renamed “Googling StackOverflow.”
Take this view too far, however, and you end up with absurdity. It’s clear that experts don’t carry everything around in their head. But it’s also not true that they carry nothing around in their head.
A physicist may not be able to tell you the speed of sound without looking it up. But every physicist will be able to tell you who Maxwell and Newton were, and a little bit about their contributions. If someone doesn’t know what F = ma means, they’re probably not a physicist.
A programmer won’t be able to recall from memory the exact workings of every function they’ve ever used. But every programmer will be able to tell you the syntax for writing a for loop in their favorite language. If someone can’t tell you the syntax of an if statement, they’re probably not a programmer.
An expert is someone who has both kinds of knowledge. Some things they will know by heart, and some things they will be able to accomplish only given time and resources. You need both to have mastery of a skill. We might call these two forms of knowledge “what you carry around in your head” and “what you can accomplish.”
We don’t expect students to leave a class as an expert in their field, but we do expect them to have mastery of the material.
What does mastery mean? I think that mastery involves both of these skills.
Someone who can accomplish a task but doesn’t carry any of that knowledge around with them is following a guide, or a set of instructions, without any understanding. Someone who can tell you important facts about a field but can’t accomplish anything is a fan, not an expert.
Students shouldn’t be expected to memorize everything. We should understand that they will do their best work when they can use their notes, look things up, and take time to consider multiple angles on a problem. But we should expect them to carry certain very important facts around in their head wherever they go.
I don’t care if a student leaves my statistics class without memorizing the equation for a t-test. They can look that up. But if they can’t read a scatterplot, that’s a problem.
To evaluate a student’s mastery of a class, we want to measure both of these kinds of knowledge. We should give them the chance to demonstrate real skill in the field, but we should also require them to show that they have internalized some of the most important facts and concepts.
Luckily we already have good ways of doing both.
Tests isolate the student from their resources and have the potential to measure the information that the student actually carries around in their head.
Class projects and papers allow students to use whatever they want in the solving of an actual (if usually artificial) problem, and have the potential to measure the student’s ability to accomplish practical work in the field.
If tests and projects are designed with this in mind, the class can run smoothly. If they are not, the result is disaster.
What are the important features of a test? Well, they happen in a controlled environment. You can’t choose what you’re working on; the questions have been decided for you. You have a limited amount of time. You’re not allowed to collaborate with other people. And you’re not allowed to look anything up.
Open-notes tests relax this last criterion. Some of them relax it in a small way; often students are given a formula sheet, or are allowed to bring a page or a note card as a cheat sheet. Sometimes these tests are truly open notes, and students are allowed to refer to whatever they like. Sometimes students can even bring their laptops, and make use of the entire internet. 
Trying to evaluate a student’s skill at solving problems without restrictions is good. But tests aren’t a good way to evaluate this kind of knowledge because they unnaturally restrict the student in other ways. The student isn’t given the kind of time they would have if they were solving a real problem. They don’t get any choice of what problem to work on. They can’t collaborate with others, or go to peers to discuss some aspect of the problem that’s troubling them, something that is a huge part of solving problems in the real world. The format of a test hamstrings them.
This is especially tragic because tests are so naturally suited to evaluating the knowledge and skills that a student has internalized. Why not use the tests to see if the things you want them to carry around in their heads have actually ended up there?
When designing a test like this, you should figure out what you want your students to walk away with, and only include questions about those facts and skills. Anything that they would be better off just looking up (dates, exact values, trivia, etc.) shouldn’t go on the test.
A simple way to evaluate this kind of test is to give it to your peers and to other experts, and make sure that they can easily answer all the questions without looking up the answers. If experts in the field can’t casually ace your test, then it isn’t a good test of what experts should be expected to carry around in their heads.
This standard may even be slightly too harsh; you probably don’t need your students to walk out of the class on the same level as an expert. Another way to figure out if a test like this is fair is to pick a student who you know reasonably well and seems to have mastered the subject, and see how they do on your test.
A test made on these principles should be simple and easy, something that an expert would be able to breeze through.
Projects & Papers
Depending on the subject, class projects or papers are the right way to test the other kind of knowledge: what you can accomplish. Rather than shoehorning open notes into a test format that doesn’t suit them, just have students do a project.
Projects are inherently open-notes; who ever heard of limiting the resources that can be brought to bear on a class project?
No course can really be like the real world, but giving students a facsimile is a good idea. Projects provide a better environment for this because they don’t unrealistically hamper the student, as even the most liberal open-notes test will. Students have some level of control over what project they choose, how they approach it, what techniques they use, and who they call on for help.
Is this true for all classes? I don’t think so. Foreign language courses are all about internalization. If you need to look anything up, you haven’t really learned the language. Testing makes a lot of sense in a language course, and I’m not sure if there’s any place for projects, at least not at introductory levels. Once you get to a composition course in a foreign language, projects start making more sense again.
There may be other reasons to have students do projects. In this essay I’m talking about projects being used as a form of evaluation, but projects can also be an important teaching tool. Having students complete a project as an alternative to readings or lecture is a good idea, but a different use case.
There are also probably some subjects where tests make no sense at all. For many hands-on skills, like writing or sculpture, you could conceivably make a test, but the real proof will be in creation.
Testing is a good way to examine internalized knowledge, but there are some kinds of internalized knowledge that aren’t easily measured by a test. Just how to hold your hammer and chisel, just what the dough looks like when it’s ready — these are things that an expert will have internalized, but which would be difficult to put on a test. So there are some kinds of internalized knowledge that are better measured by projects.
It seems like this is especially true for crafts, and for courses beyond the beginner level, as the student begins to pick up these hard-to-measure intuitions.
Generally, the more advanced the course, the less of a role there is for testing. While every subject has a core base of knowledge that all experts will know by heart, specialists will internalize knowledge that sets them apart even from other specialists. People already seem to understand this at some level, and most advanced courses go light on the tests.
Sky Zhang points out that in certain cases, formula sheets can make a lot of sense. A programmer may not remember the syntax for all the basic operations of the language they’re learning, and the professor shouldn’t care. Giving them a sheet that provides that syntax won’t help them if they don’t understand the concepts, but it is forgiving towards students who have deep conceptual understanding but can’t be bothered to remember the exact notation for every operation. We can trust that if they choose to continue, they will eventually know the basics by heart. I think this is another case where professors should think about what they really want students to get out of the course (in this case, the concepts) and what they could care less about (hopefully, the syntax).