Conversation Starter: Teaching Philosophy in an Age of Large Language Models (guest post)




Over the past few years we have seen some startling progress from Large Language Models (LLMs) like GPT-3, and some of those paying attention to these developments, such as philosopher John Symons (University of Kansas), believe that they pose an imminent threat to teaching and learning (for those who missed its inclusion in the Heap of Links earlier this summer, you can read Professor Symons’ thoughts on this here). In the following guest post, Benjamin Mitchell-Yellin (Sam Houston State University) responds to Professor Symons, offering his own view of the dangers LLMs pose as well as strategies teachers could employ to minimize them.

[“An ancient Egyptian painting depicting an argument over whose turn it is to take out the trash” made with DALL-E 2 by LapineDeLaTerre]

Conversation-Starter: Teaching Philosophy in an Age of Large Language Models
by Benjamin Mitchell-Yellin

It’s once again the time of year when those of us who teach philosophy are thinking about how to structure and deliver our courses. If you’re anything like me, your courses are writing-intensive; your course objectives include helping students improve their ability to think critically, analyze and construct arguments, and express all of this in writing; and you’re hoping that the tweaks and revisions you’re making to your syllabus will improve your own ability to make good on the promise of what a philosophy course has to offer.

I was in the thick of planning my fall courses when I got a bit of a shock. In “Conversation-Stopper,” John Symons (Kansas) argues that large language models (LLMs), like the relatively new and much-discussed GPT-3, “will change the traditional relationship between writing and thinking.” As the subtitle puts it, they’ll make us “less intelligent.” What’s more, Symons claims, “The LLM marks the end for standard writing-intensive college courses.” He expects us to see the effects clearly as early as this October. (As if we needed another reason to dread midterms!)

My heart skipped a beat. Hasn’t Silicon Valley done enough damage already?!

Then I thought carefully about what Symons was arguing and felt a lot better.

Thinking through why Symons’ pronouncement about the death of writing-intensive courses is premature has been a healthy exercise. For one thing, it has made me consider the structure of my courses and assignments in a way I haven’t done since early in my teaching career. For another, it has helped me to see that the worries he raises, and potential solutions to them, apply to a host of issues. My main contention will be that, despite appearances, it’s not really about LLMs, after all.

But let’s begin with a look at Symons’ argument. The heart of his case appears to turn on the novelty of what we’re now facing:

Students have long been tempted by services that write essays for them and plagiarism is a constant and annoying feature of undergraduate teaching, but this is different. The LLM marks the end for standard writing-intensive college courses. The use of an LLM has the potential to disconnect students from the traditional process of writing and research in ways that will inevitably reshape their thinking. At the very least, these tools will require us to reconsider the mechanics of writing-intensive courses. How should we proceed? Should we concentrate on handwritten in-class assignments? Should we design more sophisticated writing projects? Multiple drafts? 

(I’ll have something to say about why it’s unfortunate the concluding questions are merely rhetorical. But first, I want to explain what I think it is that really drives Symons’ argument. Teaser: it’s fueled by the collapsing of an important distinction. Read on, my friends.)

To his credit, Symons doesn’t shy away from considering the implications of what he’s arguing. He goes on to consider whether we should embrace the disruption and “realign our attitudes to writing,” but demurs, in part, because of his experiences, like those of many of us, teaching during the pandemic. Technological tools, such as Zoom, became the norm because they allowed for socially distant instruction—to the detriment of both instruction and society, it would seem. Symons concludes that the arrival of LLMs means that “it’s necessary for faculty to change the way we evaluate student written work in our courses and more importantly, to rethink the role of writing in education. … In the age of the LLM we will not be able to rely on written exercises to make the work of thinking happen. We will also find that writing skills which previously served as reliable signs of the virtues we associate with thinking can no longer do so.”

Let the disruption reign!

I don’t mean to be flip. I think Symons has called our attention to something important. It’s just not what he thinks it is.

I want to make it clear that I agree with much of what Symons says. For example, I agree that “writing can be an aid to thinking.” I also agree with him that it’s likely some students will pass off AI-authored essays as their own work (some likely already have). And I identify with his comment that “the advent of LLMs puts me right back in the position of being a novice teacher.” What I disagree with is Symons’ pessimism about the prospects of writing-intensive courses, and this for reasons having to do with that distinction I promised you.

There’s a difference between teaching someone to think and write and assessing what they’ve written as a measure of the quality of their thinking. It’s not always obvious how the two come apart, and they’re not entirely alien to each other. For example, effective teaching often involves assessment. But some quick examples should be enough to show that simply grading completed essays is not a reliable measure of student learning.

Suppose you have two students, Jack and Jill, and they each turn in essays that receive the same grade. Assume, as well, that there’s no cheating or bias or anything like that. They write the essays to the best of their ability; you grade them fairly. Even though they’ve turned in work of the same quality, you shouldn’t take this to demonstrate you’ve taught them anything, let alone the same things.

Consider a first scenario in which both students turn in A papers. But Jack came into your class with little to no experience writing this sort of paper, nor did he have much in the way of background knowledge about philosophy or arguments. He earned his A by paying attention and putting in a lot of work for your course. Jill, by contrast, had taken lots of philosophy courses before, received lots of writing instruction outside of your course, and basically phoned it in on this assignment. It seems safe to say that you taught Jack a lot and Jill nothing.

In a second scenario, both Jack and Jill earn Ds on their papers. Again, Jack came into your class with little to no relevant background; Jill came in with lots of prior coursework in philosophy. It’s entirely possible in this scenario that, once again, you taught Jack a great deal and Jill nada.

What these scenarios demonstrate is that when assessment is used to take a snapshot of a student’s skills and knowledge, it doesn’t tell you whether there has been growth. To do that, you need more than time-slice data.

And that brings us to October or, more precisely, midterms. It’s fairly typical in a writing-intensive course to assign more than one paper. The feedback (hopefully more than just a letter grade) on the midterm is formative in nature. Students can use it to improve on the final. And if we find ourselves in a third scenario, where Jack goes from a midterm C to a final A and Jill goes from a midterm A- to a final A, perhaps we can be reasonably confident that you’ve taught Jack something and Jill next to nothing. Sometimes assessment can help us to identify learning.

I’ll bet you’ve already sniffed out the ways that LLMs (or paper mills or cutting and pasting from the internet or whatever) could disrupt this. In a fourth scenario, in which one of our initial assumptions doesn’t hold, Jack earns a C on his midterm and then turns to an LLM to “write” his final essay, which receives a B+. An improved assessment outcome, in this context, isn’t a reliable sign of learning.

So far, we’ve seen that when writing is used as a mere means of assessment, it doesn’t help us much to track whether our students have learned anything in our courses. For this same reason, LLMs really do seem to pose a threat to writing-intensive instruction as it is sometimes practiced. This new form of cheating exploits a lack of familiarity with the thinking that a student’s writing is supposed to evince.

But LLMs shouldn’t concern those who use writing (also) as a tool for teaching the sorts of skills that are typical of philosophy course objectives, such as argument articulation and analysis. There are a host of familiar teaching strategies we can employ in our classrooms to help our students learn these skills and help ourselves become familiar with their progress in doing so.

Let’s revisit Symons’ rhetorical questions. Should you, as a conscientious teacher of a writing-intensive course, require Jack and his classmates to do all of their writing by hand in class? This would eliminate the threat of using an LLM to cheat. Should you require Jack and his classmates to write multiple drafts? As many of us who already do so are aware, it’s a red flag whenever a student turns in a paper that is radically different from any other written work you’ve seen from them. But while multiple drafts and in-class exercises may be ways to prevent students from gaming the system, many readers are likely thinking that they’re untenable solutions. Who has class time to devote to allowing students to pen entire essays under your watchful eye (or wants to “go full surveillance” on them)? Who has time to give constructive feedback on mandatory rough drafts?

The good news is that in-class writing and multiple drafts don’t need to be time-sucks. You can assign students a short in-class writing prompt, have them put their names at the tops of their papers, collect them at the end of class, and simply use them as a means of taking attendance. Of course, you can also read what they wrote, or what some of them wrote, to find out if they were grasping the material or constructing the argument well or whatever. But none of this has to take much time, and these sorts of activities can do double-duty—attendance tracker and learning prompt rolled into one.

Who says you need to be the one to read and provide feedback on that rough draft? Have students provide feedback to each other using a rubric you’ve given them (perhaps the same one you’ll use to grade the final version). This can be done during or between class meetings, and it can be done in-person or online. Again, the benefits are multiple: there are opportunities to learn both in the giving and in the receiving of feedback. They can turn in their rough draft and peer comments along with the final version of their essay.

Sure, these may be among the best practices when it comes to effective writing instruction. But how are you going to make sure they don’t have an AI “write” their rough draft for them?

Try using templates to scaffold out your writing assignments. Instead of having students workshop rough drafts, have them workshop an initial outline formed from filling in sentence stubs on a template you’ve provided. You can even integrate templates more thoroughly into your course. Provide them with templates to use as guided notes to structure their understanding of the readings or, better, have them construct templates like this for each other. Filling out sentence stubs isn’t the same as writing an essay, but it can be one building block with which to erect a polished paper. And the beauty of templates is that they have a built-in structure. For a student like Jack, it can be enormously helpful to have repeated practice writing within the confines of a template. The essay then becomes a means of expanding on what he’s learned, through doing, about how to structure an argument, as opposed to an opaque request to produce a piece of writing modelled on what he’s been reading all term.

Now, I don’t pretend that teaching techniques like scaffolding, templates, and peer feedback are surefire ways to eliminate the threat LLMs appear to pose. For one thing, they won’t eliminate the possibility of someone submitting an AI-generated manuscript to a journal for review. Perhaps this is (part of) what the creators of GPT-3 had in mind when they raised the prospect of its being used for “fraudulent academic essay writing.” But I do think the wide array of available techniques for teaching writing should make all of us philosophy professors sleep a bit better at night. Perhaps we’ll need all of that rest, since we’ll have realized we can’t simply assign essays and grade them if we want to make sure our students are actually learning from us.

Symons notes that teaching occurs in “embodied and meaningful social contexts.” I’d add that effective teaching occurs in such contexts over time and involves a certain type of relationship between student and teacher. And I’d like to thank Symons for starting a conversation that helped me to think through our craft like a novice.

Originally appeared on Daily Nous