One of the most persistent questions in computing education is the reason for the 20% Rule. In every introduction to programming course, 20% of the students just get it effortlessly -- you could lock them in a dimly lit closet with a reference manual, and they'd still figure out how to program. Another 20% of the class never seems to get it. Why is that?
I've previously dismissed the notion of a "geek gene" on this blog. It can't be that it's nature -- there must be some experience or reflections that those who "get it" have had, and those who don't have not had. A parallel question to why there's a difference is why programming is so hard in the first place. What makes programming different from our everyday life experiences? (Presuming that 100% of every class has made it at least into their teens, they're pretty successful at everyday life.)
My advisor, Elliot Soloway, addressed that question with several of his students. Jeffrey Bonar claimed that it was the interaction between natural language and programming languages -- there were assumptions from natural language that are false in programming languages. Jim Spohrer came up with the pithy phrase that still gets mentioned at computing education research conferences -- "It's composition, not decomposition." People can figure out the pieces of the solution that they need from the programming language (decomposition). It's assembling those pieces into a working program (composition) that's so hard.
Recently, a group of researchers (including Beth Simon and Gary Lewandowski) have been conducting "commonsense computing" experiments. Take someone who knows nothing about programming, and have them explain how to do some algorithmic task, like sorting. The question they're asking is, "Can people come up with algorithms without knowing about programming? Is the challenge of programming the definition of the algorithm?" The paper that they presented at ACM ICER 2007 was their best yet. They asked people to solve a concurrency problem -- two ticket booths want to sell out a theater without ever selling the same seat twice. The vast majority of people came up with perfectly workable solutions. The solutions weren't necessarily the most efficient -- that proves that computer science really has something to teach people about algorithms. The paper did convince me that people can invent workable algorithms.
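To make the ticket-booth task concrete, here is a minimal Python sketch of one workable solution. It is an illustration only, not something taken from the ICER paper: the names `Theater`, `sell_next`, and `run_booth` are hypothetical, and the single lock is an assumption about how one might encode the commonsense answer "only one booth touches the seating chart at a time."

```python
import threading


class Theater:
    """Hypothetical shared seating chart used by both ticket booths."""

    def __init__(self, seat_count):
        self._available = list(range(1, seat_count + 1))
        self._lock = threading.Lock()

    def sell_next(self):
        """Hand out one unsold seat, or None once the theater is sold out."""
        # Only one booth at a time may look at and change the chart.
        with self._lock:
            if not self._available:
                return None
            return self._available.pop()


def run_booth(theater, sold):
    """Keep selling seats until none are left, recording each sale."""
    while True:
        seat = theater.sell_next()
        if seat is None:
            break
        sold.append(seat)


theater = Theater(100)
booth_a, booth_b = [], []
threads = [
    threading.Thread(target=run_booth, args=(theater, sold))
    for sold in (booth_a, booth_b)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

all_sold = booth_a + booth_b
```

After both booths finish, `all_sold` should hold every seat exactly once. The idea is easy to state in words; the code still has to spell out where the "one at a time" rule begins and ends.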
So, it's not the algorithm. Maybe it's the language -- maybe Bonar was right. John Pane asked that question in his dissertation work at CMU with Brad Myers. He showed people parts of a videogame, then asked them to tell him what they thought that someone would tell a computer to do to make that part of the videogame. Here comes the cool part: John then built a programming system where people could specify the videogame the way that his subjects described it to him! John could then be sure that his programming system matched the language that people wanted to use. The end result was disappointing. While John learned a lot about what language features people wanted, his subjects still couldn't program much better than students in a traditional CS1 course. I think Pane answered Bonar -- the language may play a role, but it's not the main thing.
Where does that leave us? Here are two hypotheses still outstanding about what makes programming so hard.
The first is that it's the puzzle nature of programming. We've all seen puzzles like "Here are 9 dots -- cross all of them with four lines" or "Here's a map with one-ways on it -- get from point A to point B using only two left turns and three right turns." We all know how to make lines and make turns. The puzzle is matching exactly those pieces in just the right way to make it all work. This is essentially Spohrer's claim -- we know the pieces, it's putting them together that's hard.
At our CS Ed Research seminar last week, Brian Dorn suggested another one. It's the specificity of a program, the need for exactness when our natural world allows for ambiguity. Natural language did not evolve to specify video games or algorithms. Natural language evolved to allow interaction between thinking beings. Programming languages are about specifying a process to a machine. Dorn's hypothesis would say that the problem for the Commonsense Computing researchers and John Pane is that what we describe in natural language is necessarily ambiguous, and the process of getting it exactly right for the machine is the hard part. The example that Mike Hewner in our group provided was that he bets that the Commonsense Computing subjects did not get right the number of iterations necessary in a sort to make sure that the list is truly sorted -- no more, no less.
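Hewner's point about iteration counts can be sketched in a few lines of Python. This is a hypothetical illustration, not data from the Commonsense Computing studies: I am assuming a bubble-sort-style procedure (repeated passes that swap adjacent out-of-order items), and `bubble_sort_passes` is a made-up helper that runs exactly as many passes as it is told to.

```python
def bubble_sort_passes(items, passes):
    """Run a fixed number of adjacent-swap passes and return a new list."""
    result = list(items)
    for _ in range(passes):
        for i in range(len(result) - 1):
            if result[i] > result[i + 1]:
                # Swap neighbors that are out of order.
                result[i], result[i + 1] = result[i + 1], result[i]
    return result


data = [5, 4, 3, 2, 1]  # worst case: fully reversed

# One pass too few leaves the list almost, but not quite, sorted.
too_few = bubble_sort_passes(data, len(data) - 2)

# For n items, n - 1 passes are enough even in the worst case.
just_enough = bubble_sort_passes(data, len(data) - 1)
```

With three passes, `too_few` comes out as `[2, 1, 3, 4, 5]`; with four, `just_enough` is `[1, 2, 3, 4, 5]`. "Keep swapping until it's sorted" is a perfectly good instruction to a person, but the machine needs the count, or an explicit stopping test.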
This is one of the biggest challenges in computing education research, with big impacts. If we knew why programming was so hard, maybe we'd also have some insight into why some programmers are two or more orders of magnitude better than others (a claim made in Fred Brooks' The Mythical Man-Month and supported by others since). Solving both of these problems would be a huge advance for our research community and would have direct impact on education and practice in computing.
Bio
I started teaching computing in February 1980. I was 17 in my senior year of high school, and I taught "Bits, Bytes, and Basic" in a community education class. I taught through my undergrad years--community education, afterschool classes, GED classes, and even community college in 1984. I read "Personal Dynamic Media" by Adele Goldberg and Alan Kay while on an internship at Bell Labs in 1982. I'd never before thought about computing FOR learning (as opposed to learning ABOUT computing). Adele and Alan's thoughts and words set me on the road to my PhD in Education and Computer Science at the University of Michigan in 1993. Nowadays, I focus on using lessons from learning sciences and educational technology for teaching about computing.