Mark Guzdial's Amazon Blog

 
What makes programming so hard?

5:53 AM PDT, October 18, 2007, updated at 8:08 AM PDT, October 18, 2007
One of the most persistent questions in computing education is the reason for the 20% Rule. In every introduction-to-programming course, 20% of the students just get it effortlessly -- you could lock them in a dimly lit closet with a reference manual, and they'd still figure out how to program. Another 20% of the class never seems to get it. Why is that?

In this blog previously, I've dismissed the notion of a "geek gene."  It can't be that it's nature -- there must be some experience or reflections that those who "get it" have had, and those that don't have not had.  A parallel question to why there's a difference is why is it so hard in the first place.  What makes programming different from our everyday life experiences?  (Presuming that 100% of every class has made it at least into their teens, they're pretty successful at everyday life.)

My advisor, Elliot Soloway, addressed that question with several of his students.  Jeffrey Bonar claimed that it was the interaction between natural language and programming languages -- there were assumptions from natural language that are false in programming languages.  Jim Spohrer came up with the pithy phrase that still gets mentioned at computing education research conferences -- "It's composition, not decomposition."  People can figure out the pieces of the solution that they need from the programming language (decomposition).  It's assembling those pieces into a working program (composition) that's so hard.

Recently, a group of researchers (including Beth Simon and Gary Lewandowski) have been conducting "commonsense computing" experiments.  Take someone who knows nothing about programming, and have them explain how to do some algorithmic task, like sorting.  The question they're asking is, "Can people come up with algorithms without knowing about programming?  Is the challenge of programming the definition of the algorithm?"  The paper that they presented at ACM ICER 2007 was their best yet.  They asked people to solve a concurrency problem -- two ticket booths want to sell out a theater without ever selling the same seat twice.  The vast majority of people came up with perfectly workable solutions.  The solutions weren't necessarily the most efficient -- that proves that computer science really has something to teach people about algorithms.  The paper did convince me that people can invent workable algorithms.
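To make the ticket booth problem concrete, here is a minimal sketch in Python of one workable approach. It is an illustration only, not a solution taken from the paper or from any participant: the function name sell_out, the booth count, and the use of a single lock around a shared list of seats are all assumptions made for this example.

```python
import threading


def sell_out(num_seats, num_booths=2):
    """Sell every seat from several booths without selling any seat twice."""
    unsold = list(range(num_seats))          # shared pool of unsold seat numbers
    lock = threading.Lock()                  # only one booth may touch the pool at a time
    sales = [[] for _ in range(num_booths)]  # seats sold by each booth

    def booth(booth_id):
        while True:
            with lock:
                # Checking for a seat and taking it happen as one step,
                # so two booths can never grab the same seat.
                if not unsold:
                    return
                seat = unsold.pop()
            sales[booth_id].append(seat)

    threads = [threading.Thread(target=booth, args=(i,)) for i in range(num_booths)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sales
```

Calling sell_out(100) returns one list of seat numbers per booth. The lock makes the booths take turns, which is workable but not necessarily the most efficient arrangement.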

So, it's not the algorithm. Maybe it's the language -- maybe Bonar was right. John Pane asked that question in his dissertation work at CMU with Brad Myers. He showed people parts of a videogame, then asked them to tell him what they thought that someone would tell a computer to do to make that part of the videogame. Here comes the cool part: John then built a programming system where people could specify the videogame the way that his subjects described it to him! John could then be sure that his programming system matched the language that people wanted to use. The end result was disappointing. While John learned a lot about what language features people wanted, his subjects still couldn't program much better than students in a traditional CS1. I think Pane answered Bonar -- the language may play a role, but it's not the main thing.

Where does that leave us?  Here are two hypotheses still outstanding about what makes programming so hard.

The first is that it's the puzzle nature of programming.  We've all seen puzzles like "Here are 9 dots -- cross all of them with four lines" or "Here's a map with one-ways on it -- get from point A to point B using only two left turns and three right turns."  We all know how to make lines and make turns.  It's the puzzle of matching exactly those pieces in just the right way to make it all work.  This is essentially Spohrer's claim -- we know the pieces, it's about putting them together that's hard.

At our CS Ed Research seminar last week, Brian Dorn suggested another one. It's the specificity of a program, the need for exactness when our natural world allows for ambiguity. Natural language did not evolve to specify video games or algorithms.  Natural language evolved to allow interaction between thinking beings.  Programming languages are about specifying a process to a machine.  Dorn's hypothesis would say that the problem of Commonsense Computing Researchers and John Pane is that what we describe in natural language is necessarily ambiguous, and the process of getting it exactly right for the machine is the hard part.  The example that Mike Hewner in our group provided was that he bets that the Commonsense Computing subjects did not get right the number of iterations necessary in a sort to make sure that the list is truly sorted -- no more, no less.
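Hewner's point about iteration counts can be illustrated with a short Python sketch. Bubble sort is used here only because its pass count is easy to state; nothing in the source says which sorting procedure the subjects described, and the function name bubble_sort is an assumption for this example.

```python
def bubble_sort(items):
    """Return a sorted copy of items using exactly len(items) - 1 passes."""
    result = list(items)
    n = len(result)
    for pass_number in range(n - 1):  # n - 1 passes are always enough
        # After each pass the largest remaining item has reached its final
        # place, so the next pass can stop one position earlier.
        for i in range(n - 1 - pass_number):
            if result[i] > result[i + 1]:
                result[i], result[i + 1] = result[i + 1], result[i]
    return result
```

A natural-language description such as "keep swapping neighbors that are out of order" never has to say how many passes guarantee a sorted list. The program does: for a list in reverse order, one pass fewer than n - 1 leaves it unsorted.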

This is one of the biggest challenges in computing education research, with big impacts. If we knew why programming was so hard, maybe we'd also have some insight into why some programmers are two or more orders of magnitude better than others (a claim made in Fred Brooks' The Mythical Man-Month and supported by others since). Solving both of these problems would be a huge advance for our research community and would have direct impact on education and practice in computing.

Initial post: Oct 18, 2007 8:21 AM PDT
In my experience, the most challenging thing about programming is the need to switch my attention up and down the abstraction stack. Make sure the user interaction makes sense; make sure the OO design contracts are fulfilled; make sure the bytes flow smoothly at a low level; all at the same time.

Programmers who can't do this well have real trouble. I have trouble when I can't do this, either because of lack of knowledge or fatigue. And much of the art of software architecture is minimizing the number of levels that need to be tracked; that's one reason (for instance) programming languages without GC are on the decline.

I guess this counts as a vote for Spohrer. It's composition.

Posted on Oct 18, 2007 9:02 AM PDT
For those that want to see the fat lady sing: http://www.cc.gatech.edu/conferences/icer2007/slides/lewandowski-talk.pdf

Posted on Oct 18, 2007 9:22 AM PDT
Last edited by the author on Oct 18, 2007 9:25 AM PDT
 Jonathan Cole says:
First let me say great article. I've often reflected on this issue.

My experience is that clients have genuine difficulty thinking through all the ramifications of the business practice they want supported in software. Often, clients assume complex rules are required for a complex system. It's necessary to get analysts in early to avoid inconsistencies that are hard or impossible to encode in software.

Also, note that every effort to put the power in the hands of the end user has failed, or been less than successful. SQL and 4GL have not relieved the necessity of a programmer. I can only surmise that it's because it's the puzzle and the algorithm, not the language. People just don't develop the skill of logical thought. And that's ok, because there's lots of areas to be skilled in, and I get a profession that I love.

Posted on Oct 18, 2007 9:24 AM PDT
Last edited by the author on Oct 18, 2007 9:31 AM PDT
 Michiel trimpe says:
The answer is visualization, of course. How easily can you visualize the structure of a program? How precisely can you visualize a program? How many interactions can you visualize? etc. etc.

If you can only hold a bit of the code (a few classes) in your head at a time, you're not that good of a programmer. If you can only hold objects and cannot visualize functions, then you're never going to be a functional programmer.

That's also why people could do the ticket booths problem: they could visualize it!

If you can visualize functions and objects, if you can easily explore leaky abstractions and visualize a large part of the project in your brain simultaneously, you are in the top 20%. If you can visualize the whole damn system in any possible way from top to bottom then you're going to be one of the awesome programmers, because:
1) programming becomes typing out the program you have in your mind.
2) visualizing the whole system lets you see whether the 'big picture' is beautiful ... or ugly.

Posted on Oct 18, 2007 9:34 AM PDT
 Brock M. Cusick says:
I think there's a mental switch that has to be made between "telling the computer what to do" and "doing something." When we're in the mental frame of telling the computer what to do we're tempted to use the same shortcuts as when we speak to others ("Drive to Costco"). However, when I tell you to "Drive to Costco" you understand that the first step is "Reach into pocket, remove keys." etc. etc. It is about specificity, but I think the problem CS1 students have is that most of them think of the computer as a (dumb) person receiving written instructions, while the 20% who "get it" straight off realize it's just a waldo through which they manipulate code directly.

Posted on Oct 18, 2007 10:31 AM PDT
 Brian Lalonde says:
[Customers don't think this post adds to the discussion. Show all unhelpful posts.]

In reply to an earlier post on Oct 18, 2007 11:20 AM PDT
Last edited by the author on Oct 18, 2007 11:24 AM PDT
 Aaron Denney says:
Michiel Trimpe: You're right that people need to be able to keep lots of things in their head at one time, and manipulate them effectively. But you're suggesting that "visualization" is the only way to do that, which is not at all true.

Posted on Oct 18, 2007 11:36 AM PDT
Thanks for a good and thoughtful post.

I believe that Brian Dorn is on the right track -- and if, as the commonsense computing experiments suggest, algorithms are not intrinsically the problem, it seems likely that the ability to organize a large number of precise details probably is. Some of this probably goes back to our earliest mental models, as illustrated by Seymour Papert's work with children and computers -- but I also think there is some natural variation both in capacity and patience for the intricate reasoning programming requires. And if it really does respond to natural variation, the bell-shaped distribution suggested by the 20% rule might not be so surprising.

That said, I agree with you that the problem isn't the presence or absence of a "geek gene." However, it seems to me (at least anecdotally) that the best students in my courses are those with good short-term memory -- able to keep more balls in the air, and for longer, than their colleagues. With practise and well-chosen mnemonics, most students seem to be able to keep up well enough; but there are always some who just can't do it, and get extremely frustrated trying. Can this be overcome by discipline alone? I'm not so sure. Obviously, we should keep trying, but that bottom 20% is a pretty tough nut to crack.

Of course, we've always known details are a problem -- hence all the linguistic and functional abstractions we put between them and ourselves. But it's a wicked problem, because our abstractions, too, are details; fewer in number, but larger in size. In the words of Piet Hein: "The road to wisdom? Well, it's plain, and simple to express: Err and err and err again -- but less and less and less."

Simple it is, perhaps -- but not easy!

Posted on Oct 18, 2007 12:21 PM PDT
"Programming is hard. Taking care of a baby is hard."

The reality is that both are difficult things to do. But nature makes it so that more people enjoy trying the latter than the former.

-- JeanHuguesRobert

Posted on Oct 18, 2007 1:28 PM PDT
 DLK says:
The ability to visualize and manage abstractions is key, as mentioned by Jeffrey S. Moore and Michiel Trimpe. A third thing that isn't mentioned but is just as important is the ability to plan out how the various parts are going to fit together and work together. It's analogous to the ability to plan your route optimally when doing your Saturday morning errands.

I think these programming skills are similar to the skills used by a good novelist when writing a novel; unless the proper plot elements are laid out in the proper sequence and the characters are sufficiently developed, the story falls flat. And if the QA, the editor, is sub-par then the story will fail in the marketplace.

Posted on Oct 18, 2007 2:14 PM PDT
Last edited by the author on Oct 18, 2007 2:18 PM PDT
 Thomas L. Biggs says:
This article changed my thinking forever:

http://www.reciprocality.org/Reciprocality/r0/Day1.html

I'd always noticed that many people seemed to reason about things differently than I did, and that I didn't do well in a school environment. I've also known a minority of people that just seemed to "get it" on many subjects - the more contemplative and insightful people I know - Mappers.

In reply to an earlier post on Oct 18, 2007 3:25 PM PDT
 K. D. Wampler says:
What I find interesting about your novelist analogy is that I have known so many musicians over the years who were good programmers. Musicians have to master theory, yet they express it with a high sense of esthetics. Good programming is very similar, IMHO.

On the point of being able to keep a lot of details in your head, Jazz musicians do this when they improvise. They weave together lots of little "idioms" and then they eventually pull it all together at the end.

Posted on Oct 19, 2007 6:11 AM PDT
 Travis Corcoran says:
re: "It's the specificity of a program, the need for exactness when our natural world allows for ambiguity. Natural language did not evolve to specify video games or algorithms. "

Agreed...but I think the word "evolve" is key here.

Natural language didn't evolve for that purpose, but neither did human brains.

This sort of subsumes the natural language argument.

Imagine a task that humans did evolve for - say, finding food.

No human says "I'm going to walk 200 yards that way, listen for a deer, and if I find one, stalk it, but if not, then I'll move clockwise around my origin point in 30 foot intervals, pausing each time for 60 seconds..."

A human may have a vague plan to head THAT way for 200 yards, and end up going 600 yards...and then finding a recently killed moose, so he goes back, gets some other people, returns, chases the predators away from the corpse, and digs in. Or maybe he finds some berries. Etc.

The point being: humans didn't evolve to (a) think in crisp terms; (b) preplan all their actions.

Actually, I started off with 'a' as my primary point, but I'm coming to like 'b' better - humans carry their brains around with them, and can adapt to the situation on the ground. Writing code means trying to imagine all eventualities ahead of time, and precomputing solutions (or, at least, approaches) to all of them.

This is fundamentally not something that we evolved for.

The fact that we can do it at all is fascinating.

TJIC_amazon@heavyink.com

Posted on Oct 19, 2007 4:24 PM PDT
Makes make people good at math, physics and chemistry while they suck at learning new languages and struggle with learning many blocks of seemingly unrelated things (that was biology class for me, but my teacher promised me that the logical connections exist and that I could find that out by myself if I were willing to study 10 semesters of biology after high school.. eh.. no). Other people are good at art and abstract, ambiguous stuff, but suck at math or physics. Anybody can manage to pass any class with an A if you spend the time memorizing everything so that you can cite everything you learned in an instant. That will not make you "get it" or good at it outside of the pre-set and limited environment of school classes and the tests that go along with them. It is extremely rare that somebody is a naturally talented artist and a good mathematician at the same time.

Math has to do with connections from point a to point z without logical gaps. You don't necessarily have to know b or c if you know everything about what causes it. You can break even the most complicated math and physics down to a very small set of fundamental things that you have to know, with no way around it. From those few things you can extrapolate the more complex things, one step at a time. In physics, for example, you learn a number of formulas, and the application and use of them is also checked in tests. If you didn't remember the specific formulas but had enough time to derive them from simpler and more basic formulas, you could answer the test question properly. You are almost never given the time for that, which makes you lose points for those questions. You don't get a poorer grade for being unable to solve the problem, but for not having memorized the details of a more complex thing so that you can apply it in a very short period of time. That does not mean that you did not understand it.

Programming is exactly that. You have a very limited set of basic parameters, which you have to know. After that, it is only one logical connection after another to get to the more complex and difficult end result. Using bad examples for the basic parameters only makes stuff a bit harder, but does not change the general principles of how they are applied. The person with the proper understanding will take longer and get some headaches from using ill-chosen terms for the basic parameters, but that's about it.

I'll give you an example of crappy parameters. MS once got the splendid idea to provide a translated version of their Visual Basic for Applications script language with non-English versions of their MS Office products. The German version suddenly had commands, methods, properties, functions etc. translated from English to German, which the developer had to use instead of the original English terms. Now German grammar is a bit different from the simpler English grammar, so thinking through the logical connections using the German terms sounded in your head like somebody talking in very, very bad German. This is not the case (as much) in English. For a = 1 to 10 step 1 ... next makes somewhat decent English phrases and partial sentences. Fuer a = 1 bis 10 schritt 1 .... Naechste sounds bad word for word.

Why some people are able to break larger and more complicated things up into smaller pieces that are connected to each other, while others are able to keep blocks of unrelated stuff together and make it a piece that triggers emotions rather than thoughts, without the ability to break those things down into small basic pieces, I don't know. The latter example is art. Let's take computer art as an example. A programmer's brain tries to put pixels together so that they make sense, and the outcome of that is usually not that pretty. An artist, on the other hand, cannot explain why he set a pixel there and not somewhere else and why he picked that color. The choice was made without logic behind it, but with an overall "knowing" that the pixel has to be that way to make the whole piece look great.

I believe that the tendency to be more the math guy versus the artist is rooted in genetics, but I also believe that most people have both, only with one more dominant than the other. With the proper training during early childhood you can make a good artist out of somebody who leans more towards the logical end of it. But I also think that you will never become great at it unless you have the right tendency for the thing in your genes. In the rare cases where somebody is gifted with both, it could be that neither of the two tendencies is really dominant and both can be nurtured.

That's my take on this. I hope that makes sense.
Cheers!
Carsten

In reply to an earlier post on Oct 20, 2007 11:14 AM PDT
 Mark Guzdial says:
Holy Cow! This teaches me to post to my blog while on a trip!

Thank you all for the interest and the comments! I'm not going to respond to these now (while sitting in an airport in Portland). I will try to hit some of these ideas in my next few posts. In fact, I decided to write the one on visualization while in the MAX (great mass transit in Portland!) over to the airport, based on what I saw skimming these posts.

Posted on Oct 23, 2007 2:04 PM PDT
 SunByrne says:
While I think a lot of the previous posters have raised interesting points, I think lots of them have mis-identified the problem. The problem isn't "what separates good programmers from great programmers?" Maybe that's visualization or something else, but my impression of Mark's original question has to do with the BOTTOM fifth: "20% of the class never seems to get it. Why is that?" Worrying about visualizing large programs only makes sense if you can master the basics of having something to visualize in the first place. For some people that's really hard.

I don't think most of the offered explanations touch on this; these all sound like explanations by people who "get it" about variation between levels of getting it. Things like "moving up and down the abstraction stack" fall into this category. And I don't think it's working memory capacity or mathematical ability, either--I've known people with both who simply cannot quite fathom programming. Nor do I think it's like novel writing or the ability to generate optimal plans. I've spent hours working through the simplest programming constructs with people who I know are bright by any objective measure, but they just don't "get it" at a very basic level. This is not the difference between the ability to generate crappy programs and good programs, it's the difference between generating crappy programs vs. NO programs at all. People for whom generating something the compiler will even take at all is a challenge.

The real difficulty in teaching these people is that it's nearly impossible for people with the ability to write even passably good programs to diagnose the problem. I've been programming since I was what, 11? (Hurray Commodore Pet with its 4K of RAM!) The basic (err, BASIC) ideas made sense to me then. It's not that I was smarter than the other kids or the adults who don't "get" it; it's that the people who don't "get" it seem to lack some fundamental capacity which those of us who do get it take for granted. (And, for the record, I'm sure there are other domains where I'm one of the ones who doesn't get it.)

The question as I see it is, "What is that capacity?" And I think the answer is that whatever it is, it's not simple, or CS educators like Mark would have figured it out a long time ago. The comments about how giving instructions to a computer is not like giving instructions to a person strike the right chord, but why is it that some of us adapt so readily to giving instructions to computers and some people find it impenetrable? It's not as simple as IQ or memory capacity; that would be obvious by now. Maybe something about being able to simultaneously represent things abstractly but also in very small bits with strong, concrete sequential order--I'm not sure. The problem is that I can't think about programming like a non-programmer.

I wish Mark and the rest of the CS Ed community the best of luck in figuring out whatever it is, and I fervently hope it's something which can be taught. At an earlier point in my life I would have strongly believed that it must be, but I'm not so sure.

In reply to an earlier post on Oct 27, 2007 7:03 AM PDT
 Kevin Douglas says:
Most if not all of these points can be learned with nothing more than experience. It's my opinion that programming isn't the hard part; the hard part is problem solving. Give me someone that's good at that and I'd be happy to teach them a programming language or two.

To me that's the big difference between people that "get it" and those that don't. The ones that get it can see the problem and come up with a few ways to solve it. There's a good chance they have it coded before they sit down to a computer. The ones that don't get it just wait for the solution and then go type it in to a text editor.

Posted on Oct 30, 2007 2:55 PM PDT
 Mark Miller says:
What I'd say is hard about learning to program is understanding how the computing model works, because that's what you're really dealing with--how the computer "thinks". What I remember about my learning experience with BASIC, my first programming language, is it was hard to translate what I wanted to do into something the interpreter would understand. Also, the language had inconsistent semantics.

One example I can think of is that in certain circumstances BASIC allowed you to use variables without declaring them first. If you wanted to say: A$="Hello world", you could do just that. However if you wanted to have an array, you could not say A$(1,11)="Hello world" (the BASIC I used did not have string arrays, only character arrays). You had to DIM it first: DIM A$(11). Also, you could not recast the type of a variable. If you said earlier A$="Hello world", you could not say later DIM A$(11). This really confused me. The idea of declaring variables felt foreign. Also the idea of types was confusing. I could get the idea of variables, but the idea of types was difficult to understand.

I also remember thinking that the computer could understand my intent, based on the data I was giving it. I was trying to get the computer to display something, and then wait for the user to hit the Return key. So I told it what to display and then said: Print "Press Return to continue". That was it. I couldn't understand why the computer didn't get that when I said "Press Return to continue", that it wouldn't key in on that and understand that I meant "I should wait for the user to hit this key". Someone had to explain to me, "You're telling it what to display to the user, but nothing else. You have to use the 'Input' command to tell it to wait." I understood the "Input" command, but only for getting a quantity of some sort from the user, like "Enter a number". I didn't understand the deeper mechanics, that what "Input" really does is: pause the program, wait for input, assign that input to a variable. And what I could do was just use the "pause the program, and wait for input" part of that process to my advantage, with the "assign input to variable part" just being a throw-away.
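A minimal sketch of the Print versus Input distinction described above, written in Python rather than BASIC. The function name pause and its read_line parameter are illustrative assumptions, not part of any BASIC dialect; Python's built-in input() stands in for the Input command.

```python
def pause(prompt="Press Return to continue", read_line=input):
    """Show a prompt, then block until a line of input arrives."""
    # Printing only displays text. On its own, the program would carry
    # straight on to whatever comes next.
    print(prompt)
    # Reading a line is the step that actually waits. The value typed is
    # deliberately thrown away; only the waiting matters here.
    read_line()
```

The read_line parameter exists only so that the waiting step can be swapped out; in ordinary use, pause() with no arguments waits on the keyboard.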

"Input" was NOT a part of a grammatical statement. It doesn't MEAN anything in the sense that if you use it one context it means one thing, and if you use it in a different one it means something else. It was merely a metaphor for a PROCESS. Once I understood that, then I could think of a language as a library of processes, and I could think about the process I had in mind and decompose it into the processes the language offered, and use the metaphors to create the program.

So I think it's a problem of composition AND decomposition. Composition: Understanding certain ideas like scope, order of operations, and breaking up a program into functional units (procedures/functions/methods). Decomposition: Understanding what's going on beneath the language you're using, and how you can translate the process you see in your head to the processes the computing system knows how to do, and understanding what mnemonics go with which processes. Putting these two skills together is a challenge.

Whenever I've learned a new language, I've always found it very helpful to understand what's going on beneath the language I can see. I don't need to see everything, but I need to understand something besides syntax and the way a program is structured.

In reply to an earlier post on Oct 30, 2007 3:52 PM PDT
 K. K. Lamberty says:
In talking with a colleague about teaching introductory CS courses, I was struck by how he captured it - pattern matching. For at least some of the programming in early courses, you have to get how to manipulate data and how to call functions. You can get through it pretty well if you are good at pattern matching.

Some people in my intro courses really just weren't seeing the places where they needed to make changes to go from an example to something else. It wasn't that they didn't know what the program needed to do. It was that they could not seem to look at an example and pick out the parts that were doing work, and make them do something different.

Maybe there would be a way to teach people about pattern matching that would help them learn about how programming works. Not to make them _better_ programmers, but to make them _able_ to program. Then, once they can do that, maybe we can help them understand more about what it is they are actually doing so that they can get better at it. I'm not sure.

Posted on Nov 22, 2008 3:09 PM PST
 rvasa says:
Another analogy to consider for any reasonable sized system -- software development is akin to translating a book from one language to another, say English to German. Only the English book is being written, story arcs adjusted and re-written while you are attempting the translation to German. Even if we could get over the bump of solving small puzzles and simple problems, the discipline needed to build in a team adds yet another layer of complexity.

Bio

I started teaching computing in February 1980. I was 17 in my senior year of high school, and I taught "Bits, Bytes, and Basic" in a community education class. I taught through my undergrad years--community education, afterschool classes, GED classes, and even community college in 1984. I read "Personal Dynamic Media" by Adele Goldberg and Alan Kay while on an internship at Bell Labs in 1982. I'd never before thought about computing FOR learning (as opposed to learning ABOUT computing). Adele and Alan's thoughts and words set me on the road to my PhD in Education and Computer Science at the University of Michigan in 1993. Nowadays, I focus on using lessons from learning sciences and educational technology for teaching about computing.


