This is one of the fundamental lessons of SICP and one of the most powerful ideas of computer science. It works like this:
What we think of as "code" doesn't actually have the power to do anything by itself. Code defines a program only within a context of interpretation -- outside of that context, it is just a stream of characters. (Really a stream of bits, which is really a stream of electrical impulses. But let's keep it simple.) The meaning of code is defined by the system within which you run it -- and this system just treats your code as data that tells it what you wanted to do. C source code is interpreted by a C compiler as data describing an object file you want it to create. An object file is treated by the loader as data describing some machine instructions you want to queue up for execution. Machine instructions are interpreted by the CPU as data defining the sequence of state transitions it should undergo.
Interpreted languages often contain mechanisms for treating data as code, which means you can pass code into a function in some form and then execute it -- or even generate code at run time:
#!/usr/bin/perl
# Note that the above line explicitly defines the interpretive context for the
# rest of this file. Without the context of a Perl interpreter, this script
# doesn't do anything.
sub foo {
my ($expression) = @_;
# $expression is just a string that happens to be valid Perl
print "$expression = " . eval("$expression") . "\n";
}
foo("1 + 1 + 2 + 3 + 5 + 8"); # sum of first six Fibonacci numbers
foo(join(' + ', map { $_ * $_ } (1..10))); # sum of first ten squares
Some languages like scheme have a concept of "first-class functions", which means that you can treat a function as data and pass it around without evaluating it until you really want to.
The upshot is that the division between "code" and "data" is pretty much arbitrary, a function of perspective only. The lower the level of abstraction, the "smarter" the code has to be: it has to contain more information about how it should be executed. On the other hand, the more information the interpreter supplies, the more dumb the code can be, until it starts to look like data with no smarts at all.
One of the most powerful ways to write code is as a simple description of what you need: Data which will be turned into code describing how to get you what you need by the interpretive context. We call this "declarative programming".
For a concrete example, consider HTML. HTML does not describe a Turing-complete programming language. It is merely structured data. Its structure contains some smarts that let it control the behavior of its interpretive context -- but not a lot of smarts. On the other hand, it contains more smarts than the paragraphs of text that appear on an average web page: Those are pretty dumb data.