Note: This is a rather rough draft of this idea. Comments are welcome.
Ah, the genetic algorithm. For those of you who are unfamiliar with this idea the intuition is relatively simple. Have you ever played the game E.V.O.? The game nicely explains the process of evolution with a creation myth. Sol (God, personified as the sun) has a daughter Gaia (the earth) who is the creator of all life. There is a competition among life to evolve (to be better, able to defeat other life forms), with the best life form eventually reaching Eden, where they are destined to become Gaia’s partner.
The game essentially presents evolution as a method of selection, from Gaia’s perspective answering ‘which one of these squishy things should I marry.’ Although we can question the lack of godlike creatures in this world that has reduced Gaia to her search, she is using the process of evolution to determine who the best possible partner would be (the conceit of the game is that you are the same entity throughout evolution). In genetic algorithms, we use the idea of evolution to solve problems, hard problems, where we couldn’t know the answer (or it would be very hard to know). They are a search technique.
Take design for example. Knowing what the best (or even a good) design will be is difficult. Part of the problem is what we mean by good. Do we mean usable? Aesthetically pleasing? Gets the most adsense revenue? The way we define good is the fitness of a particular design. In life (as in E.V.O.) a good life form is simply one that can continue to live and, ultimately, reproduce. We are concerned about the generation more than the individual. To live, a life form needs to be successful at getting nutrients (eating and breathing) and at reproducing to continue its genetic code. In the real world a plant that is incredibly beautiful, but extremely difficult to reproduce, fails the real test of fitness.
In the computer world we can be a little more flexible about what we mean because we can change the rules. Let’s imagine that plants could somehow gain nutrition from humans looking at them. If that were true the most beautiful plant would become the most fit. It would continue to grow and reproduce because it was the most attractive. Human attention is an interesting and fickle thing. We could imagine that as that plant reproduces it becomes more and more populous. Humans get used to it and look at it less. Eventually, even though it was once thought beautiful and attractive, it comes to be considered common and ugly. The plant dies out. Another plant that has evolved with a similar nutrient mechanism (feeds on attention) but limits its reproduction could be more viable. It will always remain scarce and never meet the fate of its cousin.
The point of this example is to demonstrate that fitness conditions are variable. What is fit today may not be fit tomorrow. It’s convenient for an organism to have little fur when it lives in the tropics, but dangerously unfit when an ice age approaches. The freak organisms that were born with larger amounts of fur may be able to survive. The mutation allows the species to continue in the ice age, but the result is different from the original creature. This is the idea of adaptability. Mutations, random changes to the genetic code, may prove to be useful as the conditions of fitness change.
Sketching a Genetic DIV
Ok, ok. Enough generalizing. Let’s talk web world. What’s genetic evolution like on the web? We’re going to present genetic algorithms as a solution to design problems on the web. In this process we’ll try to draw parallels to the general examples above where they are useful, and ignore the real world when the comparisons aren’t that useful. The goal here will be to genetically design the web to be good. We’ll have to define what good means a little later on. Why do this? It might be valuable (we’ll have to see) but it’s also a decent way to introduce the idea and implementation of genetic and pseudo-genetic algorithms to a webby audience.
Primitives
Back to the real world comparison. What’s our basic primitive, our single celled organism to evolve? One answer would be a web page. Web pages can be really complex. In fact, web pages can seem as complex as actual organisms we see around us (like lions, and tigers, and Bare-eyed Mynas). These organisms aren’t just giant cells. They’re composed of large numbers of small cells that reproduce and (occasionally) mutate on their own. This is more of what we’re looking for. What’s a single cell on a web page? It’s any html element.
We could essentially use every html element as a primitive, stuff like <p><h1><span><ul> and so on. Every semantically meaningful element (like a <p> or a <ul>) is just a specialized case of two genetic elements, a <span> or a <div>. A <div> describes a block level element and a <span> describes an inline element. In practice we can reduce this even further. A <span> is just a <div style="display:inline;">. So that leaves us with a div as our single celled organism. A div can be modified by css to represent any element we can come up with on a normal web page. An <img> tag is just a <div> with a fixed height, width, and background image. Well, almost. Certain things are a little hard to do purely with css. We can’t make a <div> an <a>, for example, or an <iframe>. We can mimic this behavior with a little bit of javascript thrown in (<div onclick="location.href='http://www.example.com';" style="cursor:pointer;"></div> is equivalent to an <a>, for example).
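To make the reduction concrete, here is a small javascript sketch that rewrites the few cases mentioned above as styled divs. The function name asDiv and its attrs argument are made up for illustration, and nothing is escaped, so treat it as a toy rather than a real converter.

```javascript
// Rewrite a few element types as styled divs, following the reductions
// described in the text. Only the cases the text mentions are covered.
// `attrs` is a hypothetical bag of values (src, width, height, href).
function asDiv(tag, attrs = {}) {
  switch (tag) {
    case "span":
      // An inline element is a div displayed inline.
      return '<div style="display:inline;"></div>';
    case "img":
      // An image is a fixed-size div with a background image.
      return '<div style="width:' + attrs.width + 'px;height:' + attrs.height +
             'px;background-image:url(' + attrs.src + ');"></div>';
    case "a":
      // A link needs a little javascript on top of the css.
      return '<div onclick="location.href=\'' + attrs.href +
             '\';" style="cursor:pointer;"></div>';
    default:
      // Block level elements are already divs at heart.
      return "<div></div>";
  }
}
```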
Environment
Each of these <div>s has to exist in a shell of some sort - an environment. The simplest possible one we could use looks like this:
<html>
<body>
<div class="genetic-div">...</div>
</body>
</html>
The DIV Genome
DNA, deoxyribonucleic acid, is the genetic instruction for the development of all (known) organisms. The genome is encoded in dna to allow the preservation of genes and traits (this is a vast oversimplification). For our divs, what describes them? There are only two properties of a div we might be concerned about: what’s in it (the content) and what it is like (or what it looks like, the style). For us, css is dna, for all intents and purposes. Css describes a div in much the same way our dna describes us. To end up with a different element, you have only to change its css. Elements with borders have the border property enabled. Elements with large text have a large font-size selection.
In our use of css as dna, we will break each property up as much as possible and limit them as much as appropriate. So we won’t use just padding, we’ll use padding-left, padding-right, padding-top, and padding-bottom. This will let us think of each one as an individual trait. Padding-left might be a useful mutation in an environment where no other divs have padding. It would set the element off visually from the others in a linear presentation. We may also impose caps on the range of possible values, such as setting the color property to only mutate to traditional web colors - otherwise there are a lot of possible values to worry about.
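One way to picture this is as a table of traits, each with a capped list of allowed values. The javascript sketch below assumes as much; the trait list, the specific values, and the names GENE_POOL and randomGenome are all invented for illustration.

```javascript
// Hypothetical gene pool: each css property is one trait with a capped
// list of allowed values (the specific values here are assumptions).
const GENE_POOL = {
  "padding-left":   ["0px", "5px", "10px", "20px"],
  "padding-right":  ["0px", "5px", "10px", "20px"],
  "padding-top":    ["0px", "5px", "10px", "20px"],
  "padding-bottom": ["0px", "5px", "10px", "20px"],
  "border-width":   ["0px", "1px", "2px", "5px"],
  "font-size":      ["12px", "14px", "16px", "20px"],
  // A small palette instead of every possible rgb value.
  "color":          ["#000000", "#333333", "#666666", "#990000", "#003399"]
};

// Build a genome by picking one allowed value per trait.
// `rng` returns a number in [0, 1); it defaults to Math.random.
function randomGenome(pool, rng = Math.random) {
  const genome = {};
  for (const trait of Object.keys(pool)) {
    const values = pool[trait];
    genome[trait] = values[Math.floor(rng() * values.length)];
  }
  return genome;
}
```

Keeping padding-left separate from padding-right means each one can mutate on its own, which is the whole point of splitting the shorthand up.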
Thinking of <div>s that genetically evolve, we can construct a general class that applies to these divs. Like this:
.genetic-div
{
generation;
meta;
border-width;
border-style;
border-color;
...
}
What’s up with the generation and meta properties? Well, if we use a stylesheet as the dna we can encode other useful information. Browsers will just ignore it, but we could parse it later to get meta descriptive information about a genetic div that has been breeding in the wild.
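Reading that information back out could be as simple as splitting the rule on semicolons. This is a rough javascript sketch under the assumption that the rule body is plain "property:value;" pairs; parseDeclarations and the sample meta value are hypothetical.

```javascript
// Parse the declarations of one css rule into a plain object.
// Assumes a simple "property:value;" layout with no nested braces,
// comments, or semicolons inside values.
function parseDeclarations(ruleBody) {
  const traits = {};
  for (const declaration of ruleBody.split(";")) {
    const colon = declaration.indexOf(":");
    if (colon === -1) continue; // skip blanks and value-less entries
    const property = declaration.slice(0, colon).trim();
    const value = declaration.slice(colon + 1).trim();
    if (property) traits[property] = value;
  }
  return traits;
}

// The meta value below is made up for the example.
const dna = parseDeclarations("generation:4; meta:seeded-from-sidebar; border-width:5px;");
// dna.generation is the string "4"; a browser would simply skip that line.
```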
Sexual Reproduction
At this point we’re pretty much going to ignore sexual reproduction. Why? Well, it’s a little harder to think about - and for now we can focus on alternative means. If we keep sexual reproduction in mind we could always add it later. The real problem with sexual reproduction for us is that it requires some mating preference relationship, more than one div (possibly lots, if they breed like rabbits instead of like pandas), and possibly some sense of proximity within a web page (divs should mate with nearby, similar divs). It also would require a better understanding of the familial hierarchy of a div. At some point two divs that have diverged enough should not be able to mate; the dna is just too different. Without this concept we just end up with a mess of genetic soup. This sounds like a lot of extra work, so we’ll focus on asexual reproduction and mutation as our mechanism for evolving.
Asexual reproduction & Mutation
This kind of reproduction is easy. Here a genetic div can reproduce, immediately die, and be replaced by its descendent offspring. It passes all of its genetic code onto its offspring. Basically the only work here that we need to do is decide how often reproduction occurs, the necessary requirements for reproduction, and what the chance for mutation is (otherwise we would never see any genetic change).
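As a minimal sketch of that step in javascript: the child copies everything from its parent, and each trait gets an independent roll for mutation. The shape of the div object, the pool argument, and the function name reproduce are assumptions, not settled design.

```javascript
// Asexual reproduction: the child copies every trait from its parent,
// and each trait independently mutates with probability `mutationChance`.
// `pool` maps each trait to its allowed values (a hypothetical structure).
// Note that a mutation may happen to land on the parent's own value.
function reproduce(parent, pool, mutationChance, rng = Math.random) {
  const child = { generation: parent.generation + 1, traits: {} };
  for (const trait of Object.keys(parent.traits)) {
    const values = pool[trait];
    if (values && rng() < mutationChance) {
      child.traits[trait] = values[Math.floor(rng() * values.length)];
    } else {
      child.traits[trait] = parent.traits[trait];
    }
  }
  return child;
}
```

With mutationChance set to 0 the child is a perfect clone one generation older, which is exactly the case where we would never see any genetic change.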
DIV Death
We don’t want to start thinking about the health of a div too much. They aren’t Tamagotchi and we don’t want to start thinking of them heavily like that - they’re just design elements after all. We’ll deal with genetic death, not necessarily organism death. Death occurs when a div isn’t fit enough to pass on its genetic material after its reproduction cycle has come up. At this point, we start over with a fresh seed div and the cycle of life begins again. Ok, well maybe we can put an “oh no you killed your div” message up for a few cycles.
Fitness
We have a world (the web), so what is the necessary fitness for a div? Fitness implies that a div is good at living in the world, its genetic material is valuable and should be passed on. The easy answer is that divs live on mouse impressions - or more directly, divs live on attention. Attention is kind of hard to quantify. Optimistically we might say that attention implies a good, usable design. We will ignore the inner content of the div to some degree, since we have selected it to be both important and standard (in our test it will be the about sidebar text for a site).
Mutation is a difficult thing. In nature it occurs rarely, and it would be hard to notice its effects at a local level. Strangely enough, there is a bit of intelligent design thinking related to taking this to the web. The web page environment is a structured and (hopefully) intelligently designed world. It did not evolve, at least, not in a purely mechanical sense. This implies that the best state for growing organisms by default is the natural look of the web page itself. So, while we have a large number of possible css values for our dna, we prefer values that are similar to the webpage environment. This allows the div to evolve the design of a webpage by watching it. Of course, in our early stages we may start out with relatively lean webpages that don’t have that much style information to match - a clean slate.
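One hedged way to express that preference is a weighted draw, where values the page already uses are more likely to come up. In the javascript sketch below, pickBiasedValue and its bias knob are assumptions; how the environment values get collected from the page is left open.

```javascript
// Pick a new value for a trait, favouring values the surrounding page
// already uses. `environmentValues` is the list of values found for this
// property elsewhere on the page; `bias` (an assumed tuning knob) is how
// many times more likely a matching value is than a non-matching one.
function pickBiasedValue(allowedValues, environmentValues, bias, rng = Math.random) {
  const weights = allowedValues.map(v => (environmentValues.includes(v) ? bias : 1));
  const total = weights.reduce((sum, w) => sum + w, 0);
  let roll = rng() * total;
  for (let i = 0; i < allowedValues.length; i++) {
    roll -= weights[i];
    if (roll < 0) return allowedValues[i];
  }
  return allowedValues[allowedValues.length - 1]; // guard against rounding
}
```

On a lean page with no style information, environmentValues is empty, every weight is 1, and the draw falls back to a uniform pick.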
Implementation
Let’s start to sketch out the implementation details for sending a genetic div out into the world. We’re going to be using a few technologies, namely php and sqlite, which are well known and fairly ubiquitous. For those of you who care, we won’t be using any special php5ness.
Notation:
We’ll be using css as our genome. We may also want a short form for writing our ‘dna’, and of course it’s also represented to some degree in our database (sqlite file). That leaves us with three notations:
Long form, represents properties like this: border-width:5px; font-size:16px;
Short form, represents properties like this: |bw5|fs16|
Data form, represents properties in sqlite syntax
The only reason to even mention this is that short form will be useful for quickly describing the genetic information for a div. Assuming default values, |4|bw5|fs16| represents a fourth generation div where the border-width and the font-size have mutated, but everything else is the same.
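Converting between long form and short form might look something like this in javascript. Only the bw and fs abbreviations come from the examples above; the default values, the assumption that every value is a pixel length, and the function names are ours.

```javascript
// Abbreviations for the short form. "bw" and "fs" come from the examples
// in the text; any further entries would be assumptions.
const SHORT_NAMES = { "border-width": "bw", "font-size": "fs" };

// Encode a generation number plus the traits that differ from the defaults.
// Assumes every value is a pixel length such as "5px".
function toShortForm(generation, traits, defaults) {
  const parts = [String(generation)];
  for (const property of Object.keys(SHORT_NAMES)) {
    if (property in traits && traits[property] !== defaults[property]) {
      parts.push(SHORT_NAMES[property] + parseInt(traits[property], 10));
    }
  }
  return "|" + parts.join("|") + "|";
}

// Decode short form back into long form, filling gaps from the defaults.
function fromShortForm(shortForm, defaults) {
  const [generation, ...genes] = shortForm.split("|").filter(Boolean);
  const traits = { ...defaults };
  for (const gene of genes) {
    const match = /^([a-z]+)(\d+)$/.exec(gene);
    if (!match) continue; // ignore anything we cannot read
    const property = Object.keys(SHORT_NAMES).find(p => SHORT_NAMES[p] === match[1]);
    if (property) traits[property] = match[2] + "px";
  }
  return { generation: Number(generation), traits };
}
```

Because only the mutated traits are written down, a div that has never mutated collapses to just its generation number.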
Functional Requirements:
Our divs need exactly two things to work.
They need a css file that updates and stores our genetic description.
They need a way to evaluate fitness; we’ll do this with a little piece of javascript.
The first part is pretty easy. We just use a dynamically generated stylesheet that we make with php. It stores information locally in a sqlite file and updates the div with the appropriate style information. The second part is a little more difficult.
<html>
<head>
<link rel="stylesheet" type="text/css" media="screen" href="genetic-css.php" />
<script type="text/javascript" src="mouse-nutrition.js"></script>
</head>
<body>
<div class="genetic-div">This is a genetic div, let's watch it grow.</div>
</body>
</html>
That php css file looks something like this.
<?php
header("Content-type: text/css");
// Get Generation Values
include 'environment-topsoil';
$db = sqlite_open("genetic-memory.sqlite");
// Set Vars
...
?>
/* Set Properties */
.genetic-div
{
generation:<?php echo get_generation(); ?>;
border-width:<?php echo get_border_width(); ?>;
...
}
The javascript is a little more complicated, but essentially we’ll want to add an event listener to detect mousing over the div (by its classname) and a timer to know how long the mouse is there. Mouse tracking functions only as a rough analog to eye tracking - not everyone uses their mouse as their eyes. Enough people do, however, that we can make use of this as our ‘attention’ gauge. We’ll pass the results with an ajax call to a local processor script that controls mutation.
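A rough sketch of what mouse-nutrition.js could contain follows. The timer is kept separate from the browser wiring so it can be tried without a page. The createAttentionMeter name, the attention parameter, and the choice to report with navigator.sendBeacon on pagehide (swapped in here for a plain ajax call) are all assumptions.

```javascript
// Accumulates how long the pointer has rested on an element.
// `now` is injectable so the timer can be exercised without a browser.
function createAttentionMeter(now = Date.now) {
  let enteredAt = null;
  let totalMs = 0;
  return {
    enter() { if (enteredAt === null) enteredAt = now(); },
    leave() {
      if (enteredAt !== null) {
        totalMs += now() - enteredAt;
        enteredAt = null;
      }
    },
    total() { return totalMs; }
  };
}

// Browser wiring (skipped when there is no document, e.g. under node).
if (typeof document !== "undefined") {
  const meter = createAttentionMeter();
  for (const div of document.getElementsByClassName("genetic-div")) {
    div.addEventListener("mouseenter", () => meter.enter());
    div.addEventListener("mouseleave", () => meter.leave());
  }
  // Report on the way out; the parameter name is an assumption.
  window.addEventListener("pagehide", () => {
    meter.leave();
    navigator.sendBeacon(
      "the-ooze.php",
      new URLSearchParams({ attention: String(meter.total()) })
    );
  });
}
```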
the-ooze.php is our TMNT-inspired processor file. It determines whether a div gets enough attention (and consequently whether it mutates or not). environment-topsoil functions as our config file. In addition to enumerating the range of all possible values, it also sets config options for how many views a generation lives for, the necessary amount of attention to reproduce, and the chance of mutation. Changing these values could set up situations where certain properties may be more or less likely to mutate, and so on. The processor writes its changes to the sqlite file (our genetic memory) and the next time the div is loaded, the changes are observed.
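The processor's decision can be sketched independently of php and the database. The javascript below is only an outline of the logic: the config values, the state object, and the recordView name are hypothetical, and the actual reproduction step is passed in as a function.

```javascript
// Hypothetical config, standing in for the values the config file sets.
const CONFIG = {
  viewsPerGeneration: 100,     // page views a generation lives for
  attentionToReproduce: 60000  // total mouse-over ms needed to reproduce
};

// Fold one page view into the div's running state. When the generation's
// views are used up, the div either reproduces or dies back to the seed.
// `reproduceFn` creates the (possibly mutated) child; `seed` is the fresh div.
function recordView(state, attentionMs, config, reproduceFn, seed) {
  const views = state.views + 1;
  const attention = state.attention + attentionMs;
  if (views < config.viewsPerGeneration) {
    return { ...state, views, attention };
  }
  const survivor = attention >= config.attentionToReproduce
    ? reproduceFn(state.div)
    : seed;
  return { div: survivor, views: 0, attention: 0 };
}
```

Each call returns a new state rather than changing the old one, so whatever stores the genetic memory only has to write the result back.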
Future Work
That’s as far as we’ve gotten at this point, but we think this represents a pretty good sketch of how the idea will work. Hopefully we’ll be able to expand on this and (after some testing) get the code out there. That said, there are tons of interesting aspects of this we can start thinking about now, even though we won’t get to them until later.
Future functions
Exporting the css file. This isn’t necessary right now, but it would be nice to write out a css file describing the current div generation.
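An export could be little more than string assembly. This javascript sketch assumes the genome is a plain object of css properties and writes it out in the same shape as the .genetic-div class shown earlier; exportCss is a made-up name.

```javascript
// Write out a genome as a css rule for the .genetic-div class.
// The generation is included as a non-standard property that
// browsers will skip.
function exportCss(generation, traits) {
  const lines = [".genetic-div", "{", "generation:" + generation + ";"];
  for (const property of Object.keys(traits)) {
    lines.push(property + ":" + traits[property] + ";");
  }
  lines.push("}");
  return lines.join("\n");
}
```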
Identifying DIV species
Not all divs are created equal. Some divs either have enough semantic weight (h1 or p) or some special abilities (iframe,a) that make them definitely different. Right now there is no sense of how different one div is from another. How can we start to think about DIV species?
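One possible starting point, offered only as a guess, is to count how many traits two divs disagree on and call them the same species while that count stays small. The javascript below sketches this; the threshold is arbitrary and both function names are hypothetical.

```javascript
// Count the traits on which two genomes disagree (a Hamming-style distance).
// A trait present in only one genome counts as a difference.
function geneticDistance(a, b) {
  const traits = new Set([...Object.keys(a), ...Object.keys(b)]);
  let differences = 0;
  for (const trait of traits) {
    if (a[trait] !== b[trait]) differences++;
  }
  return differences;
}

// Two divs count as the same species while they differ on at most
// `threshold` traits; the threshold itself is an arbitrary assumption.
function sameSpecies(a, b, threshold) {
  return geneticDistance(a, b) <= threshold;
}
```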
Predators & Competition
Not to get too fanciful here (the goal is to solve a problem, not recreate the entire process of evolution) but what kind of predators eat divs? Another way of thinking about this is just competition. Right now we are only considering if the attention meets some kind of threshold. This isn’t always ideal or appropriate. We might want to reward divs that get a percentage of the total attention paid to a page. This more accurately puts them into a competition with other divs.
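Rewarding a share of total attention is a small change from a fixed threshold. In this javascript sketch, attentionShares and the div ids in the usage note are assumptions.

```javascript
// Fitness as a share of the page's total attention, in the range 0 to 1.
// `attentionByDiv` maps a div id to the mouse-over time it collected.
function attentionShares(attentionByDiv) {
  const total = Object.values(attentionByDiv).reduce((sum, ms) => sum + ms, 0);
  const shares = {};
  for (const id of Object.keys(attentionByDiv)) {
    // With no attention at all, every div gets a share of zero.
    shares[id] = total === 0 ? 0 : attentionByDiv[id] / total;
  }
  return shares;
}
```

For a page where a sidebar div collected 3000 ms and a footer div 1000 ms, the sidebar's share is 0.75 and the footer's is 0.25, no matter how busy the page was overall.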
Sexual Reproduction, Passive Traits, Migration
The next big step in terms of div evolution would be to consider divs that mate. This would allow more interesting behavior (like adding passive traits), but also comes with more complexity. Do divs travel (maybe they travel on links or trackbacks)? Can they migrate? What kind of divs can reproduce with each other?








