What are good examples of genetic algorithms/genetic programming solutions?


Solution 1

Not homework.

My first job as a professional programmer (1995) was writing a genetic-algorithm based automated trading system for S&P500 futures. The application was written in Visual Basic 3 [!] and I have no idea how I did anything back then, since VB3 didn't even have classes.

The application started with a population of randomly-generated fixed-length strings (the "gene" part), each of which corresponded to a specific shape in the minute-by-minute price data of the S&P500 futures, as well as a specific order (buy or sell) and stop-loss and stop-profit amounts. Each string (or "gene") had its profit performance evaluated by a run through 3 years of historical data; whenever the specified "shape" matched the historical data, I assumed the corresponding buy or sell order and evaluated the trade's result. I added the caveat that each gene started with a fixed amount of money and could thus potentially go broke and be removed from the gene pool entirely.

After each evaluation of a population, the survivors were cross-bred randomly (by just mixing bits from two parents), with the likelihood of a gene being selected as a parent being proportional to the profit it produced. I also added the possibility of point mutations to spice things up a bit. After a few hundred generations of this, I ended up with a population of genes that could turn $5000 into an average of about $10000 with no chance of death/brokenness (on the historical data, of course).
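
A minimal sketch of that loop, in Python rather than VB3; the gene length, population size, mutation rate, and the dummy backtest below are my stand-ins, not the original values:

    import random

    GENE_BITS = 64        # assumed fixed gene length; the real encoding packed shape + order + stops
    POP_SIZE = 200        # assumed population size
    MUTATION_RATE = 0.01  # assumed point-mutation rate

    def random_gene():
        return [random.randint(0, 1) for _ in range(GENE_BITS)]

    def backtest(gene, history):
        """Stand-in fitness: the real version replayed 3 years of minute data,
        matched the gene's 'shape', simulated its orders and stops, and
        returned the profit -- or None if the gene went broke."""
        profit = sum(gene) - GENE_BITS / 2 + random.gauss(0, 5)
        return None if profit < -10 else profit

    def next_generation(population, history):
        # Evaluate every gene; genes that go broke leave the pool entirely.
        scored = [(g, backtest(g, history)) for g in population]
        survivors = [(g, p) for g, p in scored if p is not None]
        genes = [g for g, _ in survivors]
        # Fitness-proportional (roulette-wheel) parent selection.
        weights = [max(p, 0.0) + 1e-9 for _, p in survivors]
        children = []
        while len(children) < len(population):
            a, b = random.choices(genes, weights=weights, k=2)
            cut = random.randrange(1, GENE_BITS)           # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - bit if random.random() < MUTATION_RATE else bit
                     for bit in child]                      # point mutation
            children.append(child)
        return children

Run a few hundred generations of next_generation over the same historical data and keep the best surviving gene.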

Unfortunately, I never got the chance to use this system live, since my boss lost close to $100,000 in less than 3 months trading the traditional way, and he lost his willingness to continue with the project. In retrospect, I think the system would have made huge profits - not because I was necessarily doing anything right, but because the population of genes that I produced happened to be biased towards buy orders (as opposed to sell orders) by about a 5:1 ratio. And as we know with our 20/20 hindsight, the market went up a bit after 1995.

Solution 2

I made little critters that lived in a little world. They had a neural network brain which received some inputs from the world, and the output was a vector for movement, among other actions. Their brains were the "genes".

The program started with a random population of critters with random brains. The input and output neurons were static, but what was in between was not.

The environment contained food and dangers. Food increased energy, and when a critter had enough energy, it could mate. The dangers reduced energy, and if energy reached 0, the critter died.

Eventually the creatures evolved to move around the world, find food, and avoid the dangers.
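
A minimal sketch of the idea, assuming a fixed-topology feedforward brain whose flattened weights are the genome; the layer sizes, tanh activations, and mutation scheme here are my assumptions, not the original code:

    import math
    import random

    N_IN, N_HIDDEN, N_OUT = 4, 6, 2   # assumed sizes; the outputs could be a movement vector
    GENOME_LEN = N_IN * N_HIDDEN + N_HIDDEN * N_OUT

    def random_genome():
        return [random.uniform(-1, 1) for _ in range(GENOME_LEN)]

    def think(genome, inputs):
        """Feedforward pass: the genome is just the flattened weight matrices."""
        w1 = genome[:N_IN * N_HIDDEN]
        w2 = genome[N_IN * N_HIDDEN:]
        hidden = [math.tanh(sum(inputs[i] * w1[i * N_HIDDEN + h] for i in range(N_IN)))
                  for h in range(N_HIDDEN)]
        return [math.tanh(sum(hidden[h] * w2[h * N_OUT + o] for h in range(N_HIDDEN)))
                for o in range(N_OUT)]

    def mate(a, b, mutation=0.05):
        """Uniform crossover of the parents' weights plus Gaussian point mutation."""
        child = [random.choice(pair) for pair in zip(a, b)]
        return [w + random.gauss(0, 0.5) if random.random() < mutation else w
                for w in child]

Selection is implicit in the world itself: critters that find food survive long enough to mate, so their genomes propagate.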

I then decided to do a little experiment. I gave the creatures' brains an output neuron called "mouth" and an input neuron called "ear". I started over and was surprised to find that they evolved to spread out across the space, with each creature staying in its own part (food was placed randomly). They learned to cooperate with each other and not get in each other's way. There were always the exceptions.

Then I tried something interesting: dead creatures would become food. Try to guess what happened! Two types of creatures evolved, ones that attacked in swarms, and ones that were highly avoidant.

So what is the lesson here? Communication means cooperation. As soon as you introduce an element where hurting another means you gain something, then cooperation is destroyed.

I wonder how this reflects on the system of free markets and capitalism. I mean, if businesses can hurt their competition and get away with it, then it's clear they will do everything in their power to hurt the competition.

Edit:

I wrote it in C++ using no frameworks. I wrote my own neural net and GA code. Eric, thank you for saying it is plausible. People usually don't believe in the powers of GA (although the limitations are obvious) until they've played with it. GA is simple but not simplistic.

For the doubters: neural nets have been proven to be able to approximate any function if they have more than one layer. GA is a pretty simple way to navigate a solution space, finding local and potentially global minima. Combine GA with neural nets and you have a pretty good way to find functions that give approximate solutions to generic problems. Because we are using neural nets, we are optimizing the function for some inputs, not the inputs to a fixed function, as others here are using GA.

Here is the demo code for the survival example: http://www.mempko.com/darcs/neural/demos/eaters/

Build instructions:

  • Install darcs, libboost, liballegro, gcc, cmake, make
  • darcs clone --lazy http://www.mempko.com/darcs/neural/
  • cd neural
  • cmake .
  • make
  • cd demos/eaters
  • ./eaters

Eaters Screenshot

Solution 3

In January 2004, I was contacted by Philips New Display Technologies, who were creating the electronics for the first-ever commercial e-ink reader, the Sony Librie, which had only been released in Japan, years before the Amazon Kindle and the others hit the market in the US and Europe.

The Philips engineers had a major problem. A few months before the product was supposed to hit the market, they were still getting ghosting on the screen when changing pages. The problem was the 200 drivers that were creating the electrostatic field. Each of these drivers had a certain voltage that had to be set just right, somewhere between zero and 1000 mV or so. But if you changed one of them, it would change everything.

So optimizing each driver's voltage individually was out of the question. The number of possible combinations of values was in the billions, and it took about 1 minute for a special camera to evaluate a single combination. The engineers had tried many standard optimization techniques, but nothing would come close.

The head engineer contacted me because I had previously released a Genetic Programming library to the open-source community. He asked if GP/GAs would help and if I could get involved. I did, and for about a month we worked together, me writing and tuning the GA library on synthetic data, and him integrating it into their system. Then, one weekend, they let it run live with the real thing.
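
Conceptually the search was a GA over a vector of 200 voltages, something like the sketch below. This is a hedged reconstruction, not their code: the blend crossover, mutation size, and dummy objective are my assumptions, and the real fitness was a camera measurement taking about a minute per candidate, which is exactly why you want a small population and few generations.

    import random

    N_DRIVERS = 200            # roughly 200 drivers, per the story
    V_MIN, V_MAX = 0, 1000     # approximate voltage range in mV

    def random_candidate():
        return [random.uniform(V_MIN, V_MAX) for _ in range(N_DRIVERS)]

    def measure_ghosting(voltages):
        """Stand-in for the real fitness: a camera measuring screen ghosting
        for ~1 minute per candidate. Lower is better."""
        return sum((v - 500.0) ** 2 for v in voltages)   # dummy objective

    def next_generation(population, elite=2, mutation=0.05):
        ranked = sorted(population, key=measure_ghosting)
        parents = ranked[:len(population) // 2]
        children = ranked[:elite]                        # keep the best unchanged
        while len(children) < len(population):
            a, b = random.sample(parents, 2)
            child = [(x + y) / 2 for x, y in zip(a, b)]  # blend crossover
            child = [min(V_MAX, max(V_MIN, v + random.gauss(0, 20)))
                     if random.random() < mutation else v
                     for v in child]
            children.append(child)
        return children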

The following Monday I got these glowing emails from him and their hardware designer, about how nobody could believe the amazing results the GA found. This was it. Later that year the product hit the market.

I didn't get paid one cent for it, but I got 'bragging' rights. They said from the beginning they were already over budget, so I knew what the deal was before I started working on it. And it's a great story for applications of GAs. :)

Solution 4

I used a GA to optimize seating assignments at my wedding reception. 80 guests over 10 tables. Evaluation function was based on keeping people with their dates, putting people with something in common together, and keeping people with extreme opposite views at separate tables.

I ran it several times. Each time, I got nine good tables and one with all the oddballs. In the end, my wife did the seating assignments.

My traveling salesman optimizer used a novel mapping of chromosome to itinerary, which made it trivial to breed and mutate the chromosomes without any risk of generating invalid tours.

Update: Because a couple people have asked how ...

Start with an array of guests (or cities) in some arbitrary but consistent ordering, e.g., alphabetized. Call this the reference solution. Think of a guest's index as his/her seat number.

Instead of trying to encode this ordering directly in the chromosome, we encode instructions for transforming the reference solution into a new solution. Specifically, we treat the chromosomes as lists of indexes in the array to swap. To decode a chromosome, we start with the reference solution and apply all the swaps indicated by the chromosome. Swapping two entries in the array always results in a valid solution: every guest (or city) still appears exactly once.

Thus chromosomes can be randomly generated, mutated, and crossed with others and will always produce a valid solution.
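
A short sketch of that encoding in Python; the function names, swap-list length, and the toy guest list are mine, but the decode-by-swaps idea is exactly as described above:

    import random

    def decode(chromosome, reference):
        """Apply the chromosome's swap pairs to a copy of the reference ordering.
        The result is always a valid permutation: every guest/city appears once."""
        solution = list(reference)
        for i, j in chromosome:
            solution[i], solution[j] = solution[j], solution[i]
        return solution

    def random_chromosome(n_items, n_swaps):
        return [(random.randrange(n_items), random.randrange(n_items))
                for _ in range(n_swaps)]

    def crossover(a, b):
        """Any splice of two swap lists still decodes to a valid tour/seating."""
        cut = random.randrange(1, min(len(a), len(b)))
        return a[:cut] + b[cut:]

    def mutate(chromosome, n_items, rate=0.1):
        return [(random.randrange(n_items), random.randrange(n_items))
                if random.random() < rate else swap
                for swap in chromosome]

    # Example: seats for 8 guests, chromosomes of 5 swaps.
    guests = ["Al", "Bea", "Cid", "Dee", "Eli", "Fay", "Gus", "Hal"]
    chromo = random_chromosome(len(guests), 5)
    print(decode(chromo, guests))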

Solution 5

I used genetic algorithms (as well as some related techniques) to determine the best settings for a risk management system that tried to keep gold farmers from using stolen credit cards to pay for MMOs. The system would take in several thousand transactions with "known" values (fraud or not) and figure out what the best combination of settings was to properly identify the fraudulent transactions without having too many false positives.

We had data on several dozen (boolean) characteristics of a transaction, each of which was given a value and totalled up. If the total was higher than a threshold, the transaction was fraud. The GA would create a large number of random sets of values, evaluate them against a corpus of known data, select the ones that scored the best (on both fraud detection and limiting the number of false positives), then cross breed the best few from each generation to produce a new generation of candidates. After a certain number of generations the best scoring set of values was deemed the winner.
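
A hedged sketch of that setup, assuming the "settings" are one weight per boolean characteristic plus the threshold; the feature count, weight ranges, and the 5x false-positive penalty are my assumptions, since the real scoring weights aren't given:

    import random

    N_FEATURES = 40                 # "several dozen" boolean characteristics

    def is_flagged(settings, features):
        """Weighted sum of the boolean characteristics; over the threshold = fraud."""
        weights, threshold = settings
        return sum(w for w, present in zip(weights, features) if present) > threshold

    def fitness(settings, corpus):
        """Reward caught fraud, penalize false positives (the 5x penalty is assumed)."""
        caught = false_pos = 0
        for features, is_fraud in corpus:
            if is_flagged(settings, features):
                if is_fraud:
                    caught += 1
                else:
                    false_pos += 1
        return caught - 5 * false_pos

    def random_settings():
        return ([random.uniform(0, 10) for _ in range(N_FEATURES)],
                random.uniform(20, 80))

    def breed(a, b, mutation=0.05):
        (wa, ta), (wb, tb) = a, b
        weights = [random.choice(pair) for pair in zip(wa, wb)]      # uniform crossover
        weights = [w + random.gauss(0, 1) if random.random() < mutation else w
                   for w in weights]
        return (weights, random.choice([ta, tb]) + random.gauss(0, 1))

    def next_generation(population, corpus, keep=10):
        best = sorted(population, key=lambda s: fitness(s, corpus), reverse=True)[:keep]
        return best + [breed(*random.sample(best, 2))
                       for _ in range(len(population) - keep)]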

Creating the corpus of known data to test against was the Achilles' heel of the system. If you waited for chargebacks, you were several months behind when trying to respond to the fraudsters, so someone would have to manually review large numbers of transactions to build up that corpus of data without having to wait too long.

This ended up identifying the vast majority of the fraud that came in, but couldn't quite get it below 1% on the most fraud-prone items (given that 90% of incoming transactions could be fraud, that was doing pretty well).

I did all this using Perl. One run of the software on a fairly old Linux box would take 1-2 hours (20 minutes to load data over a WAN link, the rest of the time spent crunching). The size of any given generation was limited by available RAM. I'd run it over and over with slight changes to the parameters, looking for an especially good result set.

All in all it avoided some of the gaffes that came with manually trying to tweak the relative values of dozens of fraud indicators, and consistently came up with better solutions than I could create by hand. AFAIK, it's still in use (about 3 years after I wrote it).


Comments

  • knorv
    knorv almost 4 years

    Genetic algorithms (GA) and genetic programming (GP) are interesting areas of research.

    I'd like to know about specific problems you have solved using GA/GP and what libraries/frameworks you used if you didn't roll your own.

    Questions:

    • What problems have you used GA/GP to solve?
    • What libraries/frameworks did you use?

    I'm looking for first-hand experiences, so please do not answer unless you have that.

    • knorv
      knorv over 14 years
      @Jason: Thanks for suggesting that Google thing. While it appears to be somewhat useful I fail to see how it could answer this question since it is specifically addressing SO-users with GA/GP-experience.
    • Dan Dyer
      Dan Dyer over 14 years
    • Adrian McCarthy
      Adrian McCarthy over 11 years
      "We expect answers to be supported by ... specific expertise...." Check! "[T]his question will likely solicit debate, arguments, polling, or extended discussion." False. There are many answers, but it's not a poll and there aren't a lot of comments or debate in the comments. Why was this closed?
    • Simon
      Simon about 10 years
      The Eureqa program is very good for genetic programming: nutonian.com/products/eureqa
    • huseyin tugrul buyukisik
      huseyin tugrul buyukisik almost 3 years
      I wrote a CUDA-accelerated GA to fold proteins for some medicine research project. Using 8x high-end GPUs (Tesla series) was enough to fold a 5000-atom protein within seconds. But it did require a big fitness function. Since CUDA did not have in-kernel random number generation (and other things), I had to write it all myself.
  • jprete
    jprete over 14 years
    I'm not going to downvote you, but I'm guessing that it's because you didn't do it yourself. The OP specifically asked for things that you had done yourself.
  • San Jacinto
    San Jacinto over 14 years
    And where is your Turing award to go with this story? You must have had some crazy advancements in the science in order to have such an experiment even run on anything but RoadRunner.
  • Nick Johnson
    Nick Johnson over 14 years
    This is pretty much a simplified version of Richard Dawkins' biomorphs.
  • Manuel Araoz
    Manuel Araoz over 14 years
    and what was that novel mapping?
  • Adrian McCarthy
    Adrian McCarthy over 14 years
    @Manuel: Rather than encoding the tour directly in the "chromosome", I encoded a transformation that turns a reference tour into a solution. The transformations are just swaps between cities in the index. So they can be recombined in any old way and still always generate a tour that visits every city exactly once.
  • Kirk Broadhurst
    Kirk Broadhurst over 13 years
    Funny thing about colour is that you can't consider it on its own. Colour consultants do not turn up with just one colour - they come in palettes and schemes. There's no point just choosing one colour on its own. I didn't downvote but your answer is stretching the definition of GA. How do you mutate/crossover one colour? This is more honestly a demonstration of iteratively narrowing a limited dataset.
  • Joel
    Joel over 13 years
    "I think the system would have made huge profits" - yes I bet it worked perfectly in the backtesting environment ;-)
  • MusiGenesis
    MusiGenesis over 13 years
    @Joel: of course it did, but that's not why I think it would have been profitable. It would have made money because of the heavy bias towards buying instead of selling. A system that just bought S&P500 futures at random times between 1995 and 1999 (without any kind of GA nonsense going on) would have made tons of money, but we only know this in retrospect.
  • Eric Normand
    Eric Normand over 13 years
    Joel was probably referring to "overfitting".
  • Eric Normand
    Eric Normand over 13 years
    You need to reserve a bit of your historical data for testing. Best to do cross-fold validation.
  • Eric Normand
    Eric Normand over 13 years
    This maybe explains the downvotes: this seems more like hillclimbing, not GA.
  • aman.gupta
    aman.gupta over 13 years
    Agreed with Eric. You can write a simple NN in under an hour (and in fact, I have done, in an exam), and a basic GA is not necessarily more than a day or two's work. This is more of an A-Life algorithm than a genetic algorithm but we're still talking very simple and feasible stuff here.
  • Nobody
    Nobody over 13 years
    Koza's series on GP is very dense and maybe not for someone who is new to GP. I'd suggest Riccardo Poli's Field Guide to Genetic Programming (available as free html copy) or An Introduction to Genetic Algorithms by Melanie Mitchell
  • Philip Guin
    Philip Guin over 12 years
    This isn't fake in the slightest... Summer after my freshman year, I made a project for funsies very similar to this using XNA in C#, minus the neural networks, using GAs and a population of creatures with a myriad of varying traits. For instance, one gene controlled their vision: higher meant further sight, lower meant a wider sight radius. Obstacles and food would be placed randomly, and creatures would replenish their energy by eating the food. Traits would mutate by adding randomly generated Gaussian numbers to them, resulting in normally distributed genes as in actual evolution.
  • MShams
    MShams over 12 years
    Yes, its source is published with my GA book.
  • lmsasu
    lmsasu almost 12 years
    Hello Amro. Did you make a full comparison between the results obtained with backprop and GA? If so, could you share the comparison results with us? How did you implement the crossover step for two NNs?
  • Amro
    Amro almost 12 years
    @lmsasu: nothing fancy: each string or chromosome in the population represents the weight and bias values of the network, and a simple 1- or 2-point crossover operator was used. From what I recall, it took a long time for the network to train using GA. My implementation was more of a proof of concept than anything else (see here for a toy example of controlling virtual minesweepers)... Anyway, there should be a lot of papers out there that discuss using GA to not only learn the weights but also evolve the network structure.
  • alexpinho98
    alexpinho98 almost 11 years
    I think you could have used a neural network to do the parameter tweaking (although it would take longer to be more effective than doing it by hand).
  • Lucas
    Lucas about 10 years
    I work in a research group where this sort of thing (ALife) was what people did everyday. Your story is totally believable, and to be honest I was a little shocked to see that anyone would think it is a fake. Then again, usually the point of doing them is to point out that complex behaviour can arise from very simple systems - I guess the point has not been made well enough.
  • guerda
    guerda almost 10 years
    I found some evidence in his website: www.mempko.com/darcs/neural He states "I provided a neat example of little men in a little world, evolving for survival.". Here is the example code: mempko.com/darcs/neural/demos/eaters
  • Martin Capodici
    Martin Capodici over 9 years
    The "over budget" thing is phoney. Of course they had the money to pay you but they chose not to. That really sucks and shows how big business can take advantage of nice programmers.
  • user3019612
    user3019612 about 9 years
    Thanks! I'm just a bit confused with the swapping aspect. Each chromosome encodes a list of indexes to swap - doesn't that mean an index can appear more than once in the chromosome?
  • Adrian McCarthy
    Adrian McCarthy about 9 years
    Chromosome has indexes c1, c2, c3, ..., cn. "Solution" is array a. Initialize a with your reference list. Then, for each pair of indexes in the chromosome, swap two elements in the solution (temp = a[c1]; a[c1] = a[c2]; a[c2] = temp). It doesn't matter if two indexes are identical, because a will still contain every guest (or city) exactly once.
  • CodeMonkey
    CodeMonkey almost 9 years
    Coding a basic genetic algorithm can be done in a day. A basic neural network is also rather easy. Writing the training for it is a bit harder. But I have never used GA for anything at work, only my hobby ants project.
  • johnbakers
    johnbakers almost 9 years
    If I'm not mistaken, using a GA to optimize an NN can be a replacement for training altogether. In this example, there was no actual NN training; the world in which the critters lived was in fact "on the job" training. Since GA work involves generations, I wonder if this example had a threshold by which a new generation was freshly built from the existing survivors, or if the generations were only created by the "mating" of the critters, with no global "reset/loop" for the new generation.
  • CodyBugstein
    CodyBugstein over 8 years
    What do you mean by "shape" in each of which corresponded to a specific shape in the minute-by-minute price data?
  • MusiGenesis
    MusiGenesis over 8 years
    @Imray: each "shape" was just a series of line segments, where each segment represented a change in price over a length of time. So for example, you might have a "shape" where the first segment is 0 change in price over 10 minutes; the second segment is -5 price over 3 minutes; and the third segment is +10 price over 2 minutes. So, any time the market behaved like this (is flat for 10 minutes, then drops about 5 over the next 3 minutes, then climbs 10 over the next 2 minutes, give or take some error figure) the program ...
  • MusiGenesis
    MusiGenesis over 8 years
    ... would trigger whatever action was attached to the shape (buy or sell plus the stops).
  • Todor Balabanov
    Todor Balabanov over 6 years
    Great! I have tried something similar: link
  • Dang Manh Truong
    Dang Manh Truong almost 6 years
    Just built it and it's great! Note that to install liballegro, you need to type "sudo apt-get install liballegro4-dev" (not allegro5)