Monday, 27 October 2008

Time, Sounds & Neural Networks

I have been thinking about how to enable the critters in Polyworld to communicate and I have come upon an interesting problem.

Neural networks update a number of times each second, but a sound lasts for a while. A network works by looking at a set of inputs and producing a set of outputs at each update, so the problem is how to process a sound input that lasts longer than a single update.
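To make the mismatch concrete, here is a toy sketch (nothing here is Polyworld code, and the sample values are made up): the network runs once per update and only ever sees that instant's input, while one sound spans several updates.

```python
# One sound, lasting three updates.
sound = [0.3, 0.8, 0.5]

def network_step(inputs):
    """Stand-in for a network update: maps inputs to outputs.
    (The doubling rule is arbitrary, just to show a mapping.)"""
    return [x * 2 for x in inputs]

for t, sample in enumerate(sound):
    # At each update the network sees only this instant's sample,
    # never the whole sound.
    outputs = network_step([sample])
    print(t, outputs)
```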

I could follow existing research and set things up so that critters produce communications that fit neatly into a single update. This is how almost every research project I have seen does it.

Another approach is the one speech recognition systems take. These systems segment sound into chunks and present each chunk to a neural network, whose role is then to associate the input with some output. There is still the problem of deciding when to segment the sound.
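A minimal sketch of the chunking approach, assuming fixed-size frames (the frame size and NumPy representation are my own choices for illustration, not how any particular recogniser does it):

```python
import numpy as np

def frame_sound(samples, frame_size):
    """Split a 1-D array of sound samples into fixed-size frames.

    Each frame could then be presented to a network as one input
    vector; the open question in the post is where to cut."""
    n_frames = len(samples) // frame_size
    trimmed = samples[:n_frames * frame_size]  # drop the ragged tail
    return trimmed.reshape(n_frames, frame_size)

sound = np.arange(10)          # stand-in for real audio samples
frames = frame_sound(sound, 4)
print(frames.shape)            # (2, 4): the last two samples are dropped
```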

If I choose to tackle this problem, it might add too much complexity to my work. At the same time, limiting communication to fit into neat packets, or segmenting sound, does not seem right.

When I think about it, it seems strange that this problem occurs at all. Can we set things up so that the critters evolve the ability to handle it themselves? For example, a critter could learn to remember a sequence of sound samples and combine them before processing.
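One way a critter could do this is with a short memory buffer. A minimal sketch, assuming an arbitrary buffer length and a simple averaging rule for combining samples (neither comes from Polyworld):

```python
from collections import deque

class SampleMemory:
    """Ring buffer a critter could use to remember recent sound
    samples and combine them before its network processes them.
    Illustrative only: the length and the averaging combine rule
    are arbitrary choices."""
    def __init__(self, length=4):
        self.buffer = deque(maxlen=length)

    def hear(self, sample):
        self.buffer.append(sample)

    def combined(self):
        # One simple combination: the average of remembered samples.
        return sum(self.buffer) / len(self.buffer)

memory = SampleMemory(length=3)
for s in [0.2, 0.4, 0.9]:
    memory.hear(s)
print(memory.combined())
```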

This may only be a problem because I am working on a computer, where everything is digital and discrete.


  1. I think you could simply model the sound as a signal that persists in time. The vision of a Polyworld agent effectively persists in time: the inputs to the vision neurons are updated each time step, derived from the agent's point of view. You could similarly determine a sound field for the agent at each time step that determines the inputs to some new auditory neurons. No guarantees about how they'd evolve and learn to use their new sense of hearing, but it'd be interesting. The hardest part would be that sound field, as it would have to be a brand new data structure in Polyworld, and you'd have to have reasonable sound generation, propagation, and decay rules/equations. Shouldn't be that bad, though.

  2. Hi Larry!

    Yep, that is what I am doing at the moment. I've been using the past few days to read through the code for the Simulation, Critters, Brains and Genomes.

    The first version will give critters extra input and output neurons and enable them to pass binary strings to each other.

    Since the neural networks in Polyworld can evolve to be either feed forward or recurrent, it will be interesting to see if the critters evolve to combine a group of strings to increase their vocabulary, essentially developing the ability to handle temporal patterns.

    We can keep the binary strings short; five bits, for example, gives 32 combinations, enough to cover a 26-letter alphabet, e.g., 'a', 'b' and 'c'. At this stage the critters have a vocabulary of 26 words, and even though situations provide context and allow some words to be reused, there are bound to be situations where critters need more words.

    Critters with recurrent networks can potentially combine symbols to increase their vocabulary. A symbol could mean "I" and another could mean "hungry." The ability to combine symbols into a sentence is powerful.

    Another, more basic research path is limiting critters to making sounds that can only be distinguished over time, for example "angry" and "apple". You can only detect the difference if you can analyse the sound pattern over time, the way the Tungara Frogs experiment did. This is another way to encourage the development of temporal pattern handling, from the other end.

    However, one step at a time: the first step will be adding input and output neurons and allowing critters to exchange binary strings.
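    As a minimal sketch of how a recurrent network could combine a string of symbols over time, here is a single recurrent unit (the one-unit setup and the weights are arbitrary illustration choices, not Polyworld code):

```python
import math

def recurrent_step(state, inp, w_in=0.5, w_rec=0.5):
    """One update of a single recurrent unit: the new state mixes
    the current input with the previous state, so a string of
    symbols leaves a trace the network can act on later.
    (The weights are arbitrary illustration values.)"""
    return math.tanh(w_in * inp + w_rec * state)

# Feed in the symbol sequence 1, 0, 1 one update at a time.
state = 0.0
for symbol in [1, 0, 1]:
    state = recurrent_step(state, symbol)

# The same symbols in a different order leave a different trace,
# which is what makes sequences usable as distinct "words".
other = 0.0
for symbol in [1, 1, 0]:
    other = recurrent_step(other, symbol)

print(state != other)  # True: order matters
```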