Monday, 27 October 2008

Temporal Patterns, Frogs, Motor Control & Activations

Temporal Patterns

I've been thinking about how to handle the problem of time, sound and neural networks. A few weeks back I found an interesting book at my uni library. I had been setting the book aside, but now seemed like an appropriate time to have a quick scan through it. Luckily I did, because it touches on the problem, gives it a name, suggests an approach and raises additional interesting ideas.

The book is "Neural Networks & Animal Behavior" by Magnus Enquist and Stefano Ghirlanda (2005).

On page 96, Enquist and Ghirlanda talk about Temporal Patterns.

"Consequently, to make decisions based on temporal patterns, sensory processing beyond the level of receptors is needed. Somehow information about prior stimulation must be retained and made available to the decision mechanism."

Consider a network with two input nodes, nodes 1 and 2. Normally each input node receives input x(t) from the world. We can modify the network so that node 2 receives input from node 1 instead. This way, node 1 holds the value x(t) and node 2 holds the value x(t-1).
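The wiring described above can be sketched in a few lines of Python. This is only my own illustration, not code from the book: the function name `run_delay_line` and the starting value of 0.0 for both nodes are assumptions.

```python
def run_delay_line(signal, initial=0.0):
    """Return the (node 1, node 2) activity pair at each time step.

    Node 1 receives x(t) from the world; node 2 receives whatever
    node 1 held on the previous step, i.e. x(t-1).
    `initial` is an assumed starting value for both nodes.
    """
    node1 = initial
    node2 = initial
    history = []
    for x in signal:
        node2 = node1  # node 2 copies node 1's previous value
        node1 = x      # node 1 takes the current input
        history.append((node1, node2))
    return history


pairs = run_delay_line([1.0, 2.0, 3.0])
for t, (now, before) in enumerate(pairs):
    print(t, now, before)
```

At every step after the first, the pair holds the current and the previous input side by side, so anything downstream can respond to a change over time using only the activity present at that moment.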

"By a small change in network organisation, a temporal pattern has been transformed into a pattern of activity of nodes at the same time, which can be used for further processing and decision making."

There are other parts of the book which go into this as well and it seems like this falls under the category of recurrent neural networks. While this approach may or may not be good, it shows that this is a problem that other people have considered and there are possible approaches towards handling it.

Another example is having an extra node between node 1 and the next layer. For example, consider a network with two input nodes and one output node. We will call them nodes 1, 2 and 4. We can add an extra node between 1 and 4 which we will call 3. This creates a delay between nodes 1 and 4.

Frogs

On page 98, Enquist and Ghirlanda discuss research on Túngara frogs by Phelps and Ryan (1998). These frogs make calls that take place over time. Phelps and Ryan created a recurrent neural network, based on work by Elman (1990), that could detect structures in time.

"It consists of a three-layer feed forward network with the addition of a second group of hidden nodes linked with the first group by recurrent connections. The network input nodes model the frog's ear, with each node being sensitive to a range of sound frequencies. Actual male calls were sampled at small intervals, obtaining a number of time slices for each call. These slices are fed one at a time to the network (an approximation to the continuous time processing that takes place in nervous systems). The activity of the model ear at each time is sent to the first group of hidden nodes. This connects to an output node and to the second group of hidden nodes, which feeds back its activity to the first group. Through this feedback the input to the first group of nodes becomes a function of past as well as current input. This allows the network to set up internal states over time and become sensitive to temporal structure across a series of time slices."

"After training, the network was able to accurately discriminate between calls of different species, and it was also able to predict responding in real females, to both familiar and novel stimuli."

I find it interesting that each node here is sensitive to a range of sound frequencies. In the speech recognition setups I have in mind, each node instead represents a slice in time. For example, if I sample 1 second of speech and break it down into ten parts, then there would be ten input nodes and each node receives a frequency value. I do not like this approach because it does not account for words that are longer than 1 second and because it is more about matching a sample against some kind of index. While that is useful for speech-to-text applications, it is not so useful for developing communicating critters.
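To make the frog-style arrangement concrete, here is a rough sketch of an Elman-style forward pass in plain Python. It is not Phelps and Ryan's model: the layer sizes, the random untrained weights, the sigmoid activation and every name (`run_elman`, `w_context` and so on) are my own assumptions, and there is no training step. It only shows slices being fed one at a time, with the second group of hidden nodes carrying the previous hidden activity forward.

```python
import math
import random


def make_weights(rows, cols, rng):
    # Random untrained weights (an assumption for illustration only).
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def weighted_sum(weights, values):
    return [sum(w * v for w, v in zip(row, values)) for row in weights]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def run_elman(slices, w_in, w_context, w_out):
    """Feed time slices one at a time; return the output after each slice."""
    context = [0.0] * len(w_in)  # second hidden group starts silent
    outputs = []
    for ear in slices:  # one value per frequency band
        from_ear = weighted_sum(w_in, ear)
        from_context = weighted_sum(w_context, context)
        hidden = [sigmoid(a + b) for a, b in zip(from_ear, from_context)]
        outputs.append(sigmoid(sum(w * h for w, h in zip(w_out, hidden))))
        context = hidden  # fed back to the first group on the next slice
    return outputs


rng = random.Random(0)
n_ear, n_hidden = 4, 3  # hypothetical sizes
w_in = make_weights(n_hidden, n_ear, rng)
w_context = make_weights(n_hidden, n_hidden, rng)
w_out = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]

# Two made-up "calls" built from the same slices in opposite order.
call_a = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
call_b = [[0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
out_a = run_elman(call_a, w_in, w_context, w_out)
out_b = run_elman(call_b, w_in, w_context, w_out)
print(out_a[1], out_b[1])  # same middle slice, different history
```

The number of input nodes is fixed by the number of frequency bands, not by the length of the call, so a longer call just means more passes through the loop. The middle slice is identical in both calls, yet the outputs at that step should differ because the context nodes remember what came before.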

While scanning the book, I found other mentions of recurrent neural networks being useful for responding to temporal sequences and for organising output over time (pages 21 and 55).

Motor Control

Another interesting area that Enquist and Ghirlanda discuss is Motor Control (page 115). This is about organising output over time and relates strongly to ideas about embodiment. I've only skimmed this section and will read it soon.

Activations

As I was reading this, another thought came to mind: how often should I get these networks to activate? Computational processing power sets an ever-rising upper limit on the number of activations per second, but does pushing towards that limit even make sense? The conclusion I am moving towards is that it doesn't matter. Intuitively there seems to be a forgiving range between the upper and lower limits, and as long as I don't go silly, it should not matter too much. I can have ten activations per second or perhaps even 60, depending on the processing power available to me. Whatever the number is, the network will adapt accordingly. In addition, I can let the appropriate number of activations per second evolve over time.

One of the things that has bugged me is how reflexive these networks seem to be. I mean they only do something when you pump input from the world into them, and that just does not seem right to me. However, Enquist and Ghirlanda's discussions about recurrent networks, and about how nodes can hold an activation value from the start which kick-starts a process that just keeps on going, address this concern.
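A toy version of that kick-start idea, as I understand it: two nodes wired in a loop with no input from the world at all. The linear nodes and the hand-picked weights of -1 and 1 are my own assumptions, chosen only so the activity neither dies out nor blows up. They are not taken from the book.

```python
# Hypothetical hand-picked weights for a two-node loop.
W_A_FROM_B = -1  # node b inhibits node a
W_B_FROM_A = 1   # node a excites node b


def step(state):
    """Advance the loop one activation; there is no external input."""
    a, b = state
    return (W_A_FROM_B * b, W_B_FROM_A * a)


def run(initial, n_steps):
    states = [initial]
    for _ in range(n_steps):
        states.append(step(states[-1]))
    return states


print(run((1, 0), 4))  # starts with node a active
print(run((0, 0), 4))  # starts with both nodes silent
```

With node a given an activation at the start, the activity keeps circulating between the two nodes and returns to where it began every four steps. With both nodes silent, nothing ever happens, which is the purely reflexive behaviour that bugged me.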
