Mathematical Principles of Reinforcement

#1  Postby Mr.Samsa » Mar 02, 2010 11:51 pm

NOTE: This was an essay I wrote a while ago but since the topic seems to be relevant to a number of recent discussions I've had, I thought it might be good to reproduce my article here. If any part doesn't make sense it's most likely due to a formatting issue so let me know - otherwise, it's due to the fact that I'm not very good at explaining things... :grin:

It's a bit long, sorry, but here it is.

MATHEMATICAL PRINCIPLES OF REINFORCEMENT

Why do organisms behave? Traditionally, the experimental analysis of behaviour has primarily concerned itself with the issue of how organisms interact with their environment, focusing on the descriptive analysis favoured by Skinner. It is this approach which produced the law of effect: when a response is followed by a reinforcer, the probability of that response being emitted in the future increases. However, a purely descriptive framework does not explain why a reinforcer reinforces.

Killeen (1994) formulated the mathematical principles of reinforcement (MPR), consisting of the concepts of arousal, constraint and coupling, in an attempt to generate a general theory of reinforcement. The combination of these three concepts describes how incentives come to motivate certain behaviours, how these behaviours are physically constrained by time, and how specific reinforcers come to be associated with specific responses. For MPR to be accepted as a valid framework on which to base behavioural theories, it must be demonstrated that the principles not only account for current data but also produce novel predictions that are consistent with existing behavioural theories.

The concept of behavioural mechanics, as presented by Killeen (1995), emphasises the need to move away from statistical explanations and toward causal explanations. This means that instead of describing the equilibrium of steady-state behaviours, behaviour analysts need to study the forces that cause organisms to move and behave, such as motivation and learning, among other factors; this is defined as a dynamics of behaviour.

Background

In ‘Mechanics of the Animate’, Killeen (1992) invoked a Newtonian approach to understanding behavioural trajectories; specifically that of the second law of motion which he describes as “Change of motion is proportional to the resultant force, and in the direction in which that force is impressed” (p. 457). The behavioural forces are incentives (reinforcers) that attract organisms in behavioural space. Before considering how reinforcement affects the shaping of behaviour, Killeen argues that first we must understand the nature of the ‘response’ and how different dimensions of the response (such as locus, force, tempo, and topography) interact with the fundamental excitatory property of reinforcement.

The spread of effect. The first thing we must consider is the fact that the control of a reinforcer is not constrained by the single response that precedes it; rather, it can reach back in time to the penultimate response (the response prior to a determined pause) along a delay of reinforcement gradient with decreasing effectiveness (Catania, 1971). To explain this, Killeen assumes that it is the responses or events in the recent memory of the subject that are increased in probability through reinforcement, and not necessarily the responses that the experimenter observes. This means that maximum strengthening of association will occur when reinforcement is contingent on just those events in the subject's memory; that is, the acquisition of contingencies will occur most rapidly when the experimenter's definition of the response matches the organism's definition.

Delay of reinforcement gradient. To discover the subject's definition of the response, an experiment can be set up where the reinforcement contingencies shift on a continuum. The area where there is the most responding is inferred as representing the subject's understanding of the response, based on its memory of it. When determining the shape of the gradient, Killeen suggests that common sense can help us narrow down possible shapes for the delay of reinforcement gradients; that is, we can eliminate gradients that increase with delay, as this would suggest memory increases as a function of time, and gradients that are unweighted, as this would suggest that memory is insensitive to time. Instead, he proposes an exponential decay model for memory, shown in Equation 1:

β = 1 − e^(−λd) (1)


where β represents the proportion of memory allocated for a target response, λ is the intrinsic rate of decay and d is the delay. This was then extended into an exponentially weighted moving average (EWMA) to account for sequences of discrete memory events:

M(n) = β · y(n) + (1 − β) · M(n−1), 0 ≤ β ≤ 1 (2)


where the current memory is represented by M(n), y(n) is the current response's relevant attribute, M(n−1) is the previous memory, and β is the currency parameter. When β = 1 the most recent event is maximally salient and there is no emphasis on prior events; in contrast, when β is small, most of the organism's memory is occupied by prior events. Killeen suggested that this form of memory decay is most appropriate for use in MPR because of its computational convenience, its minimal demands on memory (in accordance with neural network models), and its intuitiveness.
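
As a minimal sketch (in Python, with purely illustrative parameter values), Equations 1 and 2 can be written directly:

```python
import math

def memory_weight(decay_rate, delay):
    # Equation 1: beta = 1 - exp(-lambda * d), the share of memory
    # allocated to a target response after a delay d.
    return 1.0 - math.exp(-decay_rate * delay)

def ewma_memory(events, beta, initial=0.0):
    # Equation 2: M(n) = beta * y(n) + (1 - beta) * M(n-1).
    # beta = 1 keeps only the most recent event; a small beta
    # leaves most of memory occupied by prior events.
    m = initial
    for y in events:
        m = beta * y + (1.0 - beta) * m
    return m
```

With β = 1, `ewma_memory([0, 0, 1], 1.0)` returns 1.0: only the latest event matters, exactly as described above.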

Three Principles

Killeen and Sitomer (2003) described the concept of principles and models as being similar to the integrated systems of skeleton and muscle - even though muscles may be able to support an ill-suited skeletal structure for some time, eventually one or the other must give way. In science it is necessary to have a valid underlying theory that guides the growth of new models in order to understand the world. Killeen (1994) argues that his mathematical principles of reinforcement act as the skeleton upon which behavioural theories grow.

First Principle: arousal (A) is proportional to the rate of reinforcement: A = ar. This principle concentrates on the process of arousal, which refers to the activity or motivation in an organism that is caused by the presentation of a reinforcer. Although it seems natural to expect that the delivery of an incentive will increase the activity of the organism, the concept was explored in detail by Killeen, Hanson, and Osborne (1978). Their first experiment measured the rate of decay in activity during an extinction situation: the pigeons were fed once in a chamber where their activity was recorded by floor panels, and after 30 minutes had elapsed they were returned to their home cages.


Figure 1: The average rate of activation of floor panels as a function of time since feeding. The top two curves are offset by the factors noted (x10, x100). Straight lines in these semi-logarithmic coordinates evidence exponential decay of activity. Graph taken from Killeen et al. (1978)


Figure 1 shows that activity is highest immediately following food delivery and it then decays exponentially as time since the reinforcer delivery increases. Killeen et al. (1978) used the exponential decay function shown in Equation 3:

b(t) = b1 · e^(−t/α) (3)


The parameter b1 refers to the intercept, which had an average of 9.7 responses/min, and α is the time constant, which on average was 6 minutes. So following a reinforcer, the total number of responses is the integral of the exponential function: αb1.
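
As a sketch, the decay curve and its integral can be checked numerically, using the average parameter values reported above (b1 ≈ 9.7 responses/min, α ≈ 6 min):

```python
import math

B1 = 9.7     # intercept b1, responses/min (average reported above)
ALPHA = 6.0  # time constant alpha, minutes

def activity(t):
    # Equation 3: activity decays exponentially with time since feeding.
    return B1 * math.exp(-t / ALPHA)

def total_responses(dt=0.01, horizon=120.0):
    # Numerically integrate the decay curve; the analytic value is
    # alpha * b1 (about 58.2 responses with these parameters).
    return sum(activity(i * dt) * dt for i in range(int(horizon / dt)))
```

The numerical integral agrees with the analytic total α·b1, which is the quantity MPR takes as the total responding induced by one reinforcer.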

In a second experiment, Killeen et al. (1978) investigated what happened when the rate of feeding was increased from a single daily trial to various fixed-time schedules (FT30 and FT50). Their results showed that the activity of the pigeons increased at a level much greater than that found in the single trial condition. This suggests that the rate of feeding affects the asymptote of activity and also that the excitement produced by periodic feeding seems to cumulate with increasing frequency of reinforcement, as shown in Figure 2. The term arousal was used to describe this state of excitement.



Figure 2: The way arousal may cumulate as a function of successive incentives, given the parameters in Equation 3 and using incentive intervals of 30-sec and 120-sec. Graph taken from Killeen et al. (1978).


Killeen et al. (1978) note, however, that the rate of responding is not exclusively a function of the rate of reinforcement. To demonstrate that the rate of responding is affected by factors other than the rate of reinforcement, they plotted responding over time using increasing DRO schedules and an FR10 schedule. The expected result – that as DRO contingencies increase responding should become increasingly suppressed – was observed, as shown in Figure 3. The FR schedule appeared to encourage responding, which was also expected.


Figure 3: The effect of DRO and FR contingencies on responding. Graph taken from Killeen (1975).


Killeen and Sitomer (2003) suggest that the lines fitted by the generalised Erlang distribution in Figure 3 show the animal's focal search around the food hopper immediately following food delivery; the animals then remain relatively still for a period of time that is exponentially distributed with mean 1/β. This period is followed by a vigorous general search with mean duration 1/γ, and then another focal search around the hopper. When this activity is measured by proximity switches, the probability that an animal will be engaged in general search is proportional to e^(−βt) − e^(−γt).

So the patterning in the data can be accounted for by these exponential processes, meaning that the parameter A is free to reflect the asymptotic level of arousal in situations free from competing responses that are not usually recorded in standard experimental procedures. They modelled the cumulative increase in arousal as a function of feeding using Equation 4:

An = A1αr(1 − e^(−n/αr)) (4)


A1 represents the initial increase in arousal following the presentation of the reinforcer (this is the intercept, b1, in Equation 3), α is the time constant of the curves, r refers to the rate of reinforcement and n is the number of reinforcers. As n increases toward infinity, arousal approaches its asymptote and Equation 4 reduces to A∞ = A1αr, which is equivalent to A = ar where the parameter a represents A1α. This is referred to as specific activation.
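
A short sketch (illustrative parameter values only) shows the cumulation of Equation 4 approaching its asymptote A1αr:

```python
import math

def arousal(n, a1, alpha, r):
    # Equation 4: cumulative arousal after n reinforcers delivered at
    # rate r; the asymptote as n grows is a1 * alpha * r.
    return a1 * alpha * r * (1.0 - math.exp(-n / (alpha * r)))
```

Each successive reinforcer adds a little less arousal, so the curve climbs toward, but never exceeds, the asymptote that the first principle summarises as A = ar.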

Killeen and Sitomer (2003) describe specific activation (a) and arousal level (A = ar) as inducing responding through motivational states (A) and through the power of a particular type of reinforcer applied to a specific organism at a certain level of deprivation (a). These concepts, however, are hypothetical constructs and as such cannot be measured directly. To circumvent this issue they assume that response rate is proportional to arousal level, using a constant of proportionality k responses per second. The parameter k is absorbed into a, which is thus converted from seconds per reinforcer to responses per reinforcer; in this formulation a becomes the integral of the exponential decay curves illustrated in Figure 2.

Second Principle: response rate (b) is constrained by the time required to emit a response (δ). As Killeen et al. (1978) demonstrated, response rates can be influenced by competition from other factors, and as such response rates may not always reach their theoretical asymptote (A). The concept of constraint is introduced to account for the limitations of an organism, similar to the notion of responses being placed on a continuum of preparedness (Seligman, 1970). Killeen (1994), however, focuses on the constraint placed on responding by time and how this forces responses to compete for expression; that is, because it takes time to make a response, responses cannot be emitted as quickly as they may be elicited. This means there is a maximum rate at which animals can perform the target behaviour, so increasing the reinforcement rate becomes decreasingly effective as responding approaches this maximum rate.

Response rate is traditionally measured by looking at inter-response times (IRTs). An IRT is the time from the moment a response is made until the start of the next, so to derive the response rate it is necessary to count the number of target responses made within a period and divide that number by the duration of the period. The IRT is thus composed of two parts: the duration of time necessary to make a response (δ), and the time between responses (τ). For an observation interval of 1 second, the first response has 1 second of time available in which to be made, but the second response has only τ = 1 − δ seconds available. If a response rate, b, is factored in, the formulation becomes τ = 1 − bδ, which gives the time available for additional responses. As response rate approaches its maximum (bmax = 1/δ), the observation interval becomes increasingly filled with responses and it becomes more difficult for the animal to add more; that is, responding reaches its ceiling rate.

Killeen and Sitomer (2003) argue that there are two ways to measure response rate. The first approach is the one outlined above, dividing the number of responses by the observation interval: b = 1/(δ + τ). The second approach divides the number of responses by the time available for responding, 1/τ; this is the instantaneous rate. When the response rate, b, is high it must bend under its ceiling, as it can never exceed 1/δ, and b becomes an inverse function of response duration, δ.

After looking at the relationship between response rate and probability, Killeen, Hall, Reilly, and Kettle (2002) showed that the instantaneous response rate 1/τ will be proportional to arousal level, A, if the probability, p, of observing a response in a small interval of time is also proportional to arousal level. This means that the formula A = ar can be rewritten as 1/τ = ar, and rearranged to give τ = 1/(ar). Taking this into account, the traditional method for calculating response rate, b = 1/(δ + τ), takes the form shown in Equation 5:

b = r / (δr + 1/a) (5)


Interestingly, Killeen and Sitomer (2003) compared Equation 5 to Herrnstein's hyperbola (Herrnstein, 1979) and argued that if we assume 1/δ is equivalent to Herrnstein's maximum response rate, k, and 1/(δa) represents his extraneous reinforcers, ro, then Herrnstein's hyperbola (shown in Equation 6) describes the temporal constraint on responding.

b = kr / (r +ro) (6)


Regardless of the arrangement, the equations demonstrate the important point that even if a response is elicited at a rate proportional to A = ar, the responses will not be emitted instantaneously as it takes time to make a response. The concept of constraint, therefore, is a correction for ceiling rate which prevents the equation from the first principle, A = ar, from overestimating the rate of responding.
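
The correction for the ceiling can be seen in a small sketch (hypothetical parameter values) comparing the unconstrained elicitation rate A = ar with the constrained emission rate of Equation 5:

```python
def elicited_rate(a, r):
    # First principle, A = a * r: responses elicited, with no ceiling.
    return a * r

def emitted_rate(a, r, delta):
    # Equation 5: emitted rate, bounded above by 1/delta because each
    # response takes delta seconds to make.
    return r / (delta * r + 1.0 / a)
```

However high the reinforcement rate r climbs, `emitted_rate` never exceeds 1/δ, whereas `elicited_rate` grows without bound; this is exactly the overestimation that the constraint principle corrects.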

Third Principle: the coupling between a response and reinforcer decreases with the distance between them. Coupling is the principle of selection: it occurs when a reinforcer and a response occupy the same memory window, and it is the process by which an association is said to be strengthened. The equations derived from the first two principles demonstrate undirected force, behaviour as a function of incentive motivation. The direction of behaviour is produced by the association of a reinforcer with stimuli and responses; reinforcement is thus the amalgamation of excitation and association. Since it can be demonstrated that contingencies of reinforcement can affect response rate independently of reinforcement-rate effects (Galbicka & Platt, 1986), the examination of coupling effects under different contingency schedules is necessary.

To understand how coupling directs behaviour, Killeen (1994) used ethograms of organisms' actions. Once a target behaviour has been selected, its rate is plotted on the vertical axis of a graph and the rates of all other behaviours are plotted on the horizontal axis (as shown in Figure 4).


Figure 4: The rate of target behaviours plotted against the rate of other behaviours. The diagonal shows the possible allocation of responses for a given level of arousal (iso-arousal contour).


The iso-arousal line (with intercepts 1/δx and 1/δy) is the line along which the maximum response rates for a particular level of arousal fall. Points 1 and 3 in Figure 4 show how changes in A change the distance of the iso-arousal line from the origin; for example, point 3 will move toward the limiting constraint line at point 1 as arousal level increases. The operating point is moved toward or away from the vertical (for example, from 1 to 2) by contingencies of reinforcement.

A strong coupling effect will push behaviour up the vertical axis; in these cases the coupling with the target behaviour, c, may approach +1.0. Conversely, when the reinforced behaviour is incompatible with the target behaviour, such as in DRO schedules, c will be less than 0 and behaviour will move down the vertical axis. When c approaches 0, that is, when reinforcement is independent of target behaviours, behaviour will be pushed along the horizontal axis.


Figure 5: The satiation trajectory plotted as an average over six pigeons. Data are taken in 10-min bins. The filled triangle represents the start of the session. Graph taken from (Killeen & Bizo, 1998).


Killeen (1995) found that changes in drive level can affect specific activation, a, but leave coupling invariant. General activity (x-axis) was measured using a stabilimeter and was plotted against key pecks (y-axis) to examine the satiation of a pigeon where key pecks were reinforced every 90-sec with 10-sec access to food (as shown in Figure 5). The decrease of the behaviour trajectory shows that coupling was not affected by drive level – which is consistent with findings of proportional decreases in concurrent time allocation under extinction conditions (Buckner, Green, & Myerson, 1993).

Reinforcement is traditionally understood as the strengthening of the response that preceded the reinforcement. Killeen and Smith (1984), however, showed that reinforcement not only strengthens the last response but also, to a decreasing extent, the responses prior to the final response. They used the term 'erasure' to describe the observation that responses other than the target response displace the memory for earlier target responses, and found that pigeons were less able to determine whether it was their target response that caused the reinforcer as the delay to the reinforcer increased.


Figure 6: The effect of reinforcement on responses prior to the last response. The calculation of the coupling coefficient as the summation of target response traces are shown in the box to the right of the response curves. Graph taken from Killeen and Sitomer (2003).


This issue is one of particular importance to FR schedules. As shown in Figure 6, the response followed by the reinforcer receives a proportion β of the maximum coupling of 1.0, each earlier response receives a share further reduced by a factor of (1 − β), and the sum of the series is 1 − (1 − β)^n. This is taken as the discrete coupling coefficient for FR schedules, denoted cFR. The coupling to the target response approaches 1.0 as n (the number of prior target responses) increases.
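
As a one-line sketch, the FR coupling coefficient is just the summed geometric series of decaying response traces:

```python
def coupling_fr(beta, n):
    # Discrete coupling coefficient for FR n: the summed, geometrically
    # decaying traces of the n responses preceding reinforcement,
    # 1 - (1 - beta)^n.
    return 1.0 - (1.0 - beta) ** n
```

As n grows the coefficient climbs toward 1.0, matching the statement that coupling to the target response approaches its maximum as more target responses precede the reinforcer.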

To incorporate the coupling coefficient into the framework of MPR, Equation 5 is multiplied by c. This formulation describes how arousal comes to be associated with the target behaviour (Equation 7).

b = c · r / (δr + 1/a) (7)


Killeen and Sitomer (2003) describe Equation 7 as the fundamental model of MPR: it predicts that the target response rate equals the coupling coefficient multiplied by the response rate that the prevailing rate of reinforcement can support.

b = c/δ − n/(δa) (8)


Equation 8 accounts for the schedule feedback function of FR schedules, where reinforcement rate is proportional to response rate: r = b/n. This describes responding under ratio schedules, and its predictions appear to be consistent with experimental results (Bizo & Killeen, 1997).
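
A numerical sketch (hypothetical parameter values) checks that the closed form of Equation 8 really is the fixed point of Equation 7 under the FR feedback r = b/n:

```python
def eq7(c, r, delta, a):
    # Equation 7: b = c * r / (delta * r + 1/a).
    return c * r / (delta * r + 1.0 / a)

def eq8(c, n, delta, a):
    # Equation 8: closed form for FR n, b = c/delta - n/(delta * a).
    return c / delta - n / (delta * a)

# Self-consistency check: feeding b back through r = b/n reproduces b.
c, n, delta, a = 0.9, 10, 0.05, 100.0
b = eq8(c, n, delta, a)
```

With these values, `eq7(c, b / n, delta, a)` returns (up to floating-point error) the same b that Equation 8 predicts, which is what it means for Equation 8 to incorporate the schedule feedback function.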

On random ratio schedules, they argue that since each response on a VR n schedule has a probability of 1/n of being reinforced, the coupling coefficient is the average of the FR coupling coefficients, weighted by the probability that the schedule delivers reinforcement on a particular response.

cVR = n / (n + (1 − β)/β) (9)
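
As a sketch, the closed form can be checked against the weighted average it is meant to summarise: summing the FR k coefficients 1 − (1 − β)^k, weighted by the geometric probability that a random-ratio schedule with mean n pays off on the k-th response, works out to n / (n + (1 − β)/β).

```python
def coupling_vr(beta, n):
    # Closed form for the VR coupling coefficient:
    # c_VR = n / (n + (1 - beta) / beta).
    return n / (n + (1.0 - beta) / beta)

def coupling_vr_sum(beta, n, terms=10000):
    # Average the FR-k coefficients 1 - (1 - beta)^k, weighted by the
    # geometric probability p(1 - p)^(k-1), p = 1/n, that a random-ratio
    # schedule delivers reinforcement on the k-th response.
    p = 1.0 / n
    return sum(p * (1.0 - p) ** (k - 1) * (1.0 - (1.0 - beta) ** k)
               for k in range(1, terms + 1))
```

The two functions agree to high precision, confirming that the closed form is just the expectation of the FR coupling coefficient over the schedule's reinforcement probabilities.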


Other factors

Varying rate of reinforcement. As the motivational level (a) and the rate of reinforcement (r) determine the arousal of the organism, according to MPR, Killeen and Bizo (1998) manipulated the rate of reinforcement to assess its effects on response trajectories. They used tandem VI-FR schedules: after the VI schedule had timed out, the subjects were required to make another four responses to receive reinforcement. The sessions alternated between tandem VI 4-min FR 4 and VI 64-min FR 4 schedules; long intervals were used in order to make response rates vary over a substantial range.


Figure 7: The up-triangles represent the variable interval (VI) 4-min schedules and the down-triangles represent the VI 64-min schedules. The large triangles are the end of the schedule. Graph taken from Killeen & Bizo (1998)


Figure 7 shows that the rates of key pecking and movement decreased in the VI 64-min condition and increased in the VI 4-min condition; these results are generally consistent with MPR's prediction of a proportional relationship. Killeen and Bizo (1998), however, note a slight concavity in the trajectories, which they interpret as a possible hysteresis effect. This effect could have been caused by differences between the conditioning of key pecking and of general behaviour; in the terminology of preparedness (Seligman, 1970), their results suggest that key pecking was a more prepared response than other responses, as it extinguished more slowly and recovered more quickly.

Is this a valid inference? Although the explanation makes sense in terms of the MPR model, is it correct to assume that an arbitrary response defined by the experimenters would be more prepared than the general behaviour emitted by the animal? Further research would be necessary to determine whether the systematic deviations found by Killeen and Bizo (1998) can be explained through a construct such as preparedness, or whether they are due to concurrent changes in coupling and arousal.

Memory windows. There is a possible conflict with regard to the concept of coupling. The concept assumes that when a response and a reinforcer are both present in the organism's memory the association will be strengthened; however, is this not to be expected? The act of remembering, if assumed to be an extension of attending, has been shown to be subject to the principles of reinforcement (Nevin, Davison, Odum, & Shahan, 2007), so the notion of coupling becomes circular, or at best describes a correlative relationship between two factors that are related only through the third, independent factor of reinforcement. That is, the idea of coupling is based on the notion of a response and a reinforcer occupying the same dimensional space as determined by the organism, so the association is made when both the response and the reinforcer are attended to. But since the process of attending is itself a function of reinforcement, the construct of memory windows is just a rewording of the traditional view of reinforcement.

Changes in the value of incentives. Killeen (1995) looked at how MPR fitted into open and closed economies and how it could account for the differences in responding between them (Hursh, 1980). To do this, he devised a formula to predict the effect that hunger would have on the value of an incentive (as shown in Equation 10).

dt = d0 + (M − mR)t (10)


The parameter dt refers to the deficit (hunger) at time t, d0 is the initial deprivation level, mR is the input rate (the size of the incentive times the rate of reinforcement), and M is the output rate (the metabolic rate).

Killeen (1995) found that an exponential form of his hunger equation produced good fits in open and closed economies, as it could account for the different deprivation levels inherent in each situation; that is, in closed economies the initial deprivation will usually be minimal and grow with time, whereas d0 for open economies will be much higher. Although the concept of hunger affecting the motivation of an organism seems a reasonable assumption, difficulties arise when trying to extrapolate this concept to other, qualitatively different reinforcers, and we effectively run into the same problem that faced Hull's (1943) drive-reduction theory.
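
As a sketch of the linear form of Equation 10 (the equation as reconstructed here; all parameter values are hypothetical), the deficit grows whenever metabolic output outpaces the reinforcement input:

```python
def deficit(d0, metabolic_rate, intake_rate, t):
    # Equation 10 (linear form): d_t = d_0 + (M - mR) * t.
    # The deficit at time t grows at the metabolic output rate M and
    # shrinks at the input rate mR. Parameter values are hypothetical.
    return d0 + (metabolic_rate - intake_rate) * t
```

When intake exceeds metabolic output the deficit falls over the session, which is the pattern Killeen used to distinguish open from closed economies.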

The formulation faces the practical issue of determining the value of several factors prior to the start of an experiment, such as how much food is consumed with each reinforcer, the stomach capacity of the individual animals, and other possible free parameters. This issue was outlined by Killeen and Bizo (1998), who attempted to circumvent the complexity issue by showing that changes in behaviour resulting from changes in arousal or coupling can be understood in terms of movement in a behaviour space, similar to Figure 4.

Conclusion

MPR clearly and precisely outlines the key factors that guide behaviour: arousal, constraint and coupling. It seems probable that any framework underlying the mechanics of behaviour will need to take into account the notion that it is the animal's representation of the response that is controlled by the reinforcer, not the response that the experimenter sets up. An issue that does not appear to have been investigated in detail is how reinforcers other than food fit into the theory. Since a large proportion of the research describing the principles has focused on how changes in feeding rates affect arousal (and thus MPR), it is important to attempt to extend these results to other reinforcers to demonstrate generalisability, as it is possible that arousal does not cumulate in the same way (or at all) with a reinforcer such as water, or a conditioned reinforcer.

Overall, MPR appears to lay down the bones for what could develop into a fully functional working theory that can account for the basic aspects of reinforcement in an accurate and quantitative way.

References

Bizo, L. A., & Killeen, P. R. (1997). Models of ratio schedule performance. Journal of Experimental Psychology: Animal Behavior Processes, 23, 351-367.

Buckner, R. L., Green, L., & Myerson, J. (1993). Short-term and long-term effects of reinforcers on choice. Journal of the Experimental Analysis of Behavior, 59, 293-307.

Catania, A. C. (1971). Reinforcement schedules: The role of responses preceding the one that produces the reinforcer. Journal of the Experimental Analysis of Behavior, 15, 271-287.

Galbicka, G., & Platt, J. R. (1986). Parametric manipulation of interresponse-time contingency independent of reinforcement rate. Journal of Experimental Psychology: Animal Behavior Processes, 12, 371-367.

Herrnstein, R. J. (1979). Derivatives of matching. Psychological Review, 86, 486-495.

Hull, C. L. (1943). Principles of behavior. New York: Appleton-Century.

Hursh, S. R. (1980). Economic concepts for the analysis of behavior. Journal of the Experimental Analysis of Behavior, 34, 219-238.

Killeen, P. (1975). On the temporal control of behavior. Psychological Review, 82, 89-115.

Killeen, P. R. (1992). Mechanics of the animate. Journal of the Experimental Analysis of Behavior, 57, 429-463.

Killeen, P. R. (1994). Mathematical principles of reinforcement. Behavioral and Brain Sciences, 17, 105-172.

Killeen, P. R. (1995). Economics, ecologics, and mechanics: The dynamics of responding under conditions of varying motivation. Journal of the Experimental Analysis of Behavior, 64, 405-431.

Killeen, P. R., & Bizo, L. A. (1998). The mechanics of reinforcement. Psychonomic Bulletin & Review, 5, 221-238.

Killeen, P. R., & Sitomer, M. (2003). MPR. Behavioural Processes, 62, 49-64.

Killeen, P. R., & Smith, J. P. (1984). Perception of contingency in conditioning: Scalar timing, response bias, and the erasure of memory by reinforcement. Journal of Experimental Psychology: Animal Behavior Processes, 10, 333-345.

Killeen, P., Hall, S., Reilly, M., & Kettle, L. (2002). Molecular analyses of the principal components of response strength. Journal of the Experimental Analysis of Behavior, 78, 127-160.

Killeen, P., Hanson, S., & Osborne, S. (1978). Arousal: Its genesis and manifestation as response rate. Psychological Review, 85, 571-581.

Nevin, J. A., Davison, M., Odum, A. L., & Shahan, T. A. (2007). A theory of attending, remembering, and reinforcement in delayed matching to sample. Journal of the Experimental Analysis of Behavior, 88, 285-317.

Seligman, M. E. (1970). On the generality of the laws of learning. Psychological Review, 77, 406-418.

Skinner, B. F. (1966, September 9). The phylogeny and ontogeny of behavior. Science, 153, 1205-1213.

Re: Mathematical Principles of Reinforcement

#2  Postby SteveBrewer » Mar 03, 2010 6:03 am

Ok Samsa I can't just leave you hanging on this one so here goes. Keep in mind that I am still trying to wrap my head around the implications. That said, wouldn't the model break down the more complex the behavior becomes. I mean the more competing factors you must consider for more complicated behaviors wouldn't the formula lose cohesion due to the added variables. Or are we to reduce the competing factors as to make it applicable. If that is confusing then let me know and I will ponder some more and try and articulate it better.

Re: Mathematical Principles of Reinforcement

#3  Postby Mr.Samsa » Mar 03, 2010 7:05 am

SteveBrewer wrote:Ok Samsa I can't just leave you hanging on this one so here goes. Keep in mind that I am still trying to wrap my head around the implications. That said, wouldn't the model break down the more complex the behavior becomes. I mean the more competing factors you must consider for more complicated behaviors wouldn't the formula lose cohesion due to the added variables. Or are we to reduce the competing factors as to make it applicable. If that is confusing then let me know and I will ponder some more and try and articulate it better.


Hey Steve, thanks for what will most likely be the only response :grin:

Well, Killeen's theory is more focused on the mechanism behind reinforcement, rather than describing and predicting behaviors directly. So currently we have a number of behavioral events (both simple and the more complicated) which can be explained by reinforcement theories but what Killeen is attempting to do is to understand how reinforcement actually works.

The explanations we have for complicated behaviors most likely wouldn't change if Killeen's theory was valid, it would just be extra information about the underlying mechanism. The concept of reinforcement is similar to Darwin's explanation of evolution (from Killeen's perspective) where he is correct enough to be able to explain and predict a number of phenomena with a basic framework, but he couldn't precisely describe the mechanism for how traits and characteristics are passed on - MPR in this analogy would be genetic theory where it describes the mechanism behind heritability.

I probably just made it more confusing and convoluted now, huh?..

Re: Mathematical Principles of Reinforcement

#4  Postby SteveBrewer » Mar 03, 2010 7:13 am

Mr.Samsa wrote:
Hey Steve, thanks for what will most likely be the only response :grin:

Well, Killeen's theory is more focused on the mechanism behind reinforcement, rather than describing and predicting behaviors directly. So currently we have a number of behavioral events (both simple and the more complicated) which can be explained by reinforcement theories but what Killeen is attempting to do is to understand how reinforcement actually works.

The explanations we have for complicated behaviors most likely wouldn't change if Killeen's theory was valid, it would just be extra information about the underlying mechanism. The concept of reinforcement is similar to Darwin's explanation of evolution (from Killeen's perspective) where he is correct enough to be able to explain and predict a number of phenomena with a basic framework, but he couldn't precisely describe the mechanism for how traits and characteristics are passed on - MPR in this analogy would be genetic theory where it describes the mechanism behind heritability.

I probably just made it more confusing and convoluted now, huh?..


Actually no, I think your analogy works. Considering what you are saying, I think perhaps my response was trying to expand a basic model to a more complex one without considering that it is just that, a basic model. I will have to ponder some more on the implications and come up with a mathematical stumper for you. Though given my admitted deficit in mathematical intricacies you may be waiting awhile. :grin:

On another note, I am working on the Prep thread, should be done in a few.

Re: Mathematical Principles of Reinforcement

#5  Postby Mr.Samsa » Mar 03, 2010 7:21 am

SteveBrewer wrote:
Actually no, I think your analogy works. Considering what you are saying, I think perhaps my response was trying to expand a basic model to a more complex one without considering that it is just that, a basic model. I will have to ponder some more on the implications and come up with a mathematical stumper for you. Though given my admitted deficit in mathematical intricacies you may be waiting awhile. :grin:


Just ask me what the forward slash (/) means in the equations and I'll be stumped :wink:

SteveBrewer wrote:On another note, I am working on the Prep thread, should be done in a few.


Excellent! :thumbup:

Re: Mathematical Principles of Reinforcement

#6  Postby SteveBrewer » Mar 03, 2010 7:28 am

Mr.Samsa wrote:
Just ask me what the forward slash (/) means in the equations and I'll be stumped :wink:


:lol: I was thinking of something a bit tougher than that, but at least now I have a starting point. :dance:

Re: Mathematical Principles of Reinforcement

#7  Postby SteveBrewer » Mar 03, 2010 8:15 am

New thread is up :cheers:

