In the previous posts I talked about Customer Segmentation and about Fuzzy Logic. Customer Segmentation is an easy way to group customers, where we assume that the customers in a group share roughly the same interests. With Fuzzy Logic we discovered that a cup of coffee can be somewhat hot and somewhat warm at the same time, and that “hot” in hot coffee doesn’t mean the same as in hot volcano. In this episode we will try to combine customer segmentation with fuzzy logic.

Let’s start with a demographic we all love and hate: age. Everyone has an age (although some stop counting at 29). It is quite difficult to segment young people into categories, but as a group, with some help from Wikipedia, we can come up with some ages and life phases.

Figure: Chart with fuzzy categories for ages

A baby is easy: it starts at 0 days. But when does a baby change into a toddler? Some might say at 6 months, but everyone agrees that a 1-year-old is a toddler. Now we do the same for the toddler: when does a toddler change into a child? After this exercise we do the same for child into teenager, and for teenager into young adult and adult. The result might be something like the graph above.
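The overlapping life phases can be sketched as a small Python function. This is only a minimal illustration: the breakpoints are taken from the example SQL at the end of this article, while the names `membership`, `SEGMENTS` and `segments_for` are made up for this sketch.

```python
import math

def membership(age, rise_start, rise_end, fall_start, fall_end):
    """Degree (0..1) to which `age` belongs to a segment.

    The value climbs from 0 to 1 between rise_start and rise_end, stays 1
    until fall_start, and drops back to 0 between fall_start and fall_end.
    """
    if age < rise_start or age > fall_end:
        return 0.0
    if age < rise_end:
        return (age - rise_start) / (rise_end - rise_start)
    if age <= fall_start:
        return 1.0
    return 1.0 - (age - fall_start) / (fall_end - fall_start)

# Breakpoints in years, copied from the example SQL in this article.
SEGMENTS = {
    "baby": (0, 0, 0.5, 1),
    "toddler": (0.5, 1, 3, 4),
    "child": (3, 4, 10, 12),
    "teenager": (10, 12, 16, 18),
    "adult": (16, 18, math.inf, math.inf),  # an adult never fades out
}

def segments_for(age):
    """Return the membership of `age` in every segment."""
    return {name: membership(age, *points) for name, points in SEGMENTS.items()}

# A 9-month-old is half baby, half toddler.
print(segments_for(0.75))
# {'baby': 0.5, 'toddler': 0.5, 'child': 0.0, 'teenager': 0.0, 'adult': 0.0}
```

Because every segment is described by the same four breakpoints, moving a boundary (say, the age at which a child starts becoming a teenager) only means changing one number in the table.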

Age is, in most systems, a calculated field. That means that the age itself isn’t stored but calculated from the birthdate and the current date. The fuzzy segments are based on this age value, and because of those calculations they are somewhat more CPU-intensive. The bigger cost is that a calculated field usually isn’t indexed, so selecting on it is slower, but for most companies this isn’t a big issue. At the end of this article I’ll include an example SQL for these calculations.
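To make the “calculated field” concrete, here is a minimal Python sketch that derives a fractional age from a birthdate. The function name `age_in_years` is an assumption for this example, and dividing by an average year length of 365.25 days is an approximation, not necessarily how your own system calculates age.

```python
from datetime import date

DAYS_PER_YEAR = 365.25  # average year length; an approximation

def age_in_years(birthdate, today=None):
    """Age as a fractional number of years, calculated at the moment of asking."""
    if today is None:
        today = date.today()
    return (today - birthdate).days / DAYS_PER_YEAR

# Someone born on 1 January 2000, evaluated on 1 January 2012.
print(age_in_years(date(2000, 1, 1), date(2012, 1, 1)))  # 12.0
```

The fractional part matters here: the fuzzy segments need to tell an 11.9-year-old from an 11.0-year-old, which an age in whole years cannot do.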

We can also build fuzzy segments for “Likes Computer Games” (based on the number of games bought), “Watches TV” (at least some hours each week) and “Parents Are Loyal Customers” (bought at least a certain number of items last year).
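A count-based segment like this can follow the same idea as the age segments: below a lower limit the score is 0, above an upper limit it is 1, and in between it rises gradually. The sketch below uses hypothetical limits (1 and 5 games) purely for illustration; the real numbers would have to come from your own data.

```python
def ramp(value, low, high):
    """0 at or below `low`, 1 at or above `high`, and linear in between."""
    if value <= low:
        return 0.0
    if value >= high:
        return 1.0
    return (value - low) / (high - low)

def likes_computer_games(games_bought):
    # Hypothetical limits, not from the article: 1 game or fewer scores 0,
    # 5 games or more scores 1.
    return ramp(games_bought, low=1, high=5)

print(likes_computer_games(3))  # 0.5
```

The other two segments could reuse `ramp` with hours of TV per week or items bought last year as the input.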

Now we can combine those categories and define a criterion such as “Teenager, Likes to play computer games, Parents are loyal customers” to decide which parents get marketing information about computer games one month before Christmas. Traditional segmentation can also use this definition, but is an 11.9-year-old boy who likes computer games interesting or not? When we, in a traditional segmentation, define Teenager as 12-18 years, we don’t include this boy, while he might be very interesting. The same goes for “Loyal Customer”.
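The difference for this boy can be shown side by side. This is a minimal sketch: the hard 12-18 limit comes from the text above, the fuzzy breakpoints (10, 12, 16, 18) come from the example SQL at the end of this article, and the function names are made up for this example.

```python
def crisp_teenager(age):
    """Traditional segment: you are in or you are out."""
    return 12 <= age <= 18

def fuzzy_teenager(age):
    """Fuzzy segment: fades in between 10 and 12, fades out between 16 and 18."""
    if age < 10 or age > 18:
        return 0.0
    if age < 12:
        return (age - 10) / 2
    if age <= 16:
        return 1.0
    return 1.0 - (age - 16) / 2

age = 11.9
print(crisp_teenager(age))            # False
print(round(fuzzy_teenager(age), 2))  # 0.95
```

The traditional segment drops the boy completely, while the fuzzy segment still counts him as almost entirely a teenager.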

But how do we run a calculation on these values?
Fuzzy logic has more than one way to combine values, but we’ll use just one here: an AND can be formulated as MIN(x,y,z), so the combined score is the lowest of the individual scores.

Suppose we have a boy who is 0.95 teenager, 1.00 interested in games and whose parents are 0.90 loyal customers. He then has a total score of 0.90 for Teenager AND Likes Games AND Parents are Loyal Customers. We can then define a limit of 0.5 (in this case based on some experience of the marketing department), and this boy is included in the marketing while he might have been excluded in a normal segmentation. From the total score alone we can’t tell which of the parameters is somewhat lower than 1; we only know that the total score is higher than 0.5.
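The calculation for this boy fits in a few lines of Python. It is only a sketch of the MIN-based AND with the numbers from the example; the names `fuzzy_and` and `LIMIT` are chosen for this illustration.

```python
def fuzzy_and(*scores):
    """Fuzzy AND as the minimum of the individual scores."""
    return min(scores)

LIMIT = 0.5  # the cut-off chosen by the marketing department in the example

teenager = 0.95
likes_games = 1.00
parents_loyal = 0.90

score = fuzzy_and(teenager, likes_games, parents_loyal)
print(score)          # 0.9
print(score > LIMIT)  # True
```

Note that the minimum only reports the weakest of the three scores: a boy with scores 0.90, 0.90 and 0.90 ends up with the same total.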

Example SQL:

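-- age is assumed to be a decimal number of years (e.g. 11.9)
select  firstname,
        -- baby: fully a baby up to 6 months, fading out until 1 year
        case
            when age < 0.5 then 1
            when age between 0.5 and 1 then 1-((age-0.5)/(1-0.5))
            else 0
        end as segment_baby,
        -- toddler: fades in from 6 months to 1 year, fades out from 3 to 4
        case
            when age between 0.5 and 1 then (age-0.5)/(1-0.5)
            when age between 1 and 3 then 1
            when age between 3 and 4 then 1-((age-3)/(4-3))
            else 0
        end as segment_toddler,
        -- child: fades in from 3 to 4, fades out from 10 to 12
        case
            when age between 3 and 4 then (age-3)/(4-3)
            when age between 4 and 10 then 1
            when age between 10 and 12 then 1-((age-10)/(12-10))
            else 0
        end as segment_child,
        -- teenager: fades in from 10 to 12, fades out from 16 to 18
        case
            when age between 10 and 12 then (age-10)/(12-10)
            when age between 12 and 16 then 1
            when age between 16 and 18 then 1-((age-16)/(18-16))
            else 0
        end as segment_teenager,
        -- adult: fades in from 16 to 18, fully adult after that
        case
            when age between 16 and 18 then (age-16)/(18-16)
            when age > 18 then 1
            else 0
        end as segment_adult
from    persons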