Stereotypes (User categories)

7 minute read

Updated:

  1. Concept of a ‘stereotype’
  2. Stereotype: Structure
  3. Limitations of stereotyping
  4. Combining stereotypes
  5. Building stereotypes
  6. Resolving stereotype contradictions

The cold start problem

  • Cold start - If you have insufficient information about the user and cannot initiate the adaptation
    • classic issue in user modelling and adaptation
  • E,g, of insufficient info
    • Amazon – not enough info so it recommends you popular ones
    • When the information about the user is noisy and have to exclude info resulting in limited information diluted information
  • What do you do then?
    • Can narrow down to something that is most related; but how?
    • Can open up and have dialogue with them
    • To make them explicitly rate the item
    • Put user into broader category and assume what they’d like
  • Travel agent
    • Since you are a student and ask if you are venturer, then recommends you what other people in the same category likes

Concept of a ‘stereotype’

  • Was introduced in 1979 with what it is and how it is represented
  • Frame based representation – fundamental underneath
  • Quite intuitive – e.g. someone is a seasoned traveller, teenager, venturer - we categorize and recommend
  • Why?
    • Research challenge coz it hides individual traits and put it into category
    • Getting popular now coz we have better representation of the category – way to handle large amount of users
    • We don’t have time to learn about each individual
  • How to handle the stereotype
    1. How to represent
    2. How are we gonna build it – info about the user, how to assign a category
    3. How to maintain the category

Stereotype: definition

  • Stereotype is a knowledge structure that represents frequently occurring characteristics of users
    • The large amt of data about different users and past users, will allow us to mine that data and identify the frequency occurring characteristics
    • There is rating of possibilities
  • We can infer or assign plausible inferences – cannot be sure for 100% but since they are common category, you are making plausible inference
  • We have small number of observations about the user – we know little about the user
  • E.g. about her lecture, she knows where people struggle and try to adapt to it

Scenario: Travel information system

  • We have a simple explicit user model
  • User registers and tells very little about themselves about their demographic information
    • We are not asking about interest, destination, kind of hotels etc
    • small number of observations
  • Then, we will be looking at categories
    • Married with kids
    • Professional
    • Students
  • We will assume that our system has a way to adapt to each category – and how from this register information how we gonna assign and give users recommendation

Stereotype: Structure

  • Body, aka the frame
    • Information that is typically true for users to whom the stereotype applies
      • Facets - characteristics
      • Values – to quantify / qualify the facets
      • Ratings – degree of certainty for facet-value pairs
  • Trigger
    • Logical condition that has to be satisfied for the user to be in that category
      • E.g Age of >20 <60
    • Occurrence of events which signal appropriateness of particular stereotypes – i.e. can associate a user with a stereotype
    • Can add a probability value against it – if the user satisfies the trigger, what’s the prob of user entering the stereotype
    • How the probability is defined:
      • Either ask an expert
      • Look at the dataset and calculate the frequency
  • Relations
    • Between the stereotype and other stereotypes in the system (for more complex models)

Stereotype: Professional

  • Screenshot 2020-10-26 at 3 47 14 pm
  • It is a frame-based representation
  • The entire thing is the body – facets, values, ratings
    1. Facets corresponds to items you are recommending
    2. Each facet has values – can have one or more values that are relevant
      • Facet ‘entertainment’ will have lot of values – but you ONLY put values that are relevant your category
    3. Third stereotype body is rating
      • Rating comes from frequency of the past users OR
      • If we talk about travel assistance, we can ask them
  • Trigger
    • Logical condition that has to be satisfied for the user to be in that category
    • Little about the user – demographics, and you want to pull out a lot to suggest plausible inferences
    • (age>20) and age <60, and job in ListProgessionalJobs)
    • Not a cold start anymore
    • P = 0.2 - probability value against it – if the user satisfies the trigger, what’s the prob of user entering the stereotype

Limitations of stereotyping

  • May fall in more than one stereotype and can contradict each other
  • Bc by putting it in the category, **individual users might not fit into that category – fundamental challenge
  • Having predefined categories; user may not fit in any of it – excludes subset of the user

Combining stereotypes inferring a user model

  • profile -> stereotypes -> inferred user model
  • In our system about traveller system we have very little information about the user -> we decide which trigger -> decide stereotype -> from several stereo we combine and come up with an inferred user model
  • Assume we have a specific user: Charlie, 27, single, IT consultant.
    • -> He satisfied stereotype of: ‘professional’ and ‘male’
  • Screenshot 2020-10-26 at 3 48 09 pm
    • Charlie falls into two categories
  • If the stereotype falls into more than one category:
    • Screenshot 2020-10-26 at 3 49 49 pm
    • we combine the two using the ‘union rule’

Probability when in one stereotype

  • Pstereotype(feature=value) = P (feature = value|stereotype) * P(stereotype)

Combining stereotypes

  • Use probability of union rule !!
    • P (A∪B) = P(A) + p(B) - p(A*B) = union
  • E.g. How likely does Charlie like to travel long distance?- distance ‘long’
    • Stereotype A = Man, B = Professional
    • P (distance = ‘long’) = P(distance=‘long’ man) +
      P(distance=‘long’ Professional) – P( man*prof)
    • = (0.5*0.8) + (0.8*0.7) – (0.4*0.56) = 0.4+0.56 – 0.24 = 0.72

Building stereotypes

  • But then where does this stereotype come from???????
  • What is the stereotype? The trigger w probability and the body. Initially this is how stereo was built.
  1. Asking experts
    • Asking humans who have experienced these users – who have experience with the users
      • e.g. travel agents, teachers
    • Ask several diff experts
      • - But the challenge here is they may come w diff facets, value and ratings
    • If they are suggesting diff facets, need to decide on which based on your
      • Majority voting
      • Priorities – more experienced experts rated higher
    • Seek consensus if there are several diff experts
  2. Obtaining stereotypes form media
    • Use the data about the user that is available to build stereotypes
    • Decide the attributes to use – what can you measure about the user.
    • Identifies meaningful clusters based on the selected attributes
      • What stereotypes look like, is a cluster – a group of users form the data that exhibits similar behaviour - method used here is clustering
    • Analyse the cluster – most crucial!
      • Within the cluster what are the characteristics of the people that fall into that cluster, that identifies the cluster? What may be the interesting things about the user?
      • Which of those variables differ across different clusters?
      • Then based on that define the characteristics of the cluster, find facets and values
    • Form the stereotypes need to bring in the human
      • To decide if it is meaningful or not cluster
      • To name the cluster based on the characteristics

Resolving stereotype contradictions

  • image
  • Can have general vs specific
    • E.g. Undergrad student and student
    • Specific can inherit from generic
  • BUT what if there’s a contradiction?
    • If its specific, generic you take the specific. This provides more insight for that group
    • Rule of thumb: decide priority. Then exclude other
    • Make the user choose which category they consider themselves as
    • Why not take the union?
      • By default you can, but you won’t please either of them. Not relevant to nowhere

Stereotype tuning

  • We tune rating and the values
  • IF
    • If evidence shows that the predictions based on stereotypes were correct –> should be preserved for the future
    • NOT correct should be amended in the future
  • You are preserving what’s happening, and you are constantly recommending something
  • Need to think of what you are tuning
    • Have to tune trigger
    • Rating
    • Facet
  • How? Done by asking human AND data as you learn from historic data, so you can look at new data

Summary of stereotypes

Advantages Disadvantages
  • Quick and fairly straightforward method for making assumptions based on limited observations

  • Applicable and used in a variety of domains

  • Quick and easy to implement

  • Allows dealing with cold start – broad user profile; direct connection with classification algorithms

  • Reliability – facet values and ratings are critical and are subjectively assigned tuning algorithms are crucial

  • Individual differences not properly catered for

  • Ad-hoc approach to resolving contradiction problems

  • Can lead to bias, if obtained from under-represented data

Leave a comment