Andreas Weigend
Social Data Revolution
MS&E 237, Stanford University, Spring 2010

Class 6: People Discovery

Class Date: April 15, 2010
Audio | Transcript
Powerpoint: NA
Paper: NA

What works, what doesn't
Metrics of relevance
Tradeoffs: explicit
Trust: why should ppl trust FB?

4:15 Intro / Summary last class

4:20 User perspective: Discovering ppl, why? Why on FB vs twitter? How is this working for you? POST: Vs MrTweet

4:30 Best of Lars
  • what are the simple insights / one non-super tech paper?
May 2007
  • Able to deduce links in anonymous social graphs via key injection
  • Therefore, no desire to release the social graph and have the company appear in the NYT the next day as a result.

4:45 PYMK

P: The problem from FB's perspective (engagement increase, more relevant content, more ads :)
H: based on prior work on network evolution
  • most links close triangles (A and E know each other, E knows L, A now links with L)
  • what we should do is suggest friends of friends
  • how to pick / data mining / machine learning
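
The triangle-closing hypothesis above can be sketched as candidate generation over friends of friends. This is a minimal illustration, not the system discussed in class: the function name `suggest_friends`, the toy graph, and ranking by mutual-friend count are all assumptions made for the example.

```python
from collections import Counter


def suggest_friends(graph, user, top_k=3):
    """Rank friends-of-friends of `user` by number of mutual friends.

    `graph` maps each person to the set of people they are linked to.
    Ranking by mutual-friend count is an assumed, simple relevance score.
    """
    friends = graph.get(user, set())
    mutual_counts = Counter()
    for friend in friends:
        for candidate in graph.get(friend, set()):
            # Skip the user and anyone already linked to the user.
            if candidate != user and candidate not in friends:
                mutual_counts[candidate] += 1
    # Sort by count (descending), then by name so the output is deterministic.
    ranked = sorted(mutual_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


# Toy graph (hypothetical): A knows E and B; both E and B know L.
graph = {
    "A": {"E", "B"},
    "E": {"A", "L", "B"},
    "B": {"A", "E", "L", "M"},
    "L": {"E", "B"},
    "M": {"B"},
}
suggestions = suggest_friends(graph, "A")
print(suggestions)
```

Here L would close two triangles with A (through E and through B), so L ranks ahead of M, who shares only one friend with A.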
A: try different algos for relevance
  • features
  • training strategies - identifying the target goal is important to nail the training strategies
  • bagged decision trees - a simple machine learning algo for training the system
  • performance - real time algo vs offline algo (mix them to get best results)
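
The bagging idea above can be sketched in plain Python. To keep it short, this uses depth-1 trees (decision stumps) as a stand-in for full decision trees. The two features (mutual friends, shared networks), the labels, and all function names are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def train_stump(rows, labels):
    """Pick the (feature, threshold, polarity) split with fewest training errors."""
    best = None
    for f in range(len(rows[0])):
        for threshold in sorted({r[f] for r in rows}):
            for polarity in (1, 0):
                # Predict `polarity` when feature >= threshold, else the other class.
                errors = sum(
                    (polarity if r[f] >= threshold else 1 - polarity) != y
                    for r, y in zip(rows, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, threshold, polarity)
    return best[1:]


def predict_stump(stump, row):
    f, threshold, polarity = stump
    return polarity if row[f] >= threshold else 1 - polarity


def train_bagged(rows, labels, n_models=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(rows)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(train_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return models


def predict_bagged(models, row):
    """Majority vote over all stumps."""
    votes = Counter(predict_stump(m, row) for m in models)
    return votes.most_common(1)[0][0]


# Hypothetical training data: [mutual_friends, shared_networks] -> 1 if the
# suggestion was accepted, 0 otherwise.
rows = [[0, 0], [1, 0], [2, 1], [1, 1], [5, 1], [7, 0], [8, 2], [10, 1]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
models = train_bagged(rows, labels)
print(predict_bagged(models, [12, 2]), predict_bagged(models, [0, 0]))
```

Choosing the labels is where the target goal enters: training on "suggestion accepted" optimizes for a different outcome than training on, say, "friendship still exists later".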

5:00 ASW
  • 5' exercise on metrics
  • 5' collection
  • CTR on friend
  • total nr of clicks, impression
  • conversion (ie add)
  • unfriending (returns)
  • stockpiling
  • Value of friendships: how to measure (1:1 messages, invites to events, subsequent network growth), incremental increase
  • Social capital delta: for BOTH sides
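
A few of the metrics collected above can be written down as simple ratios. This is a rough sketch with made-up counts. The exact definitions are assumptions: the notes do not say whether conversion is measured per impression or per click, and here it is taken per impression.

```python
def pymk_metrics(impressions, clicks, adds, unfriends):
    """Compute basic suggestion metrics from raw counts.

    Assumed definitions (not from the class notes):
      ctr           = clicks / impressions
      conversion    = adds / impressions
      unfriend_rate = unfriends / adds  (the "returns" of friending)
      net_adds      = adds - unfriends
    """
    ctr = clicks / impressions if impressions else 0.0
    conversion = adds / impressions if impressions else 0.0
    unfriend_rate = unfriends / adds if adds else 0.0
    return {
        "ctr": ctr,
        "conversion": conversion,
        "unfriend_rate": unfriend_rate,
        "net_adds": adds - unfriends,
    }


# Hypothetical counts for one day of suggestions.
metrics = pymk_metrics(impressions=1000, clicks=50, adds=20, unfriends=2)
print(metrics)
```

Ratios like these do not capture stockpiling or the value of a friendship. Those would need follow-up signals such as 1:1 messages or event invites after the add.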
5:15 Lars:
  • 2 ML systems on different time scales, one on top of the other
Graphs, showing improvements

5:20 Questions?
5:25 "Take aways"
  • Goals matter (Optimization problem)
  • Ecosystem matters (Cost of irrelevance, Metrics matter)
  • Time scales of computation

5:30 End
Bozhi See
Tal Rusak