While our media landscape has changed dramatically over the years as new entrants continually introduce novel ways to deliver information, our "core media values" have not changed at all. These core values derive from the fundamental human need to communicate:
  • Collaboration - from whiteboards to MediaWiki
  • Discussion - from town halls to GoogleTalk
  • Publishing - from printing presses to WordPress
  • Sharing - from fireside storytelling to Facebook
  • Broadcasting - from public radio to Twitter


As the above list suggests, the tools and methods we use to communicate and incorporate data into our day-to-day discussions have most definitely changed, even if the core media values have not:
  • 1970s - ~10M "experts" using computers; enterprise use primarily consists of digitizing "backend" tasks such as accounting, payroll, etc.
  • 1980s - ~100M people using computers; enterprise begins to incorporate digital data into more "frontend" tasks such as airline customer service
  • 1990s - ~1B people using computers; customers can now interact directly with firms
  • 2000s - ~1B people using computers, ~100M people producing content and collaborating through computers
  • 2010s - new emphasis on content discovery, social networking, and mobile data

This rapid alteration of the landscape is astonishing, and evidence of it appears in numerous places. Between 2005 and 2010, the Huffington Post, a blog-based news provider, became the second most popular online news provider in terms of monthly unique visitors, behind only the more traditionally distributed New York Times [citation needed]. One can also point to the advent of the blogosphere, which surveys indicate is fueled by individuals' desires to express themselves and to create permanent personal records [EMarketer], both of which have never been easier to do. In the enterprise, corporations have quickly come to realize that their customers can in fact add value to products and services through crowdsourcing and open platforms. A great example is the Google Maps API, which was made open after a "hacker" assisted in improving some of the service's internal code [citation needed].

All of this revolves around a central new paradigm: products and services can continually improve themselves through the use of social media. Rather than depreciating in value over time like an automobile or personal computer, countless products and services get better the more customers use them. The notion that every "poke" or interaction by a customer creates usable data has created, and will continue to create, new markets consisting of these "evolvable" products and services.

This of course raises questions: Where does it stop? Do we continue to digitize every human interaction possible? What do we lose over time as the media landscape continues to evolve? How social is considered TOO social?


What is the value of the data? A general rule of thumb is that the value of your data should be proportional to the significance of the decision your buyer has to make. The "significance" of a decision is dictated by things like potential increase in sales, amount of mitigated risk, time savings, and the like.


  • The class tended to do more social graph analysis than content analysis (attributed to the difficulty of extracting data from content)
  • Hashtag analysis didn't seem to improve results very much (attributed to sparsity of hashtag use)
  • The difference between newer users and established users became quite clear in the assignment, emphasizing that one should cater his or her recommendations to the appropriate target audience


It is useful to orient thoughts regarding technological innovation around three pillars:
- Data: Is data collection manual or automated? What sources are trustworthy/useful? Is there data that you're neglecting to collect?
- Methodology: What algorithms are you using? What kind of time/memory requirements are your methodologies imposing? Is your methodology self-sustainable? How are you removing noise from your data? How are you dealing with uncertainty?
- Domain expertise: How well do you know your target audience? What do your customers want, and what can you give? What don't your customers understand? What role does pricing play? What cultural factors need to be accounted for? What are customers STATING, and what are they REVEALING?



The concept of A/B testing boils down to a simple controlled experiment, and it has been in practice for centuries. A simple example from the 1700s was determining a remedy for scurvy on naval vessels. Half of the affected sailors were provided limes in their diet, while the other half were not. When the lime-fed sailors showed improvements, it became clear that limes had a positive effect on scurvy sufferers.

A/B testing generally occurs in three phases:
- Test definition
- Metric definition
- Experimentation and iteration


When defining an A/B test, the tester is usually looking for possible improvements in a product or service. By carefully selecting a pair of possibilities that compete with each other, the tester can present these to test subjects to determine which is preferable. Usually, the test subject is unaware that he or she is being tested, so as to remove possible biases. A web-centric example would be the decision of placing a navigation bar on the left- or right-hand side of a page.
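One common way to split subjects between the two possibilities, while keeping them unaware of the test, is deterministic hash-based bucketing. This is a standard industry technique rather than one the notes prescribe; the experiment name and user ids below are made up for illustration. A minimal sketch in Python:

```python
import hashlib

def assign_variant(user_id: str, experiment: str = "navbar-position") -> str:
    """Deterministically bucket a user into variant A or B.

    Hashing the user id together with the experiment name gives a
    stable, roughly 50/50 split without storing any assignment state,
    so a returning visitor always sees the same version of the page.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return "A" if int(digest, 16) % 2 == 0 else "B"

# The same user always lands in the same bucket:
assert assign_variant("user-42") == assign_variant("user-42")
```

Because the assignment is a pure function of the user id, no cookie or database lookup is needed to keep the experience consistent across visits.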


After defining the test, the metrics gathered during testing must be thoughtfully chosen. Testers devote a significant amount of time to assessing the reliability of various data sources and which correlations are most relevant to the test at hand. For instance, a tester could define metrics obtained through the browser a test subject uses (e.g., browsing history), or metrics could alternatively be obtained through the subject's direct interactions with a web page (e.g., clicks). The tester must also be observant of "just noticeable differences" (JNDs) that may be lost in noisy data, and select metrics that can strongly isolate trends over time.
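One standard way to check whether an observed difference in a click-based metric stands out from noise is a two-proportion z-test. This is a generic statistical tool, not a method the notes themselves specify, and the click counts below are invented. A minimal sketch:

```python
from math import sqrt

def two_proportion_z(clicks_a: int, n_a: int, clicks_b: int, n_b: int) -> float:
    """z-statistic for the difference between two click-through rates.

    Roughly, |z| > 1.96 means the gap between the two rates is unlikely
    (at the 5% level) to be noise alone.
    """
    p_a, p_b = clicks_a / n_a, clicks_b / n_b
    p = (clicks_a + clicks_b) / (n_a + n_b)       # pooled rate
    se = sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))  # standard error of the gap
    return (p_a - p_b) / se

# Hypothetical experiment: 120/1000 clicks on A vs. 150/1000 on B.
z = two_proportion_z(120, 1000, 150, 1000)
```

A difference that fails this kind of check is a candidate "JND lost in the noise": more samples, or a less volatile metric, would be needed before drawing a conclusion.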


After assessing metrics, one is then faced with a decision of how to incorporate this feedback. A general rule of thumb is to incorporate only "relevant" results; if metrics indicate that incorporating A or B had less than a 1% impact on the results, then the test should not suggest any corrective action. By rapidly iterating over this process and continually selecting pairs of features to exercise, products and services often benefit from drastic improvements without having to directly query their users.
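The "relevant results only" rule of thumb above can be sketched directly. The 1% threshold comes from the notes; the function name and the decision labels are illustrative choices, not an established API:

```python
def corrective_action(rate_a: float, rate_b: float, threshold: float = 0.01) -> str:
    """Apply the rule of thumb: act only on 'relevant' differences.

    If the relative impact of choosing B over A is under the threshold
    (1% by default), the test suggests no corrective action.
    """
    lift = (rate_b - rate_a) / rate_a  # relative change vs. the A baseline
    if abs(lift) < threshold:
        return "no action"
    return "adopt B" if lift > 0 else "keep A"

assert corrective_action(0.100, 0.1005) == "no action"  # 0.5% lift: below threshold
assert corrective_action(0.10, 0.12) == "adopt B"       # 20% lift: act on it
```

Each iteration of the A/B loop would end with a decision like this, feeding the winning variant back in as the new baseline for the next pair of features.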

Nick Hwang
Wan Jing Loh