Friday, February 13, 2009

Online dating: The technology behind the attraction Page 2

The "scientific" matching services, such as eHarmony (which costs $59.95 for one month, $119.85 for three or $179.70 for six), PerfectMatch and Chemistry.com, attempt to identify the most compatible matches for the user by asking anywhere from a few dozen to several hundred questions. The services then assemble a personality profile and run it through an algorithm that ranks users within a set of predefined categories; from there, the system produces a list of appropriate matches.

Some sites take a hybrid approach. PerfectMatch.com, for example, issues recommended picks but also lets customers browse the "inventory" for themselves.

The technology that powers these dating sites ranges from incredibly simple to incredibly complicated. Unsurprisingly, eHarmony has one of the most sophisticated data centers. Joseph Essas, vice president of technology, says the company stores 4 terabytes of data on some 20 million registered users, each of whom has filled out a 400-question psychological profile (eHarmony's founder is a clinical psychologist).

The company uses proprietary algorithms to score that data against 29 "dimensions of compatibility" -- such as values, personality styles, attitudes and interests -- and match up customers with the best possible prospects for a long-term relationship.
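eHarmony's algorithms are proprietary, so the following is only a minimal sketch of what scoring profiles across dimensions could look like. The dimension list, the 0.0-to-1.0 scale, the equal weighting and the function names are all assumptions made for illustration, not details from the company.

```python
from typing import Dict, List

# Hypothetical dimension names; the article cites values, personality
# styles, attitudes and interests as examples of the 29 dimensions.
DIMENSIONS = ["values", "personality", "attitudes", "interests"]


def compatibility(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Average per-dimension similarity for two profiles scored 0.0-1.0.

    Assumption: similarity on a dimension is 1 minus the absolute gap,
    and every dimension carries equal weight.
    """
    total = 0.0
    for dim in DIMENSIONS:
        total += 1.0 - abs(a[dim] - b[dim])
    return total / len(DIMENSIONS)


def rank_prospects(user: Dict[str, float],
                   candidates: Dict[str, Dict[str, float]]) -> List[str]:
    """Return candidate names sorted from most to least compatible."""
    scored = [(compatibility(user, profile), name)
              for name, profile in candidates.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored]
```

A real service would presumably weight dimensions differently and treat some traits as complementary rather than rewarding sameness; this sketch ignores both.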

A giant Oracle 10g database spits out a few preliminary candidates immediately after a user signs up, to prime the pump, but the real matching work happens later, after eHarmony's system scores and matches up answers to hundreds of questions from thousands of users. The process requires just under 1 billion calculations, which run in a giant batch operation each day. These MapReduce operations execute in parallel on hundreds of computers and are orchestrated by software written for the open-source Hadoop platform.
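To give a feel for the MapReduce pattern described here, the sketch below runs a map step and a reduce step in a single Python process. It does not use Hadoop, and the scoring function, profile format and function names are hypothetical stand-ins rather than anything eHarmony has disclosed.

```python
from collections import defaultdict
from heapq import nlargest
from itertools import combinations


def score(a, b):
    """Placeholder similarity: 1.0 minus the mean absolute answer gap."""
    gaps = [abs(x - y) for x, y in zip(a, b)]
    return 1.0 - sum(gaps) / len(gaps)


def map_pairs(profiles):
    """Map step: emit (user, (score, candidate)) records for every pair."""
    for (u, a), (v, b) in combinations(sorted(profiles.items()), 2):
        s = score(a, b)
        yield u, (s, v)
        yield v, (s, u)


def reduce_top_k(records, k=2):
    """Reduce step: group records by user and keep the k best candidates."""
    grouped = defaultdict(list)
    for user, scored in records:
        grouped[user].append(scored)
    return {user: [name for _, name in nlargest(k, items)]
            for user, items in grouped.items()}


# Example run on three made-up profiles (lists of numeric answers).
profiles = {"ann": [0.1, 0.1], "bob": [0.2, 0.1], "cat": [0.9, 0.9]}
matches = reduce_top_k(map_pairs(profiles), k=1)
```

On a cluster, the map records would be partitioned by user across many machines so that each reducer handles only its own share, which is what lets a batch of this size finish overnight.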

Once matches are sent to users, the users' actions and outcomes are fed back into the model for the next day's calculations. For example, if a customer clicked on many matches that were at the outer edge of his or her geographical range -- say, 25 miles away -- the system would assume distance wasn't a deal-breaker and next offer more matches that were just a bit farther away.
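The distance example can be illustrated with a simple rule. This is a guess at the general idea, not eHarmony's model: the 80% edge cutoff, the 50% click threshold and the 5-mile step are invented parameters.

```python
def next_radius(radius_miles, clicked_distances, edge_fraction=0.8,
                threshold=0.5, step_miles=5.0):
    """Widen the search radius when most clicks land near its outer edge.

    All defaults are hypothetical: a click counts as "near the edge" if it
    is at least edge_fraction of the current radius, and the radius grows
    by step_miles when at least threshold of the clicks are near the edge.
    """
    if not clicked_distances:
        return radius_miles  # no feedback yet, keep the current range
    edge = radius_miles * edge_fraction
    near_edge = sum(1 for d in clicked_distances if d >= edge)
    if near_edge / len(clicked_distances) >= threshold:
        return radius_miles + step_miles
    return radius_miles
```

With a 25-mile range, a user who clicks on matches 24, 23 and 5 miles away would be offered a 30-mile range the next day, while one who clicks mostly on nearby matches would stay at 25.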

"Our biggest challenge is the amount of data that we have to constantly score, move, apply and serve to people, and that is fluid," Essas says. To that end, the architecture is designed to scale quickly to meet growth and demand peaks around major holidays. The highest demand comes just before Valentine's Day. "Our demand doubles, if not quadruples," Essas says.
Online dating site visitors
Snapshot: November 2008

* Total number of visitors to online dating sites: 22,274,000
* Male users: 52.4%
* Female users: 47.6%

Source: comScore Media Metrix

PerfectMatch.com, which claims to have 5 million members, also uses a matching algorithm, but its psychological test is shorter than the one eHarmony requires. "We wanted to take the basic concept of the Myers-Briggs indicator and apply that to relationships," says founder and CEO Duane Dahl. The core architecture of the system consists of five front-end Web servers and a large, back-end SQL Server database, plus a variety of servers that handle messaging, marketing and other functions. The matching process is immediate.

True.com also offers "scientific compatibility" matching based on how users answer about 200 questions. The site uses about 200 servers, including a 64-bit, 32-processor Unisys server running Microsoft SQL Server. The matching algorithm's calculations are performed on an array of 64-bit servers that hold a compressed version of the entire multi-terabyte database in memory to facilitate fast matching. "The system can shoot back [matches] with little or no delay," says CEO Vest.

On the other end of the spectrum, Plentyoffish.com's philosophy is to keep it simple. The service focuses on searching and filters: It uses a short questionnaire, and while it does offer some matching capabilities if users want them, CEO Markus Frind says he doesn't promote them -- and he is disdainful of the complex matching algorithms offered by some competitors.

The business operates on just three Web servers, five messaging servers and five database servers (the entire database is just 200GB in size), yet it serves up 200 billion pages a month to some 12 million users. "My entire cost is only a few hundred thousand dollars a year," says Frind. The biggest piece isn't the technology, he says, but the bandwidth required to keep traffic to the site flowing smoothly.

Step 2: From "just looking" to "paying customer"

When it comes to converting users to paid subscribers, the battle is all uphill in an industry in which more than 90% of users never pay a dime. That's where having extensive demographic and psychological data on customers comes in handy.

In fact, online dating sites are so adept at using personal data that potential customers can be forgiven for wondering just who is being "matched up" -- two strangers bent on true love, or lonely customers and the matchmaking site that needs them. (See "Online dating: Your profile's long, scary shelf life" for details on the ways dating sites mine the data they collect.)

Yahoo Personals uses all of the information at its disposal to tailor its sales pitch to the user. "We try to take advantage of what we know about the user and where they are in their level of engagement with the product," says Ellen Perelman, general manager.

Once users sign up for a free account and fill out a short questionnaire, Yahoo uses targeted messaging to push them through a "conversion tunnel." The messages that users see to persuade them to sign on as paying customers vary depending on the user's profile and his or her behavior on the site.
