How data and technology will democratize big internet’s secret weapon


By Jason Tan

The patent for Amazon’s “Buy now with 1-click®” technology expires in September of this year. Launched in 1997, 1-click set a new standard for online retail, generating insane growth and creating incredible customer loyalty. The expiration of the technology patent, which up until now Amazon had licensed only to Apple, is sure to set off a rush for big e-commerce and payments companies to create their own one-click payments experiences.

But it’s not just big companies that can capitalize on this opportunity. To me, Amazon one-click checkout has served as an inspiration for what trust can enable. But what powers that trust? Data. With the explosion of data, and rapid advances in technology, we’re reaching a point where any company can take down the barriers to frictionless payments. Over the next several years, we’ll see the democratization of the one-click buying experience, based on automated ways to establish trust.

The invisible fabric

What is trust? It’s the invisible fabric that enables people to do things for each other, and together with each other. When you have a friend stay over, you trust that they’re not going to damage your place, steal something, or hurt you. When you get into a taxi, Uber or Lyft, you trust that the driver is going to safely take you where you want to go. When you hop on a plane, you trust that the mechanics have done their job to check every potential system to prevent failure. Without trust, nothing happens.

Online, trust lets you open the doors to most of the user base and close them only to the people who are risky. You want to let known good actors through with a minimum of friction, just like the way you’d treat a VIP at a nightclub. Where there’s less trust, there’s more verification, and where there’s no trust, outright rejection.
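As a rough illustration of that three-tier idea, here is a minimal sketch that maps a risk score to an action. The function name, the score range and both thresholds are assumptions made up for this example, not values from any real system.

```python
def decide(risk_score: float, verify_at: float = 0.3, reject_at: float = 0.9) -> str:
    """Map a fraud-risk score in [0, 1] to a checkout action.

    The thresholds are hypothetical; a real business would tune them
    against its own fraud losses and customer friction.
    """
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError("risk_score must be between 0 and 1")
    if risk_score >= reject_at:
        return "reject"   # no trust: turn the order away
    if risk_score >= verify_at:
        return "verify"   # less trust: ask for extra verification
    return "allow"        # trusted: let the customer straight through


if __name__ == "__main__":
    for score in (0.05, 0.5, 0.95):
        print(score, decide(score))
```

Most customers should land in the first tier, which is the whole point: friction is reserved for the few who look risky.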

On most of the internet today, it’s closer to the opposite. We start with a blanket layer of mistrust. Payments are rife with fraud, so everyone has to type in their full credit card details. They have to have their phone handy to do two-factor authentication to get into their bank account. They have to remember their password and PIN to talk on the phone to a customer support person about their cell phone account.

This is all very inefficient. It’s an airport-security-style experience, where you’re considered risky until proven otherwise, and it creates a huge inconvenience for everyone. Really, all but a few of us deserve a TSA-free experience.

Neighborhood Watch for the internet

If you have enough data and history with a customer, you can make a much more intelligent assessment of their level of risk. Amazon, with its huge store of transactional data and vast engineering resources, was able to apply technology to figure out the right level of verification to protect itself and its merchants against bad actors without inconveniencing the good ones.

Their data will remain proprietary, though their technology may no longer be. But with open networks that collect and share data in a many-to-many way, we can effectively build a kind of “Neighborhood Watch” for the internet that multiple companies can use. As we add more and more data to the network, and apply machine learning and deep learning to that data, we get smarter together.

This is a scalable way to create trust. What we’ve had up until recently are primarily rules-based systems that are reactive and require a lot of human intervention to maintain. For example, it’s known that the more numbers you have in your email address, the more likely you are to be a fraudster, because fraudsters create new email addresses programmatically.

However, my personal email has three numbers because I used my birthday, so in a rules-based system I might be considered risky just based on that. But there are far more non-fraudulent people who have numbers in their email address than fraudsters, so to avoid blocking all of them, you need to create another rule.

You quickly go down this rabbit hole where you’re constantly playing whack-a-mole and creating a patchwork of all these different rules. Humans have to be involved, because rules-based systems are just not that accurate. They tend to look at the world in black and white, and that’s not how the world is. The world is probabilistic. Black swans happen.
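To make the whack-a-mole problem concrete, here is a minimal sketch of a digit-counting rule and the patch that follows it. The function names, the two-digit limit, the order-history exception and the example.com addresses are all hypothetical, chosen only to illustrate the pattern.

```python
def digit_rule_flags(email: str, max_digits: int = 2) -> bool:
    """Rule 1 (hypothetical): flag emails with many digits before the '@'."""
    local_part = email.split("@", 1)[0]
    digits = sum(ch.isdigit() for ch in local_part)
    return digits > max_digits


def patched_rule(email: str, prior_good_orders: int) -> bool:
    """Rule 2 (hypothetical): bolted on to rescue customers rule 1 wrongly blocks."""
    if prior_good_orders >= 3:
        return False  # assumed exception for customers with a clean history
    return digit_rule_flags(email)


if __name__ == "__main__":
    # A legitimate customer who put a birthday in the address trips rule 1.
    print(digit_rule_flags("alex512@example.com"))   # flagged
    # The patch clears returning customers but still flags first-time buyers.
    print(patched_rule("alex512@example.com", prior_good_orders=5))
    print(patched_rule("alex512@example.com", prior_good_orders=0))
```

Every exception like this fixes one group of false positives and leaves another behind, which is how the patchwork grows.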

Keeping up with the fraudsters

As fraudsters evolve their tactics and continue to increase the scale, speed and sophistication at which they operate, those systems are yesterday’s news. You need to have systems that can evolve with the fraudsters. Software that rapidly applies statistical models to a growing body of data minimizes human biases and overcomes the limitations of human ability.

Email spam filters are a good example of that evolution. Twenty years ago, spam filters were very rules driven, and a lot of legitimate mail got sent to the spam folder. Modern spam filters are driven by machine learning systems that automatically separate legitimate email from spam and move each message into one bucket or the other.

You have to have machine learning, because there are too many interconnected variables for a rules-based system. For example, you can analyze so many different things about an email address. What’s the ratio of consonants to vowels? A lot of fraudsters don’t use real names; they just type ‘qwerty’ or a bunch of random gibberish. Does the person’s name match the email address? If my name were John Doe, and my email address said Jason Tan, that could be suspicious.

What is the format of the billing address? Is “Street” spelled out, or written as “st?” We learned from our data that “st” tends to see more fraud than “Street,” probably because it’s faster to type that way. What is the email service provider? We know from our data that there’s actually less fraud on Gmail than Hotmail or other email domains.
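The questions above translate naturally into features a model can consume. The following is a minimal sketch of that extraction step; the function name, the feature names and the sample inputs are assumptions for illustration, and a production system would look at far more than this.

```python
VOWELS = set("aeiou")


def extract_features(email: str, customer_name: str, billing_address: str) -> dict:
    """Turn raw checkout fields into simple signals (all names hypothetical)."""
    local, _, domain = email.lower().partition("@")
    letters = [c for c in local if c.isalpha()]
    vowels = sum(c in VOWELS for c in letters)
    consonants = len(letters) - vowels
    name_parts = customer_name.lower().split()
    # Strip basic punctuation so "St." and "st" are treated the same.
    address_tokens = billing_address.lower().replace(".", " ").replace(",", " ").split()
    return {
        "digit_count": sum(c.isdigit() for c in local),
        # Keyboard mashing tends to have few vowels, so the ratio runs high.
        "consonant_vowel_ratio": consonants / max(vowels, 1),
        "name_in_email": any(part in local for part in name_parts),
        "street_abbreviated": "st" in address_tokens,
        "email_domain": domain,
    }


if __name__ == "__main__":
    print(extract_features("qwrtyx99@example.com", "John Doe", "12 Main Street"))
    print(extract_features("john.doe@example.com", "John Doe", "12 Main St."))
```

None of these features decides anything on its own. They are inputs, and the weighting is left to whatever model is trained on them.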

Why? We don’t know. But it doesn’t really matter why. Humans want to know why, but computers don’t care about that. All these small data points are signals, each one offering a single clue. The computer is like Sherlock Holmes at scale, compiling the statistics and determining the probability that any given transaction might be fraudulent. By pulling all the clues together and weighing each piece of evidence, it tells a bigger story.
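One simple way to picture "weighing each piece of evidence" is to add up log-odds contributions and convert the total into a probability. This is a sketch of that general idea only, not a description of any particular production system; the base rate, signal names and weights below are invented, and a real model would learn its weights from labeled data.

```python
import math

# Assumed base rate: 1% of transactions are fraudulent (hypothetical).
BASE_LOG_ODDS = math.log(0.01 / 0.99)

# Hypothetical weights: how much each clue shifts the log-odds of fraud.
WEIGHTS = {
    "street_abbreviated": 0.4,
    "many_digits": 0.6,
    "name_mismatch": 1.2,
    "gibberish_email": 1.5,
}


def fraud_probability(signals: dict) -> float:
    """Combine the signals that are present into a single fraud probability."""
    log_odds = BASE_LOG_ODDS + sum(
        WEIGHTS.get(name, 0.0) for name, present in signals.items() if present
    )
    return 1.0 / (1.0 + math.exp(-log_odds))  # logistic function


if __name__ == "__main__":
    print(fraud_probability({}))
    print(fraud_probability({"street_abbreviated": True}))
    print(fraud_probability({name: True for name in WEIGHTS}))
```

A single weak clue barely moves the estimate, while several clues together move it a lot. That is the difference between a probabilistic score and a black-and-white rule.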

Humans only have so much capacity to do this type of analysis. Computers today can crunch billions and billions of numbers in seconds or less. Sometimes the answers are wrong (just look at Google Maps), but the more data points you bring in, the greater the likelihood of getting to the right answers. You just keep building and refining your models, based on how the world is operating.

In data we trust

Trust is one of the most important competitive advantages in any business, if not the most important. As Stephen Covey puts it, “business moves at the speed of trust.” If you can use trust to serve your customers faster, that’s going to give you an edge, as Amazon’s one-click payments proved. But underlying that delightful user experience is an incredible amount of data and technology being used to make a split-second risk assessment and decision about who to trust.

Not every company can build this itself, nor does every company want to. But as Amazon’s patent expires and one-click-style shopping experiences spread across the internet, those companies will still need a way to provide a similar experience in order to compete.

By embracing data and turning this part of the business over to software and open networks, companies can draw on this collective intelligence, the so-called wisdom of the crowd, to make high-quality decisions about who they can open their doors to. No one should have a monopoly on trust.