PayPal has struggled to contain fraud since the day the company was founded. Operating an online payment processing system made it a natural target for hackers and con artists.
PayPal fought back by investing heavily in big data technology for fraud protection. The company began with logistic regression to detect fraud, then advanced to neural networks and gradient boosted trees (GBT), which improved detection accuracy. Today, PayPal is banking on its accumulation of user data to employ deep learning, active learning, and transfer learning to improve its detection capabilities even further.
Alex Woodie tells us more about PayPal’s battle against fraud in this report published in Datanami:
PayPal has one big advantage over the fraudsters: a huge repository of data. This data repository was important for training classical machine learning models, but it’s absolutely critical for training a deep learning model.
“Over the years we accumulated very, very rich data,” Wang says. “These are really what set us apart in terms of our overall fraud [detection]….We’ve been doing this for a long time so really we have accumulated a lot of data and a lot of intelligence along the way.”
In the second phase of PayPal’s AI journey, Wang and her colleagues might feed somewhere between one million and 10 million data points into the neural network. “I see a clear delta in performance,” Wang says, “but after 20 million [data points], we start to see the performance start to level off…It cannot continue to scale with the amount of data.”
The new deep learning models are voracious consumers of data, which is great, because PayPal has reams of data to feed them. “With deep learning, we noticed that it scales a lot better,” Wang says. “The more data we throw in we see the performance continue to increase, to improve. At PayPal, we never have a lack of data! We have a lot of data. So we are now able to throw hundreds of millions of data [points] into our training properties and we are seeing a clear delta.”
Deep learning is a voracious consumer of something else: processor capacity. Instead of deploying its new deep learning system on traditional CPUs, the company opted to use Nvidia T4 GPUs. Using GPUs allowed PayPal to lower its server capacity by a factor of eight compared to deploying on traditional CPUs, PayPal’s CTO Sri Shivananda said in a March blog post.
The performance of the deep learning models tracks closely with the amount of horsepower the company puts behind them, Wang says. “The performance improvement is highly correlated with the computing power,” she says. “We see business performance scale with the amount of computing power.”