In 2021, the average breach cost retailers, including those that conduct business online, $3.27 million, an increase of almost 63 percent over the $2.01 million average in 2020. With the rising cost of data breaches, more e-commerce businesses are looking for ways to protect their online stores. IT Business Edge spoke with Benjamin Fabre, co-founder of DataDome, to gain insights into how these retailers can safeguard their sites.
Established in 2015, DataDome is a bot protection vendor: it helps clients block bad bots that scrape data or steal personally identifiable information (PII). The platform uses artificial intelligence (AI) and machine learning (ML) to block harmful bots in under two milliseconds.
Jenn Fulmer: The quick shift to e-commerce last year left many retailers vulnerable. What should these organizations look out for on their e-commerce sites?
Benjamin Fabre: We’ve seen a huge increase in traffic on e-commerce websites since the beginning of the pandemic, and that generates amazing opportunities for hackers, because most of the quick moves e-commerce websites made were not very secure; they had to go fast.
What we recommend to our customers is to be protected against automated threats, especially credential stuffing and account takeover. With the massive password leaks we’ve seen recently from Yahoo!, Facebook, and LinkedIn, hackers are using bots to automate millions of login and password attempts on every e-commerce website on the planet. Even a small e-commerce website might be threatened by bad bot traffic. A few years ago, bot protection was a “nice to have,” and now it’s a “must-have” for e-commerce websites.
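One basic layer of defense against the credential stuffing described above is to throttle repeated failed logins from the same source. The sketch below is purely illustrative, not DataDome's method; the window and threshold values are arbitrary assumptions, and real bots that rotate IPs require far richer signals:

```python
import time
from collections import defaultdict, deque

# Illustrative thresholds -- not values from DataDome or any vendor.
WINDOW = 60.0       # sliding window, in seconds
MAX_FAILURES = 5    # failed logins tolerated per window per IP

_failures = defaultdict(deque)  # ip -> timestamps of recent failed logins

def record_failure(ip, now=None):
    """Note a failed login attempt from this IP."""
    now = time.monotonic() if now is None else now
    q = _failures[ip]
    q.append(now)
    # Drop timestamps that have aged out of the sliding window.
    while q and now - q[0] > WINDOW:
        q.popleft()

def is_suspicious(ip, now=None):
    """True if this IP has hit the failure threshold within the window."""
    now = time.monotonic() if now is None else now
    q = _failures[ip]
    while q and now - q[0] > WINDOW:
        q.popleft()
    return len(q) >= MAX_FAILURES
```

A login handler would call `record_failure` on each bad password and challenge or block the client once `is_suspicious` returns true. Per-IP limits alone are easily evaded by botnets rotating addresses, which is exactly why vendors combine them with the behavioral signals discussed later in the interview.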
Bad actors use bad bots to initiate DDoS attacks on retail websites. How can retailers differentiate between good and bad traffic?
To detect bot traffic, you have to collect as much information as you can. At DataDome, we collect a huge volume of signals across more than one thousand events per day, including how the mouse is moved, how the keyboard is used, and the different touch events a user performs on a mobile application. That huge volume of signals has to be collected to determine whether an interaction is done by a human or by a bot.
The second part is to be able to use this information in real time with a low-latency machine learning algorithm. Today, we are able to make a decision in under two milliseconds from this huge volume of information, so we can allow or deny every single request for access to the website or the mobile application. Then, we have to separate the good bots from the bad bots. Of course, there are good bots on the internet. For instance, Google has crawlers, and Facebook and Twitter fetch information from websites to generate nice snippets when you share a link. So there are many legitimate use cases for running bots.
We have a strong authentication mechanism in place to make sure that a request is really coming from Google. That’s necessary because today we are seeing that 30 percent of the requests arriving with “Google Bot” in the name are not coming from the real Google; they are bad bots trying to trade on Google’s reputation. We have developed a strong authentication mechanism to make sure that it’s the real Google, the real Facebook, or the real Twitter. Anyone pretending to be Google that is not coming from a Google IP address or Google reverse DNS, for instance, we will block, because we know they are trying to impersonate it.
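The reverse-DNS check Fabre mentions can be sketched with Python's standard `socket` module. Google publicly documents this two-step verification for Googlebot: the IP's PTR record must end in a Google domain, and that hostname must resolve forward to the same IP. This sketch is an assumption about the general technique, not DataDome's implementation:

```python
import socket

# Hostname suffixes Google documents for verifying Googlebot.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")

def hostname_is_google(hostname):
    """Pure string check: does a reverse-DNS hostname belong to Google?"""
    return hostname.rstrip(".").endswith(GOOGLE_SUFFIXES)

def verify_googlebot(ip):
    """Reverse DNS, suffix check, then forward-confirm the hostname.

    Returns True only if the IP's PTR record is a Google hostname AND
    that hostname resolves back to the same IP. The forward lookup is
    what defeats attackers who control the PTR record for their own IPs.
    """
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)          # reverse lookup
        if not hostname_is_google(hostname):
            return False
        forward_ips = socket.gethostbyname_ex(hostname)[2]  # forward lookup
        return ip in forward_ips
    except OSError:  # covers socket.herror and socket.gaierror
        return False
```

Checking only the `User-Agent` header is worthless, since any client can claim to be "Googlebot"; the DNS round trip is what makes the claim verifiable.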
When we detect that a request is coming from a bad bot, we are trying to understand what the bot is trying to do. Is it just a scraping attempt from a competitor that is trying to fetch the pricing in real time of the e-commerce website to adjust their pricing strategy? Is it a bot that is coming to try a credential stuffing attack or an account takeover to generate some data leakage? Or is it a DDoS attack that is trying to generate some downtime on the website by generating a massive amount of requests?
I’d heard that with those reCAPTCHA boxes, the reason you sometimes get let through immediately and sometimes have to do the verification is that you moved your mouse in too straight a line. Is that true?
Yeah, exactly. Behavioral detection will try to adjust the complexity of the reCAPTCHA depending on the trust it has in you. If you have some legitimate mouse movement on the page, the page using reCAPTCHA will, most of the time, make it easier; but as soon as you do something a bit weird, it might start to be quite painful. We try to avoid using reCAPTCHA because it can kill the experience for end users.
Why are bots such a big threat for online retailers?
There are dozens of different threats related to bots. OWASP has been working to classify these threats, and some of them are related to security itself. Some bots do vulnerability scanning and can try to breach your database and get access to sensitive information. There are many different situations where bots are involved in data leakage that might hurt your business in terms of reputation.
Then, on the business side of the website, when your competitor is able to get all of your pricing in real time, you are losing a competitive advantage. Finally, with a DDoS attack, when a website is down for a few minutes, every second is a direct loss of revenue that you sometimes won’t get back at all.
How does DataDome block these bad bots from flooding a website while allowing good bots, like the ones Google uses, to index a site?
On every single page request or mobile application request, DataDome will collect many signals, and we will let the human go through, we will authenticate the good bots, and we will be able to detect bad bots and prevent them from gaining access to the website. We have deployed our technology in 25 points of presence to be able to be super fast in terms of detection, and we are able to make a decision in under two milliseconds.
Every time someone is trying to reach, for instance, The New York Times’ login section or an e-commerce website like Foot Locker, DataDome’s AI will be involved to determine whether the request should be allowed or denied before it reaches our customer’s application and before it can generate serious damage in terms of data.
Aside from partnering with DataDome, what would be your biggest recommendation for online retailers looking to protect their site?
If I had a top three, I would say number one is to use multi-factor authentication when it’s possible on the most sensitive sections, like the login or payment pages.
The second recommendation would be to use a bug bounty program, where white hat hackers will try to find vulnerabilities in your website and share them with you, because it’s always better to be found by the good guys than the bad guys. You have to be humble, because every website has vulnerabilities, and the best way to find them is to be tested on a daily basis by as much brain power as possible.
The last piece is to have a security expert embedded in every developer team. Some companies have a team of standard engineers and then a separate security team. What we have done at DataDome is to embed one person in charge of security in every single team, to avoid having two separate mindsets: one in charge of security and one trying to go fast and avoid any security constraints.