The Basics of Potential Bot Attacks on Your Website

 

What is Web Scraping?

Web scraping is a method used to extract data from websites. Sometimes called screen scraping, it is performed by software that accesses the web either through a web browser or directly over the Hypertext Transfer Protocol (HTTP). Although scraping can be carried out manually, the term commonly refers to automated processes that use a web crawler or bot. Simply put, it is a form of copying in which specific data is gathered and copied from the web.

Did you know that:

  • Content scraping is the leading use for web scraping?
  • Services that offer web scraping run as low as £1.50 per hour?

 

What is Content Scraping?

Content scraping is the process of copying unique or original content from other websites and publishing it elsewhere. It is typically carried out without the consent of the original author or source, and so may infringe copyright. Content scrapers usually copy the content and pass it off as their own.

Content scraping has an adverse effect on the site that has invested time, money, and resources to produce the original content, because its web authority and search rankings can suffer when duplicate copy appears elsewhere.

 

What is Price Scraping?

Price scraping is the process in which bots target the pricing section of a website in order to scrape the pricing data. Typically, price scraping is undertaken by online competitors looking to use your pricing against you to gain a competitive advantage. This is particularly damaging as it can trigger a price war.
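To show how little effort this kind of extraction takes, here is a minimal sketch of what a price-scraping bot does once it has downloaded a page. It uses only Python's standard-library HTML parser. The sample page, the `price` class name and the `PriceExtractor` and `extract_prices` names are all assumptions made up for illustration; real sites and real scrapers will differ.

```python
from html.parser import HTMLParser

# Hypothetical product listing; the markup and class names are assumptions.
SAMPLE_PAGE = """
<ul>
  <li><span class="name">Widget</span> <span class="price">£9.99</span></li>
  <li><span class="name">Gadget</span> <span class="price">£24.50</span></li>
</ul>
"""


class PriceExtractor(HTMLParser):
    """Collects the text of every element whose class list contains 'price'."""

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        classes = (dict(attrs).get("class") or "").split()
        if "price" in classes:
            self._in_price = True

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current price element.
        self._in_price = False

    def handle_data(self, data):
        text = data.strip()
        if self._in_price and text:
            self.prices.append(text)


def extract_prices(html):
    """Return the price strings found in the given HTML, in page order."""
    parser = PriceExtractor()
    parser.feed(html)
    parser.close()
    return parser.prices
```

Calling `extract_prices(SAMPLE_PAGE)` returns the two price strings from the sample listing. A bot that repeats this across every product page, on a schedule, ends up with a full copy of a retailer's price list.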

Which Industries see bot attacks?

In short, all industries are at risk, but here are some examples.

Airline/Travel

Travel websites carry everything from airline ticket prices and hotel room rates to user-generated reviews and unique editorial content, and any of that unique content can be stolen by bots. If a site is not explicitly protected against web scraping, anybody can duplicate that content at very little cost, with no investment or research required. Such content can then be sold to a competitor, or even used against you as a means to steal your organic search traffic. Some price scraping is performed by market intelligence companies that provide the data to competitors.

Ecommerce

Online retail has become extremely competitive and is under constant assault from malicious online actors, including big industry competitors. These groups use bad bots in numerous forms, all of which harm online retailers. Bad bots scrape prices and product data, carry out click fraud, and endanger the overall security of e-commerce websites, as well as brand reputation and customer loyalty. Of all bad bot threats, price scraping and product data scraping are the most costly and widespread for online retailers.

Online Gaming

The online gaming and gambling industries are becoming more crowded and, as such, more competitive. With prices and odds a key component in attracting customers, bad bots and price scraping are used to gain a competitive advantage.

 

Key terminology: 

  • Price Scraping – Bots target the pricing section of a website in order to scrape the pricing data to share amongst online competitors. Amazon retail has ‘all sorts of “scraping” software’ in order to find the prices of brands online. They also have a whole team ‘dedicated to scraping’.
  • Product Matching – Bots collect a huge number of data points from a site in order to make exact matches against a retailer’s wide variety of products.
  • Product Variation Tracking – Bots are used to scrape product information to a level that accounts for multiple variants.
  • Product Availability Testing – Bots scrape availability data in order to enable competitive positioning against an online retailer’s products relative to availability and inventory level.
  • Continuous Data Refresh – Bots are deployed on the same online retail site on a regular basis so that buyers of the scraped data are able to react to modifications made by the targeted site.

 

How to prevent Site Scraping:

There are numerous measures that can be taken to manage web scrapers, some more effective than others:

  1. Robot exclusion standard
  2. Manual blocking
  3. Web application firewalls (WAF)
  4. Login enforcement
  5. “Are you a human?” challenges (such as CAPTCHA)
  6. Geo-fencing
  7. Flow enforcement
  8. Direct bot detection and mitigation
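
As a rough illustration of two ends of this list, the sketch below shows the robot exclusion standard (item 1) and a very naive form of direct bot detection (item 8). The robots.txt rules, the `/prices/` path, the `ExampleBot` name, the `RequestRateMonitor` class and the thresholds are all assumptions chosen for the example, not recommendations. Bear in mind that robots.txt is advisory: well-behaved crawlers honour it, but a malicious scraper can simply ignore it, which is why it sits among the less effective measures.

```python
from collections import defaultdict, deque
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking all crawlers to stay out of /prices/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /prices/
"""


def build_robots_parser(robots_txt):
    """Parse robots.txt text the way a compliant crawler would."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


class RequestRateMonitor:
    """Flags a client that sends more than max_requests within a sliding
    window of window_seconds. A deliberately simple heuristic."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client id -> recent timestamps

    def record(self, client_id, now):
        """Record one request at time `now` (seconds); return True if the
        client has exceeded the allowed rate."""
        hits = self._hits[client_id]
        hits.append(now)
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        return len(hits) > self.max_requests


robots = build_robots_parser(ROBOTS_TXT)
monitor = RequestRateMonitor(max_requests=3, window_seconds=10.0)
```

Here `robots.can_fetch("ExampleBot", "https://example.com/prices/widget")` tells a compliant crawler to stay away, while `monitor.record(client_id, now)` lets the site itself notice a client that keeps hammering its pages regardless. Dedicated bot mitigation goes well beyond request counting, but the principle of judging clients by their behaviour is the same.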

 

Get in touch with us today to find out how we can help you with bot mitigation and web scraping.