When Search Engine Optimization (SEO) has become critical for the success of any organization in Dubai, it is important to prevent SEO competitors from stealing your hard earned backlinks. When competitors use backlink checker tools over your website, they are not only stealing your valuable  backlinks but are also slowing down your server using the bots and wasting your bandwidth. Such backlink checker tools cannot be blocked through robots.txt directives as they do not obey the rules. For example we were annoyed with the Ahrefs bot frequently crawling our website and slowing it down. We added the below 2 lines of code in our website’s robots.txt to prevent it from crawling our site.

User-agent: AhrefsBot
Disallow: /

But still it wouldn’t stop and we had to raise a request to Ahrefs support team. They ensured that this will stop as they had updated it but still the issue wasn’t resolved. So we thought “Why shouldn’t we block the bot completely from even visiting our website?”. We checked our access logs and did some research to build up a list of bad bots so we can block them completely.

block SEO competitors from stealing backlinks

Below is a useful piece of code which you can enter in your website’s root directory’s .htaccess file to stop your SEO competitors from moving further by using your backlinks. This code will check for user agent strings matching with bad bots and will forward them to Google.com

Code :

RewriteEngine On
RewriteCond %{REQUEST_URI} !/robots.txt$
RewriteCond %{HTTP_USER_AGENT} ^.*BLEXBot.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*BlackWidow.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Nutch.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Jetbot.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*WebVac.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Stanford.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*scooter.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*naver.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*dumbot.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Hatena\ Antenna.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*grub.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*looksmart.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*WebZip.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*larbin.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*b2w/0.1.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Copernic.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*psbot.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Python-urllib.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*NetMechanic.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*URL_Spider_Pro.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*CherryPicker.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*EmailCollector.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*EmailSiphon.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*WebBandit.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*EmailWolf.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Email.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*ExtractorPro.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*CopyRightCheck.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Crescent.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*SiteSnagger.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*ProWebWalker.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*CheeseBot.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*LNSpiderguy.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*ia_archiver.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Alexibot.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Teleport.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*MIIxpc.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Telesoft.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Website\ Quester.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*moget.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*WebStripper.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*WebSauger.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*WebCopier.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*NetAnts.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Mister\ PiX.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*WebAuto.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*TheNomad.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*WWW-Collector-E.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*RMA.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*libWeb/clsHTTP.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*asterias.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*httplib.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*turingos.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*spanner.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Harvest.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*InfoNaviRobot.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Bullseye.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*WebBandit.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*NICErsPRO.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Microsoft\ URL\ Control.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*DittoSpyder.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Foobot.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*WebmasterWorldForumBot.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*SpankBot.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*BotALot.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*lwp-trivial.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*WebmasterWorld.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*BunnySlippers.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*URLy\ Warning.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Wget.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*LinkWalker.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*cosmos.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*hloader.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*humanlinks.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*LinkextractorPro.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Offline\ Explorer.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Mata\ Hari.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*LexiBot.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Web\ Image\ Collector.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*The\ Intraformant.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*True_Robot.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*BlowFish.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*SearchEngineWorld.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*JennyBot.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*MIIxpc.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*BuiltBotTough.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*ProPowerBot.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*BackDoorBot.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*toCrawl/UrlDispatcher.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*WebEnhancer.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*suzuran.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*WebViewer.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*VCI.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Szukacz.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*QueryN.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Openfind.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Openbot.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Webster.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*EroCrawler.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*LinkScan.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Keyword.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Kenjin.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Iron33.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Bookmark\ search\ tool.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*GetRight.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*FairAd\ Client.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Gaisbot.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Aqua_Products.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Radiation\ Retriever\ 1.1.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Flaming\ AttackBot.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Oracle\ Ultra\ Search.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*MSIECrawler.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*PerMan.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*searchpreview.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*sootle.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Enterprise_Search.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Bot\ mailto:craftbot@yahoo.com.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*ChinaClaw.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Custo.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*DISCo.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Download\ Demon.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*eCatch.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*EirGrabber.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*EmailSiphon.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*EmailWolf.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Express\ WebPictures.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*ExtractorPro.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*EyeNetIE.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*FlashGet.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*GetRight.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*GetWeb!.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Go!Zilla.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Go-Ahead-Got-It.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*GrabNet.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Grafula.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*HMView.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*HTTrack.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Image\ Stripper.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Image\ Sucker.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Indy\ Library.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*InterGET.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Internet\ Ninja.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*JetCar.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*JOC\ Web\ Spider.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*larbin.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*LeechFTP.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Mass\ Downloader.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*MIDown\ tool.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Mister\ PiX.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Navroad.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*NearSite.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*NetAnts.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*NetSpider.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Net\ Vampire.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*NetZIP.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Octopus.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Offline\ Explorer.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Offline\ Navigator.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*PageGrabber.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Papa\ Foto.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*pavuk.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*pcBrowser.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*RealDownload.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*ReGet.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*SiteSnagger.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*SmartDownload.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*SuperBot.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*SuperHTTP.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Surfbot.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*tAkeOut.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Teleport\ Pro.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*VoidEYE.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Web\ Image\ Collector.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Web\ Sucker.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*WebAuto.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*WebCopier.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*WebFetch.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*WebGo\ IS.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*WebLeacher.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*WebReaper.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*WebSauger.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Website\ eXtractor.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Website\ Quester.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*WebStripper.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*WebWhacker.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*WebZIP.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Wget.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Widow.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*WWWOFFLE.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Xaldon\ WebSpider.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Zeus.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Semrush.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*BecomeBot.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*AhrefsBot.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*MJ12bot.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*rogerbot.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*exabot.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*Xenu.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*dotbot.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*gigabot.*$ [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^.*BlekkoBot.*$ [NC]
RewriteRule ^.*.* http://www.google.com [L]

 

And here is big list of destructive bots which you may block without even thinking twice:

AhrefsBot
Alexibot
Aqua_Products
asterias
b2w/0.1
BackDoorBot/1.0
BecomeBot
BlekkoBot
BlowFish/1.0
Bookmark search tool
BotALot
BuiltBotTough
Bullseye/1.0
BunnySlippers
CheeseBot
CherryPicker
Copernic
CopyRightCheck
cosmos
Crescent
Crescent Internet ToolPak HTTP OLE Control v.1.0
DittoSpyder
dotbot
dumbot
EmailCollector
EmailSiphon
EmailWolf
Enterprise_Search
EroCrawler
es
exabot
ExtractorPro
FairAd Client
Flaming AttackBot
Foobot
Gaisbot
GetRight/4.2
gigabot
grub
grub-client
Harvest/1.5
Hatena Antenna
hloader
Pubcon | Pubcon Search, Social Media, Affiliate Marketing Conferences bot
WebmasterWorld News and Discussion for the Web Professional bot
httplib
humanlinks
ia_archiver
InfoNaviRobot
Iron33/1.0.2
JennyBot
Jetbot
Kenjin Spider
Keyword Density/0.9
larbin
LexiBot
libWeb/clsHTTP
LinkextractorPro
LinkScan/8.1a Unix
LinkWalker
LNSpiderguy
looksmart
lwp-trivial
lwp-trivial/1.34
Mata Hari
Microsoft URL Control
MIIxpc
Mister PiX
MJ12bot
moget
MSIECrawler
naver
NetAnts
NetMechanic
NICErsPRO
Nutch
Offline Explorer
Openbot
Openfind
Openfind data gathere
Oracle Ultra Search
PerMan
ProPowerBot/2.14
ProWebWalker
psbot
Python-urllib
QueryN Metasearch
Radiation Retriever 1.1
RepoMonkey
RepoMonkey Bait & Tackle/v1.01
RMA
rogerbot
scooter
searchpreview
Sitebot
SiteSnagger
sootle
SpankBot
spanner
Stanford
Stanford Comp Sci
Stanford CompClub
Stanford CompSciClub
Stanford Spiderboys
suzuran
Szukacz/1.4
Teleport
TeleportPro
Telesoft
Teoma
The Intraformant
TheNomad
toCrawl/UrlDispatcher
True_Robot
turingos
URL Control
URL_Spider_Pro
URLy Warning
VCI
VCI WebViewer VCI WebViewer Win32
Web Image Collector
WebAuto
WebBandit
WebCopier
WebEnhancer
WebmasterWorld Extractor
WebmasterWorldForumBot
WebSauger
Website Quester
Webster Pro
WebStripper
WebVac
WebZip
Wget
WWW-Collector-E
Xenu
Xenu’s
Zeus

Make it a habit to check your access logs regularly to filter out annoying bots and append your .htaccess file with code like RewriteCond %{HTTP_USER_AGENT} ^.*SOME-BAD-BOT.*$ [NC,OR] to block that bot. This will help you sleep without worrying about people who steal backlinks!