As you can see, when I do a reverse IP lookup for 50.116.69.213 the returned hostname ends with .bluehost.com. 50.116.69.213 is clearly not a real Bingbot.
Typically I would block anyone falsely claiming to be a Bingbot. But given that it only polls the FreeFixer XML PAD file, it’s possible that 50.116.69.213 is running a software listing site. Setting the user agent to Bingbot could just be a sloppy programming error. I will not block 50.116.69.213.
Found another hacking attempt this morning when examining the access.log. I’ve pasted the requests from 124.156.120.3 below. They appear to be attempts to inject some PHP and SQL code. In addition, 124.156.120.3 also identifies itself as Bingbot, which obviously is not true.
124.156.120.3 seems to be assigned to Singapore Tencent Cloud Computing (beijing) Co. Ltd. It’s likely that one of their customers has been hacked. Here’s the location on a Google map:
Recently I’ve been keeping an eye on the traffic at FreeFixer.com and trying to block fake Bing and Google bots and other types of bad behaviour. This morning I found a bunch of hacking attempts from 106.52.197.96, which by the way appears to be located in the Tencent cloud computing (Beijing) Co., Ltd network range. I’m guessing one of Tencent’s cloud clients got hacked.
I’ve posted the requests below. 106.52.197.96 attempts to inject some PHP and SQL code. For obvious reasons, this IP will be blocked in .htaccess.
Unfortunately, MegaIndex.ru’s crawler page does not clearly explain why it is a good idea to let the bot crawl my site. Nor does it clearly explain which User-agent name to use in order to explicitly delay or disallow the bot. In addition to this, it sounds as if the maximum value for Crawl-delay is 5 seconds.
I hope that the 5 seconds max value is just a typo. I’m going to try to slow down or block the bot with the following entries in robots.txt:
User-agent: MegaIndex.ru
Crawl-delay: 3600
or
User-agent: MegaIndex.ru
Disallow: /
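Before deploying either variant, it can help to check how a standards-following parser reads it. Here's a minimal sketch using Python's built-in urllib.robotparser. The user agent token MegaIndex.ru is the same guess as in the entries above, and parse_robots is just a helper name for this example. Whether the real bot honours the rules is a separate question that this cannot answer.

```python
from urllib.robotparser import RobotFileParser


def parse_robots(text: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


# Variant 1: ask the bot to wait an hour between requests.
delay_rules = parse_robots("User-agent: MegaIndex.ru\nCrawl-delay: 3600\n")

# Variant 2: ask the bot to stay away entirely.
block_rules = parse_robots("User-agent: MegaIndex.ru\nDisallow: /\n")

# What Python's parser makes of the two variants.
delay_seconds = delay_rules.crawl_delay("MegaIndex.ru")
bot_allowed = block_rules.can_fetch("MegaIndex.ru", "http://www.freefixer.com/")
others_allowed = block_rules.can_fetch("bingbot", "http://www.freefixer.com/")
```

With these inputs delay_seconds is 3600, bot_allowed is False and others_allowed is True, so the entries at least say what I intend them to say.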
I did a reverse IP lookup on 144.76.27.118 and the bot is running at clients.your-server.de.
I tried a few geolocation services and all report that the 144.76.27.118 server is located in Germany.
Update 24 hours later: When checking the logs again I noticed that MegaCrawler had made more than 6000 requests. That is unacceptable. I’m blocking 144.76.27.118 in the .htaccess file.
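Counting requests per IP is easy to script. The sketch below assumes the usual Apache log layout where the client IP is the first field on each line. The function names and the 6000 threshold are only illustrative choices, not something Apache or Dreamhost provides.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def requests_per_ip(log_lines: Iterable[str]) -> Counter:
    """Count requests per client, assuming the IP is the first field of each line."""
    counts = Counter()
    for line in log_lines:
        fields = line.split(maxsplit=1)
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts


def heavy_hitters(log_lines: Iterable[str], threshold: int = 6000) -> List[Tuple[str, int]]:
    """Return (ip, count) pairs with more than `threshold` requests, busiest first."""
    return [(ip, n) for ip, n in requests_per_ip(log_lines).most_common() if n > threshold]
```

To use it, open the access.log file and pass the file object to heavy_hitters. If the log covers one day, the result is the list of IPs that made more than 6000 requests that day.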
You can add 66.249.79.159 to your whitelist right away. 66.249.79.159 belongs to Google and the bot operating from there is the real GoogleBot.
If you prefer to verify that 66.249.79.159 belongs to Google, you can launch a command shell and do a reverse IP lookup on 66.249.79.159 and then a forward DNS lookup on the host name returned from the reverse lookup:
As you can see, the reverse lookup returns a .googlebot.com address, and the forward DNS request brings us back to 66.249.79.159. We can now conclude that 66.249.79.159 is a real Googlebot.
For the last few days I’ve been going through all traffic on the Freefixer.com web site. My goal is to reduce the traffic to the web site by blocking a bunch of uninvited bots and crawlers. I’ll try to share some of the results here and I hope you’ll find them useful.
Found a log entry from 142.252.249.27 this morning:
142.252.249.27 - - [09/Sep/2019:07:59:18 -0700] "HEAD /backup.zip HTTP/1.1" 404 4128 "http://www.freefixer.com/backup.zip" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"
The bot running at 142.252.249.27 is scanning freefixer.com for backups, databases, data, code, bitcoin wallets, bitcoin cash wallets, litecoin wallets, dogecoin wallets, etc. It looks for various file formats such as .zip, .rar, .dat, .7z, .sql, .mdb, .mdf, .tgz and .tar.gz. Here’s the complete list of requests that 142.252.249.27 made:
HEAD /backup.zip
HEAD /backup.rar
HEAD /backup.dat
HEAD /backup.7z
HEAD /backup.sql
HEAD /backup.mdb
HEAD /backup.mdf
HEAD /backup.tgz
HEAD /backup.tar.gz
HEAD /db.zip
HEAD /db.rar
HEAD /db.dat
HEAD /db.7z
HEAD /db.sql
HEAD /db.mdb
HEAD /db.mdf
HEAD /db.tgz
HEAD /db.tar.gz
HEAD /web.zip
HEAD /web.rar
HEAD /web.dat
HEAD /web.7z
HEAD /web.sql
HEAD /web.mdb
HEAD /web.mdf
HEAD /web.tgz
HEAD /web.tar.gz
HEAD /database.zip
HEAD /database.rar
HEAD /database.dat
HEAD /database.7z
HEAD /database.sql
HEAD /database.mdb
HEAD /database.mdf
HEAD /database.tgz
HEAD /database.tar.gz
HEAD /data.zip
HEAD /data.rar
HEAD /data.dat
HEAD /data.7z
HEAD /data.sql
HEAD /data.mdb
HEAD /data.mdf
HEAD /data.tgz
HEAD /data.tar.gz
HEAD /web.zip
HEAD /web.rar
HEAD /web.dat
HEAD /web.7z
HEAD /web.sql
HEAD /web.mdb
HEAD /web.mdf
HEAD /web.tgz
HEAD /web.tar.gz
HEAD /wwwroot.zip
HEAD /wwwroot.rar
HEAD /wwwroot.dat
HEAD /wwwroot.7z
HEAD /wwwroot.sql
HEAD /wwwroot.mdb
HEAD /wwwroot.mdf
HEAD /wwwroot.tgz
HEAD /wwwroot.tar.gz
HEAD /www.zip
HEAD /www.rar
HEAD /www.dat
HEAD /www.7z
HEAD /www.sql
HEAD /www.mdb
HEAD /www.mdf
HEAD /www.tgz
HEAD /www.tar.gz
HEAD /code.zip
HEAD /code.rar
HEAD /code.dat
HEAD /code.7z
HEAD /code.sql
HEAD /code.mdb
HEAD /code.mdf
HEAD /code.tgz
HEAD /code.tar.gz
HEAD /test.zip
HEAD /test.rar
HEAD /test.dat
HEAD /test.7z
HEAD /test.sql
HEAD /test.mdb
HEAD /test.mdf
HEAD /test.tgz
HEAD /test.tar.gz
HEAD /admin.zip
HEAD /admin.rar
HEAD /admin.dat
HEAD /admin.7z
HEAD /admin.sql
HEAD /admin.mdb
HEAD /admin.mdf
HEAD /admin.tgz
HEAD /admin.tar.gz
HEAD /user.zip
HEAD /user.rar
HEAD /user.dat
HEAD /user.7z
HEAD /user.sql
HEAD /user.mdb
HEAD /user.mdf
HEAD /user.tgz
HEAD /user.tar.gz
HEAD /sql.zip
HEAD /sql.rar
HEAD /sql.dat
HEAD /sql.7z
HEAD /sql.sql
HEAD /sql.mdb
HEAD /sql.mdf
HEAD /sql.tgz
HEAD /sql.tar.gz
HEAD /wallet.zip
HEAD /wallet.rar
HEAD /wallet.dat
HEAD /wallet.7z
HEAD /wallet.sql
HEAD /wallet.mdb
HEAD /wallet.mdf
HEAD /wallet.tgz
HEAD /wallet.tar.gz
HEAD /wallet.backup.zip
HEAD /wallet.backup.rar
HEAD /wallet.backup.dat
HEAD /wallet.backup.7z
HEAD /wallet.backup.sql
HEAD /wallet.backup.mdb
HEAD /wallet.backup.mdf
HEAD /wallet.backup.tgz
HEAD /wallet.backup.tar.gz
HEAD /litecoin.zip
HEAD /litecoin.rar
HEAD /litecoin.dat
HEAD /litecoin.7z
HEAD /litecoin.sql
HEAD /litecoin.mdb
HEAD /litecoin.mdf
HEAD /litecoin.tgz
HEAD /litecoin.tar.gz
HEAD /Litecoin.zip
HEAD /Litecoin.rar
HEAD /Litecoin.dat
HEAD /Litecoin.7z
HEAD /Litecoin.sql
HEAD /Litecoin.mdb
HEAD /Litecoin.mdf
HEAD /Litecoin.tgz
HEAD /Litecoin.tar.gz
HEAD /Bitcoin.zip
HEAD /Bitcoin.rar
HEAD /Bitcoin.dat
HEAD /Bitcoin.7z
HEAD /Bitcoin.sql
HEAD /Bitcoin.mdb
HEAD /Bitcoin.mdf
HEAD /Bitcoin.tgz
HEAD /Bitcoin.tar.gz
HEAD /bitcoin.zip
HEAD /bitcoin.rar
HEAD /bitcoin.dat
HEAD /bitcoin.7z
HEAD /bitcoin.sql
HEAD /bitcoin.mdb
HEAD /bitcoin.mdf
HEAD /bitcoin.tgz
HEAD /bitcoin.tar.gz
HEAD /HShare.zip
HEAD /HShare.rar
HEAD /HShare.dat
HEAD /HShare.7z
HEAD /HShare.sql
HEAD /HShare.mdb
HEAD /HShare.mdf
HEAD /HShare.tgz
HEAD /HShare.tar.gz
HEAD /btc.zip
HEAD /btc.rar
HEAD /btc.dat
HEAD /btc.7z
HEAD /btc.sql
HEAD /btc.mdb
HEAD /btc.mdf
HEAD /btc.tgz
HEAD /btc.tar.gz
HEAD /bch.zip
HEAD /bch.rar
HEAD /bch.dat
HEAD /bch.7z
HEAD /bch.sql
HEAD /bch.mdb
HEAD /bch.mdf
HEAD /bch.tgz
HEAD /bch.tar.gz
HEAD /btm.zip
HEAD /btm.rar
HEAD /btm.dat
HEAD /btm.mdb
HEAD /btm.mdf
HEAD /btm.tgz
HEAD /btm.tar.gz
HEAD /bcd.zip
HEAD /bcd.rar
HEAD /bcd.dat
HEAD /bcd.7z
HEAD /bcd.sql
HEAD /bcd.mdb
HEAD /bcd.mdf
HEAD /bcd.tgz
HEAD /bcd.tar.gz
HEAD /bcx.zip
HEAD /bcx.rar
HEAD /bcx.dat
HEAD /bcx.7z
HEAD /bcx.sql
HEAD /bcx.mdb
HEAD /bcx.mdf
HEAD /bcx.tgz
HEAD /bcx.tar.gz
HEAD /qianbao.zip
HEAD /qianbao.rar
HEAD /qianbao.dat
HEAD /qianbao.7z
HEAD /qianbao.sql
HEAD /qianbao.mdb
HEAD /qianbao.mdf
HEAD /qianbao.tgz
HEAD /qianbao.tar.gz
HEAD /doge.zip
HEAD /doge.rar
HEAD /doge.dat
HEAD /doge.7z
HEAD /doge.sql
HEAD /doge.mdb
HEAD /doge.mdf
HEAD /doge.tgz
HEAD /doge.tar.gz
HEAD /dogecoin.zip
HEAD /dogecoin.rar
HEAD /dogecoin.dat
HEAD /dogecoin.7z
HEAD /dogecoin.sql
HEAD /dogecoin.mdb
HEAD /dogecoin.mdf
HEAD /dogecoin.tgz
HEAD /dogecoin.tar.gz
HEAD /backup.zip
HEAD /backup.rar
HEAD /backup.dat
HEAD /backup.7z
HEAD /backup.sql
HEAD /backup.mdb
HEAD /backup.mdf
HEAD /backup.tgz
HEAD /backup.tar.gz
HEAD /db.zip
HEAD /db.rar
HEAD /db.dat
HEAD /db.7z
HEAD /db.sql
HEAD /db.mdb
HEAD /db.mdf
HEAD /db.tgz
HEAD /db.tar.gz
HEAD /data.zip
HEAD /data.rar
HEAD /data.dat
HEAD /data.7z
HEAD /data.sql
HEAD /data.mdb
HEAD /data.mdf
HEAD /data.tgz
HEAD /data.tar.gz
HEAD /web.zip
HEAD /web.rar
HEAD /web.dat
HEAD /web.7z
HEAD /web.sql
HEAD /web.mdb
HEAD /web.mdf
HEAD /web.tgz
HEAD /web.tar.gz
HEAD /wwwroot.zip
HEAD /wwwroot.rar
HEAD /wwwroot.dat
HEAD /wwwroot.7z
HEAD /wwwroot.sql
HEAD /wwwroot.mdb
HEAD /wwwroot.mdf
HEAD /wwwroot.tgz
HEAD /wwwroot.tar.gz
HEAD /database.zip
HEAD /database.rar
HEAD /database.dat
HEAD /database.7z
HEAD /database.sql
HEAD /database.mdb
HEAD /database.mdf
HEAD /database.tgz
HEAD /database.tar.gz
HEAD /www.zip
HEAD /www.rar
HEAD /www.dat
HEAD /www.7z
HEAD /www.sql
HEAD /www.mdb
HEAD /www.mdf
HEAD /www.tgz
HEAD /www.tar.gz
HEAD /code.zip
HEAD /code.rar
HEAD /code.dat
HEAD /code.7z
HEAD /code.sql
HEAD /code.mdb
HEAD /code.mdf
HEAD /code.tgz
HEAD /code.tar.gz
HEAD /test.zip
HEAD /test.rar
HEAD /test.dat
HEAD /test.7z
HEAD /test.sql
HEAD /test.mdb
HEAD /test.mdf
HEAD /test.tgz
HEAD /test.tar.gz
HEAD /admin.zip
HEAD /admin.rar
HEAD /admin.dat
HEAD /admin.7z
HEAD /admin.sql
HEAD /admin.mdb
HEAD /admin.mdf
HEAD /admin.tgz
HEAD /admin.tar.gz
HEAD /user.zip
HEAD /user.rar
HEAD /user.dat
HEAD /user.7z
HEAD /user.sql
HEAD /user.mdb
HEAD /user.mdf
HEAD /user.tgz
HEAD /user.tar.gz
HEAD /sql.zip
HEAD /sql.rar
HEAD /sql.dat
HEAD /sql.7z
HEAD /sql.sql
HEAD /sql.mdb
HEAD /sql.mdf
HEAD /sql.tgz
HEAD /sql.tar.gz
HEAD /bf.zip
HEAD /bf.rar
HEAD /bf.dat
HEAD /bf.7z
HEAD /bf.sql
HEAD /bf.mdb
HEAD /bf.mdf
HEAD /bf.tgz
HEAD /bf.tar.gz
HEAD /beifen.zip
HEAD /beifen.rar
HEAD /beifen.dat
HEAD /beifen.7z
HEAD /beifen.sql
HEAD /beifen.mdb
HEAD /beifen.mdf
HEAD /beifen.tgz
HEAD /beifen.tar.gz
HEAD /shujuku.zip
HEAD /shujuku.rar
HEAD /shujuku.dat
HEAD /shujuku.7z
HEAD /shujuku.sql
HEAD /shujuku.mdb
HEAD /shujuku.mdf
HEAD /shujuku.tgz
HEAD /shujuku.tar.gz
HEAD /shuju.zip
HEAD /shuju.rar
HEAD /shuju.dat
HEAD /shuju.7z
HEAD /shuju.sql
HEAD /shuju.mdb
HEAD /shuju.mdf
HEAD /shuju.tgz
HEAD /shuju.tar.gz
HEAD /ziliao.zip
HEAD /ziliao.rar
HEAD /ziliao.dat
HEAD /ziliao.7z
HEAD /ziliao.sql
HEAD /ziliao.mdb
HEAD /ziliao.mdf
HEAD /ziliao.tgz
HEAD /ziliao.tar.gz
HEAD /freefixer.zip
HEAD /freefixer.com.zip
HEAD /www.freefixer.com.zip
HEAD /freefixer.rar
HEAD /freefixer.com.rar
HEAD /www.freefixer.com.rar
HEAD /freefixer.dat
HEAD /freefixer.com.dat
HEAD /www.freefixer.com.dat
HEAD /freefixer.7z
HEAD /freefixer.com.7z
HEAD /www.freefixer.com.7z
HEAD /freefixer.sql
HEAD /freefixer.com.sql
HEAD /www.freefixer.com.sql
HEAD /freefixer.mdb
HEAD /freefixer.com.mdb
HEAD /www.freefixer.com.mdb
HEAD /freefixer.mdf
HEAD /freefixer.com.mdf
HEAD /www.freefixer.com.mdf
HEAD /freefixer.tgz
HEAD /freefixer.com.tgz
HEAD /www.freefixer.com.tgz
HEAD /freefixer.tar.gz
HEAD /freefixer.com.tar.gz
HEAD /www.freefixer.com.tar.gz
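A scan like the one listed above is easy to spot automatically. Here's a rough sketch that counts, per IP, the requests for files with one of the probed extensions that ended in a 404. The regular expression assumes the log format of the entry from 142.252.249.27 shown earlier. The names PROBE_EXTENSIONS, LOG_LINE and probe_counts are made up for this example, and requiring a 404 is my own choice so that legitimate downloads are not counted.

```python
import re
from collections import Counter
from typing import Iterable

# Extensions taken from the requests listed above.
PROBE_EXTENSIONS = (".zip", ".rar", ".dat", ".7z", ".sql", ".mdb", ".mdf", ".tgz", ".tar.gz")

# Client IP, then the quoted request line, then the status code.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) .*?"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3}) '
)


def probe_counts(log_lines: Iterable[str]) -> Counter:
    """Count 404 responses per IP for paths ending in a backup-style extension."""
    counts = Counter()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # not in the expected format
        path = match.group("path").split("?", 1)[0].lower()
        if match.group("status") == "404" and path.endswith(PROBE_EXTENSIONS):
            counts[match.group("ip")] += 1
    return counts
```

An IP that collects hundreds of hits here within a few minutes is almost certainly a scanner and not a lost visitor.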
Vanta Telecommunications Limited and egihosting.com are the names that show up when I do a lookup in the ARIN registry, as shown in the screenshot below. I’m assuming one of their customers has been hacked.
If you’ve been following this blog for the last week you know that I’ve been trying to weed out fake Bingbots, Yandexbots and Googlebots and other types of bad behaviour. Since 142.252.249.27 is currently trying to gain access to non-public information I’m going to block it in Apache’s .htaccess file.
Unfortunately this bot does not give any details on why it is crawling Freefixer.com or who is operating it. All I know is that it downloads one of the .RSS feeds and then starts downloading all pages that the .RSS links to, with a few seconds’ delay between requests.
207.46.13.179 is requesting content from Freefixer.com approximately every 10 seconds. Quite often, but that’s OK. I would like Freefixer.com to appear at bing.com.
So, how do I know that this in fact is a real Bingbot, and not some unwanted program that scrapes my web site? I’ll use the same procedure as recommended over at Bing Webmaster Tools. That is, a reverse IP lookup on the IP address, and then a forward DNS lookup on the host name returned by the reverse lookup. If you end up with the same IP that you started with, and the reverse lookup reports a host name ending with search.msn.com, you can rest assured that you are dealing with a legitimate bingbot.
If you do an ARIN lookup on 207.46.13.179, you’ll see that Microsoft owns the range from 207.46.0.0 to 207.46.255.255. So I assume you can expect bingbots from any IP address in that range.
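If you want to test other addresses against that range, Python’s ipaddress module can do the arithmetic. This is just a sketch, and networks_for_range and in_range are helper names invented for the example. Keep in mind that being inside the range only shows that the address is registered to Microsoft. The reverse and forward lookups are still what tells you it is a bingbot.

```python
import ipaddress
from typing import List


def networks_for_range(first: str, last: str) -> List[ipaddress.IPv4Network]:
    """Express a first-to-last address range as CIDR networks."""
    return list(
        ipaddress.summarize_address_range(
            ipaddress.ip_address(first), ipaddress.ip_address(last)
        )
    )


def in_range(ip: str, first: str, last: str) -> bool:
    """True if ip lies between first and last, inclusive."""
    address = ipaddress.ip_address(ip)
    return ipaddress.ip_address(first) <= address <= ipaddress.ip_address(last)


# The range reported by ARIN collapses to a single /16 network.
microsoft_block = networks_for_range("207.46.0.0", "207.46.255.255")
```

Here microsoft_block contains the single network 207.46.0.0/16, and in_range("207.46.13.179", "207.46.0.0", "207.46.255.255") is True.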
I’m currently running FreeFixer.com on a shared Dreamhost server. Dreamhost has a monitoring service that keeps an eye on the total resource usage for each user account. If some user consumes too many resources on the server, the monitoring service starts killing off processes for that user and an email report is sent. This is great since it saves me from many of the performance problems caused by other users on the same server.
Some time ago, the resource usage for freefixer.com started hitting the limit but I didn’t notice any additional traffic when I examined the Google Analytics report. This led me to investigate Apache’s access.log file. Here are two example entries from the log:
The first entry (157.55.39.252) claims to be the bingbot and the second (163.172.64.171) is a crawler called Barkrowler (exensa.com).
When examining the access.log a bunch of questions are raised:
Let’s say the crawler claims to be BingBot or GoogleBot. Is it the real one coming from one of Google’s or Microsoft’s data centers, or is it a bot that falsely sets its user agent to GoogleBot or BingBot?
What about all the other bots out there? Their crawling uses quite a lot of resources, but do they bring any value or users to your web site?
What about all the other high usage IP numbers that claim to be ordinary users? Are their claims correct, or are they just bots in disguise?
I’ll simply post each IP number that I investigate below and you can check out the details by clicking on it. You can find the list down below.
How To Determine If a Bot is Fake
Let’s say you see an entry in the log coming from 157.55.39.252 and it claims to be bingbot. How can we determine that the traffic is from a real bingbot? We can do this using the following two steps:
1) First we do a reverse DNS lookup using the IP from the log.
$ host 157.55.39.252
252.39.55.157.in-addr.arpa domain name pointer msnbot-157-55-39-252.search.msn.com.
The DNS responds with [msnbot-157-55-39-252.search.msn.com].
2) Then we do a forward DNS lookup on the hostname we got from the reverse lookup.
So, to summarise: 157.55.39.252 points to [msnbot-157-55-39-252.search.msn.com] which is owned by Microsoft. And the [msnbot-157-55-39-252.search.msn.com] hostname resolves back to 157.55.39.252 which we started with. Excellent, we now know that we are dealing with a legitimate bingbot.
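The two steps can also be scripted. Below is a minimal sketch using Python’s socket module. The function names are my own, it only handles IPv4, and the resolver functions can be swapped out so the logic can be tried without touching the network.

```python
import socket
from typing import Callable, Iterable, List


def reverse_lookup(ip: str) -> str:
    """Reverse DNS: IP -> primary host name (raises OSError if there is none)."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(hostname: str) -> List[str]:
    """Forward DNS: host name -> list of IPv4 addresses."""
    return socket.gethostbyname_ex(hostname)[2]


def is_verified_bot(
    ip: str,
    allowed_suffixes: Iterable[str],
    reverse: Callable[[str], str] = reverse_lookup,
    forward: Callable[[str], List[str]] = forward_lookup,
) -> bool:
    """Step 1: reverse lookup must end in an allowed suffix.
    Step 2: forward lookup of that name must include the original IP."""
    try:
        hostname = reverse(ip).rstrip(".").lower()
        if not hostname.endswith(tuple(allowed_suffixes)):
            return False
        return ip in forward(hostname)
    except OSError:  # covers socket.herror and socket.gaierror
        return False
```

For the example above the call would be is_verified_bot("157.55.39.252", [".search.msn.com"]). A lookup failure is treated as "not verified", which seems like the safer default when deciding whom to whitelist.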
Another way to check if an IP belongs to bingbot, if you don’t have the host and dig command line tools available, is to use Bing’s Verify Bingbot Tool. You simply type in the IP address, in this case 157.55.39.252, and solve the captcha.
I’m not aware of web verification tools for the other search engines such as Google or Yandex. If you know about such a tool, please let me know.
So what about 87.250.224.119 that identifies as YandexBot/3.0? Is this the legitimate YandexBot or a fake bot? This IP is one of the most frequent visitors to Freefixer.com.
As verified in the above screenshot, 87.250.224.119 is indeed the real YandexBot. You can verify this yourself by first doing a reverse IP lookup, and then a forward IP lookup. The reverse IP lookup returns a name ending with .spider.yandex.com and that name resolves back to 87.250.224.119.
All Yandex crawlers have host names ending with yandex.ru, yandex.net or yandex.com. If the host name does not end with one of these, the robot does not belong to Yandex. In that case, someone is pretending to be a Yandexbot and that is certainly bad behaviour and I would block that IP immediately.
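The host name rule is simple to express in code. One detail worth getting right is to match on whole domain labels, so that a name like fakeyandex.com does not slip through. This is only a sketch of the check described above, and is_yandex_hostname is a name I made up for it. It says nothing about whether the name resolves back to the IP, so the forward lookup is still needed.

```python
YANDEX_DOMAINS = ("yandex.ru", "yandex.net", "yandex.com")


def is_yandex_hostname(hostname: str) -> bool:
    """True if hostname is one of the Yandex domains or a subdomain of one."""
    host = hostname.rstrip(".").lower()
    return any(host == domain or host.endswith("." + domain) for domain in YANDEX_DOMAINS)
```

Any name ending with .spider.yandex.com passes, while fakeyandex.com and yandex.com.example.org are rejected.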
I’ve used the host and dig tools, which should be available on most platforms with a Linux-type shell.
The traffic is coming from Moscow, Russia. I’ve added the 87.250.224.119 IP to my whitelist.