Tekforums.net - The improved home of Tekforums! :D

Invasion of the Baidu Spiders

Started by Smugs, December 21, 2011, 18:50:40 PM

Smugs

There are currently 12 Baidu crawlers infesting our beloved forums; in contrast, there are only 2 Google crawlers. It's an invasion, I tell thee.
TekForums member since 14th August 2002

Clock'd 0Ne

I don't really know what Baidu does, but I'm struggling to work out why they need 10 spiders all trawling the site. :worried:

Smugs


Only 10 now, the other two must have gone back for reinforcements.   :drama:

Serious

They run a web search site, like Google, amongst other things. Quite why they require so many spiders I don't know. Spying for the Chinese government?  ???

Oops, did I let that Chinese secret out the bag?  :muttley:

bear

Baidu has much to do; it is looking at threads from way back in time, like http://www.tekforums.net/sports-hobbies-cycling/pompey-lift-asia-cup/

Clock'd 0Ne

Baidu seems to like a lot of your threads at the moment, bear :)

bear

It has a lot of catching up to do, it seems.

Eagle

How does one ban it from crawling a site/forum?

Clock'd 0Ne

Whether one should ban it is more my query; I would think the more engines indexing the site, the better?

Quote: During Q4 of 2010, it is estimated that there were 4.02 billion search queries in China, of which Baidu had a market share of 56.6%. China's internet-search revenue share in second quarter 2011 by Baidu is 76%. In December 2007, Baidu became the first Chinese company to be included in the NASDAQ-100 index.

Seems it is China's answer to Google.


Oh and to answer your question directly, you can add exclusions to your robots.txt file for search engine spiders, or if your forum/CMS software supports it there might be a way of blocking them directly.
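
For anyone wanting to try the robots.txt route, here is a rough sketch of what an exclusion might look like, checked with Python's standard urllib.robotparser. It assumes Baidu's crawler identifies itself with the user-agent token Baiduspider; the rules and the is_allowed helper are only an illustration, not what this forum actually serves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out Baiduspider, allow everyone else.
ROBOTS_TXT = """\
User-agent: Baiduspider
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a crawler that obeys robots_txt may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Baiduspider", "Googlebot"):
        print(agent, is_allowed(agent, "http://www.tekforums.net/"))
```

Bear in mind this only describes what a well-behaved crawler would do with the file; nothing here enforces it.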

Mongoose

Robots.txt only works if the spider is 'polite', but there are supposedly ways of 'trapping' impolite spiders who ignore it.
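
One way such a trap might work, as a minimal sketch: list a path in robots.txt that no polite spider or human should ever visit, then block any IP that requests it anyway. The SpiderTrap class, the /trap/ path and the status codes are all made up for illustration; a real forum would wire this into its web server or forum software.

```python
class SpiderTrap:
    """Blocks clients that request a path robots.txt told them to avoid."""

    def __init__(self, trap_path: str = "/trap/"):
        # Hypothetical path, never linked visibly, so only a crawler
        # ignoring robots.txt should ever request it.
        self.trap_path = trap_path
        self.blocked_ips = set()

    def robots_txt(self) -> str:
        """robots.txt body asking all crawlers to stay out of the trap."""
        return f"User-agent: *\nDisallow: {self.trap_path}\n"

    def handle(self, ip: str, path: str) -> int:
        """Return the HTTP status code to send for this request."""
        if ip in self.blocked_ips:
            return 403  # already caught earlier
        if path.startswith(self.trap_path):
            self.blocked_ips.add(ip)  # ignored robots.txt: block from now on
            return 403
        return 200


if __name__ == "__main__":
    trap = SpiderTrap()
    print(trap.handle("203.0.113.5", "/index.php"))
    print(trap.handle("203.0.113.5", "/trap/bait"))
    print(trap.handle("203.0.113.5", "/index.php"))
```

Blocking by IP is crude (shared or rotating addresses would slip through or catch bystanders), so treat this as the idea rather than a finished defence.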

Clock'd 0Ne

I think we have Baidu to thank for the recent spam posts; it seems they all have Chinese IPs.

XEntity

14 of the tasty spam spiders at the moment