More and more websites require you to register or pay before you get full access. The odd thing is that Google has a complete index of those same websites. So what is going on here? Why must regular users pay or register when the Google search engine bot has full access? The reason is simple: every site wants the benefits of the wonderful world of Google, and for webmasters free advertising is always welcome. But there is a simple way to be the Google (search) bot. In this little article I will try to explain it.
User Agent
Almost every internet browser has the capability to adjust the user agent. A user agent is the client application used with a particular network protocol; the phrase is most commonly used in reference to those which access the World Wide Web. Web user agents range from web browsers to search engine crawlers ("spiders"), as well as screen readers and braille browsers used by people with disabilities.
When Internet users visit a web site, a text string is generally sent to identify the user agent to the server. This forms part of the HTTP request, prefixed with User-agent: or User-Agent:, and typically includes information such as the application name, version, host operating system, and language. Bots, such as web crawlers, often also include a URL and/or e-mail address so that the webmaster can contact the operator of the bot. The user-agent string is one of the criteria by which crawlers can be excluded from certain pages or parts of a website using the "Robots Exclusion Standard" (robots.txt). This allows webmasters who feel that certain parts of their website should not be included in the data gathered by a particular crawler, or that a particular crawler is using up too much bandwidth, to request that crawler not to visit those pages.
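As a rough illustration of the header described above, the following Python sketch pulls the user-agent string out of a raw HTTP request. The request text, the host www.example.com, and the helper name parse_request_headers are made up for this example; the user-agent value is the Firefox 1.0.4 string from the list at the end of this article. Header names are lowercased so that User-agent: and User-Agent: are treated the same.

```python
# Hypothetical raw request, as a server would receive it on the wire.
RAW_REQUEST = (
    "GET /index.html HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.8) "
    "Gecko/20050511 Firefox/1.0.4\r\n"
    "Accept-Language: en-US\r\n"
    "\r\n"
)


def parse_request_headers(raw):
    """Return the request headers as a dict keyed by lowercased name."""
    head, _, _body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    headers = {}
    for line in lines[1:]:  # lines[0] is the request line ("GET ... HTTP/1.1")
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


headers = parse_request_headers(RAW_REQUEST)
print(headers["user-agent"])
```

The server has nothing but this string to go on: whatever the client writes after User-Agent: is what the server sees.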
Adjusting the user agent
To change your user agent identity you can use a utility to help you. For Internet Explorer there is WinGuides Tweak Manager, and for Firefox/Mozilla there is the User Agent Switcher extension. Use the example user-agent strings section to set the right value. Once you have changed your user agent identity, you can enjoy the free access that many of these websites offer.
For example, Windows & .Net Magazine and Nature will give you free access, and there are many other sites with the same security flaw. Remember that not all sites are vulnerable to it, because it is very easy for a webmaster to keep Googlebot, or any other bot, out of the protected directories.
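The browser utilities mentioned above all do the same thing: they change the User-Agent header that goes out with each request. A minimal Python sketch of that idea follows. Everything in it is an assumption made for illustration: the names build_request, EchoUserAgentHandler and demo are hypothetical, the server is a throwaway test server on 127.0.0.1 that simply echoes back the header it received, and the user-agent value is the Opera 8.00 string from the example list in this article.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Taken from the example list in this article (Opera 8.00 on Windows XP).
USER_AGENT = "Opera/8.00 (Windows NT 5.1; U; en)"


def build_request(url, user_agent):
    """Create a GET request that announces the given user-agent string."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


class EchoUserAgentHandler(BaseHTTPRequestHandler):
    """Hypothetical test server: answers with the User-Agent it received."""

    def do_GET(self):
        body = self.headers.get("User-Agent", "").encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


def demo():
    """Start the echo server on a free local port and send one request."""
    server = HTTPServer(("127.0.0.1", 0), EchoUserAgentHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = "http://127.0.0.1:%d/" % server.server_address[1]
        # An empty ProxyHandler keeps the request off any configured proxy.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(build_request(url, USER_AGENT), timeout=5) as reply:
            return reply.read().decode("utf-8")
    finally:
        server.shutdown()
        server.server_close()
```

Calling demo() should return the same string that was passed in, which is all the server ever learns about the identity of the client.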
Webmasters: you can read more about the robots exclusion protocol here.
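For webmasters, here is a small sketch of how robots.txt rules are matched against a crawler's name, using the urllib.robotparser module from the Python standard library. The robots.txt content and the paths /members/ and /private/ are invented for this example and do not come from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep Googlebot out of /members/,
# and every other crawler out of /private/.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /members/

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Rules are selected by the crawler's product name, not the full string.
googlebot_members = parser.can_fetch("Googlebot", "/members/article.html")
googlebot_public = parser.can_fetch("Googlebot", "/public/index.html")
other_members = parser.can_fetch("SomeOtherBot", "/members/article.html")
other_private = parser.can_fetch("SomeOtherBot", "/private/notes.html")

print(googlebot_members, googlebot_public, other_members, other_private)
```

Keep in mind that robots.txt is a request that well-behaved crawlers honor. A page that Googlebot is asked not to visit never ends up in the index, so there is nothing left for a visitor with a changed user agent to gain.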
Example user-agent strings
Browsers
Internet Explorer 5.5 on Windows 2000: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)
Internet Explorer 6.0 in MSN on Windows 98: Mozilla/4.0 (compatible; MSIE 6.0; MSN 2.5; Windows 98)
Internet Explorer 6.0 on Windows XP: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
Internet Explorer 7.0 beta running on Windows Longhorn: Mozilla/4.0 (compatible; MSIE 7.0b; Windows NT 6.0)
Internet Explorer 5.2 on Mac OS X: Mozilla/4.0 (compatible; MSIE 5.23; Mac_PowerPC)
Konqueror 3.1 (French): Mozilla/5.0 (compatible; Konqueror/3.1; Linux 2.4.22-10mdk; X11; i686; fr, fr_FR)
Mozilla 1.7.8 on Linux: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.8) Gecko/20050511
Mozilla Firefox 1.0.4 on Windows XP: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.8) Gecko/20050511 Firefox/1.0.4
Mozilla Firefox 1.0.4 on Ubuntu Linux, on AMD64: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.6) Gecko/20050512 Firefox
Mozilla Firefox 1.0.4 on FreeBSD 5.4 on i386: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.7.8) Gecko/20050609 Firefox/1.0.4
Netscape 4.8 on Windows 2000: Mozilla/4.8 [en] (Windows NT 5.0; U)
Netscape 7 on Sun Solaris 8: Mozilla/5.0 (X11; U; SunOS sun4u; en-US; rv:1.0.1) Gecko/20020920 Netscape/7.0
Netscape 8.0.1 on Windows XP using Gecko: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20050519
Netscape 8.0.1 on Windows XP using MSHTML (with .NET installed): Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1
Opera 6.03 on Windows 2000, cloaked as MSIE: Mozilla/4.0 (compatible; MSIE 5.0; Windows 2000) Opera 6.03 [en]
Opera 7.23 on Windows 98: Opera/7.23 (Windows 98; U) [en]
Opera 8.00 on Windows XP: Opera/8.00 (Windows NT 5.1; U; en)
Opera 8.00 on Gentoo Linux: Opera/8.0 (X11; Linux i686; U; cs)
Safari v125 on Mac OS X: Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) AppleWebKit/124 (KHTML, like Gecko) Safari/125
Safari v125 on Mac OS X, cloaked as MSIE: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2)
ELinks 0.4pre5 on Linux: ELinks (0.4pre5; Linux 2.4.27 i686; 80x25)
Links 0.99pre14 under Cygwin on Windows 2000: Links (0.99pre14; CYGWIN_NT-5.0 1.5.16(0.128/4/2) i686; 80x25)
Links 2.1pre17 under Gentoo Linux: Links (2.1pre17; Linux 2.6.11-gentoo-r8 i686; 80x24)
Lynx 2.8.4rel.1 on Linux: Lynx/2.8.4rel.1 libwww-FM/2.14
Off By One 3.5a on Windows XP: Mozilla/4.7 (compatible; OffByOne; Windows 2000)
w3m on FreeBSD: w3m/0.5.1
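To show how little a server can conclude from these strings, here is a naive Python sketch that guesses the browser from a user-agent string by looking for a few well-known markers. The marker table and the function name guess_browser are assumptions made for this example, not a complete or reliable detection method; the test strings are copied from the list above.

```python
# Checked in order: Opera comes before MSIE because a cloaked Opera
# string contains both markers.
MARKERS = [
    ("Opera", "Opera"),
    ("Konqueror", "Konqueror"),
    ("Safari", "Safari"),
    ("Firefox", "Firefox"),
    ("Netscape", "Netscape"),
    ("MSIE", "Internet Explorer"),
    ("Lynx", "Lynx"),
]


def guess_browser(user_agent):
    """Return the first browser name whose marker occurs in the string."""
    for marker, name in MARKERS:
        if marker in user_agent:
            return name
    return "unknown"


opera_cloaked = "Mozilla/4.0 (compatible; MSIE 5.0; Windows 2000) Opera 6.03 [en]"
safari_cloaked = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2)"

print(guess_browser(opera_cloaked))   # the trailing Opera token gives it away
print(guess_browser(safari_cloaked))  # nothing left to tell it apart from MSIE
```

The cloaked Safari string is reported as Internet Explorer, because the string itself contains no trace of Safari. The server can only repeat what the client claims to be.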