Access Websites As Google Bot - gHacks Tech News

Access Websites As Google Bot

Google Bot (officially Googlebot) is the general term for Google's automated web crawling service, which feeds the Google search engine. When it requests a web page, it identifies itself with a Google Bot user agent. Websites use this user agent for several purposes, including identification and access restrictions.

Webmasters can, for instance, filter Google Bot out of their website statistics to get a better picture of how many real users visit the site in a given period.
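As a rough illustration of that kind of filtering, here is a minimal Python sketch. It assumes the access log uses the common "combined" format, where the user agent is the last quoted field on each line. The names user_agent_of and count_human_hits are made up for this example and do not come from any particular statistics tool.

```python
import re

# Assumption: the user agent is the last double-quoted field on the line,
# as in the widely used "combined" access log format.
UA_FIELD = re.compile(r'"([^"]*)"\s*$')


def user_agent_of(log_line):
    """Return the user-agent field of a log line, or "" if none is found."""
    match = UA_FIELD.search(log_line)
    return match.group(1) if match else ""


def count_human_hits(log_lines):
    """Count lines whose user agent does not claim to be Googlebot.

    Lines without a recognizable user-agent field are counted as human.
    """
    return sum(
        1
        for line in log_lines
        if "googlebot" not in user_agent_of(line).lower()
    )
```

A real statistics package would normally filter many more crawlers than this one.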

Some webmasters and services, on the other hand, try to cheat by giving Google Bot access to all of their content while displaying a registration or purchase page to users who want to access the same information.

This practice, known as cloaking, is not allowed under Google's webmaster guidelines, but some webmasters do it nevertheless.

Some users have now had the idea of posing as Google Bot to access the information without buying or registering first.

Be The Bot is a website that simplifies the process. It contains a form in which a web address can be entered, and the user can choose to pose as either Google Bot or Yahoo Bot. The requested URL is then displayed on the same page.

Have you ever been googling something, and you see exactly what you need in the preview, but when you click the link it doesn't show you what you want to see?
This is because the owners of the site are trying to trick you into buying something, or registering. It's a common tactic on the internet. When Google visits the site, it sends something called a "header" (the User-Agent header). This header tells the site who the visitor is. Google's header contains "Googlebot". The programmers of the site check whether the header says "Googlebot", and if it does, the site opens up all of its content for Google's eyes only.
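The check described above might look roughly like the following sketch. This is an assumption about how such a site could be written, not code from any real site. The function name page_for and the two return strings are invented for illustration.

```python
def page_for(request_headers):
    """Pick which page to serve based only on the User-Agent header.

    Hypothetical sketch of the server-side check: the header value is
    supplied by the client, so the server has no proof it is genuine.
    """
    user_agent = request_headers.get("User-Agent", "")
    if "Googlebot" in user_agent:
        return "full content"
    return "registration page"
```

Because the decision rests on a value the visitor sends, any visitor who sends the same header gets the same treatment as the crawler.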

This works on any site that gives Google Bot or Yahoo Bot complete access based on the user agent alone, but blocks ordinary visitors by asking them to register or buy first.

It works, for instance, on the Washington Post website, which asks visitors to register before they can read the content posted on the site. Copying a URL from the Post's website, or entering washingtonpost.com, into the URL form at Be The Bot provides immediate, unrestricted access to the content. (via Online Tech Tips)
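How Be The Bot is implemented is not documented here, but the same idea can be sketched in a few lines of Python using the standard library. The user-agent strings below are the commonly published ones for Google's and Yahoo's crawlers and should be treated as assumptions. build_bot_request and fetch_as_bot are hypothetical helper names.

```python
import urllib.request

# Assumption: commonly published crawler user-agent strings.
BOT_USER_AGENTS = {
    "google": "Mozilla/5.0 (compatible; Googlebot/2.1; "
              "+http://www.google.com/bot.html)",
    "yahoo": "Mozilla/5.0 (compatible; Yahoo! Slurp; "
             "http://help.yahoo.com/help/us/ysearch/slurp)",
}


def build_bot_request(url, bot="google"):
    """Build a request whose User-Agent header poses as the chosen bot."""
    return urllib.request.Request(
        url, headers={"User-Agent": BOT_USER_AGENTS[bot]}
    )


def fetch_as_bot(url, bot="google", timeout=10):
    """Fetch the page with the spoofed header and return it as text."""
    request = build_bot_request(url, bot)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling fetch_as_bot("http://www.washingtonpost.com/") would send the request with the Google Bot header. Whether the full content comes back depends entirely on how the site checks its visitors.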

Comments

  1. Miguel said on May 5, 2010 at 4:52 pm

    This may be really useful, thanks :)

    Sometimes you can get the full content of those pages by clicking the “Cache” link on the Google results page. But when there is no such link, this site may be useful.

  2. George said on May 6, 2010 at 8:50 pm

    Maybe we want to stick with Chrome or Opera. I know FF is slow as balls on my computer.

  3. Ody20 said on September 24, 2011 at 9:14 pm

    world boss
