Computer Forums


yetanotherfcw 04-21-2006 04:24 AM

Get robots.txt to work
 
So, I used the following robots.txt, but it does not seem to work.
Quote:

Disallow: /post-*.html$
Disallow: /updates-topic.html*$
Disallow: /stop-updates-topic.html*$
Disallow: /ptopic*.html$
Disallow: /ntopic*.html$
I really want this to work, so I am thinking about changing it to one of the following:
Quote:

Disallow: /post-*.html
Disallow: /updates-topic.html*
Disallow: /stop-updates-topic.html*
Disallow: /ptopic*.html
Disallow: /ntopic*.html
Quote:

Disallow: /post-*.html
Disallow: /updates-topic.html
Disallow: /stop-updates-topic.html
Disallow: /ptopic*.html
Disallow: /ntopic*.html
How can I get this robots.txt to work? What (minor) modifications do I need to make?

Soulwatcher 04-21-2006 05:02 AM

:confused: I don't know much about robots.txt, but Nikolas over at http://www.webdigity.com/ could probably answer all of your questions. ;)


Soulwatcher

southernlady 04-21-2006 05:08 AM

yetanotherfcw, why do you want to turn off robot crawling? Search is where your traffic will eventually come from. If you don't allow the Google, MSN, and Yahoo search bots to crawl your site, you limit how much of it shows up in their indexes. Liz

Coop 04-21-2006 06:53 AM

I'm not an expert on robots.txt either, but I was under the impression that you needed a User-agent line for it to work, so try this

Quote:

User-agent: *
Disallow: /post-*.html
Disallow: /updates-topic.html*
Disallow: /stop-updates-topic.html*
Disallow: /ptopic*.html
Disallow: /ntopic*.html
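
For anyone who wants to check the User-agent point locally, here is a minimal sketch using Python's standard urllib.robotparser. The host www.example.com and the helper name is_allowed are made up for illustration, and this parser is only one implementation; real crawlers may behave differently. As far as I know it does plain prefix matching and has no wildcard support, so the sketch uses a rule without `*`.

```python
from urllib.robotparser import RobotFileParser

def is_allowed(robots_txt, url, agent="*"):
    """Parse robots.txt text and ask whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

# Hypothetical URL, used only for this illustration.
URL = "http://www.example.com/updates-topic.html"

# Rules with no User-agent line: this parser drops them.
WITHOUT_AGENT = "Disallow: /updates-topic.html\n"

# Same rule inside a proper record.
WITH_AGENT = "User-agent: *\nDisallow: /updates-topic.html\n"

# A blank line between User-agent and Disallow ends the record early
# in this parser, so the rule is dropped as well.
BLANK_LINE = "User-agent: *\n\nDisallow: /updates-topic.html\n"

for name, text in [("without agent", WITHOUT_AGENT),
                   ("with agent", WITH_AGENT),
                   ("blank line", BLANK_LINE)]:
    print(name, "-> allowed:", is_allowed(text, URL))
```

Only the second variant actually blocks the page in this parser, which suggests keeping the Disallow lines directly under the User-agent line.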
Also, not all robots understand the wildcards yet, and some completely ignore robots.txt, so if you are having trouble with a particular crawler, you may need to add different sections for each one, or even ban their IP blocks (usually required for email harvesters etc.).
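
To see what the wildcard rules would match for a crawler that does support them, here is a rough sketch. It assumes the common convention that `*` matches any run of characters, a trailing `$` anchors the end of the URL, and everything else is a prefix match. The function names and sample paths are made up for illustration; this is not any crawler's actual code.

```python
import re

def rule_to_regex(rule):
    """Turn a robots.txt path rule into a regex (assumed wildcard semantics)."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    # \Z anchors at the very end; without it the rule is a prefix match.
    return re.compile(body + (r"\Z" if anchored else ""))

def is_blocked(path, rules):
    """True if any rule matches the path from its first character."""
    return any(rule_to_regex(rule).match(path) for rule in rules)

# Two of the rules from the first post, with and without the trailing '$'.
ANCHORED = ["/post-*.html$", "/ptopic*.html$"]
UNANCHORED = ["/post-*.html", "/ptopic*.html"]

for path in ["/post-123.html", "/post-123.html?sid=abc", "/index.html"]:
    print(path, is_blocked(path, ANCHORED), is_blocked(path, UNANCHORED))
```

Under these assumptions the `$` versions stop matching as soon as a query string such as a session ID is appended, while the versions without `$` still match. That may be one reason the original file seemed not to work.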

Coop

Ashley 04-21-2006 07:22 AM

The best thing to do would be to use a robots.txt generator. Try Googling; there are several. This will make sure that bots can understand it.

Secondly, not every bot follows robots.txt, so if you want to block those you will need to use .htaccess.

Make sure that the file is in the root of your site (e.g. /public_html) and chmodded so that it is world-readable (644 is typical).
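
A quick way to check the location is to request the file the way a crawler would, from the root of the host. This is only a sketch: the helper names robots_url and fetch_robots are made up, and www.example.com stands in for your own site.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen

def robots_url(any_page_url):
    """Build the robots.txt URL at the root of the host serving `any_page_url`."""
    parts = urlsplit(any_page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

def fetch_robots(any_page_url, timeout=10):
    """Download robots.txt; urlopen raises HTTPError if it is missing or forbidden."""
    with urlopen(robots_url(any_page_url), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")

# Crawlers look at the host root, not inside a subdirectory.
print(robots_url("http://www.example.com/forums/index.php"))
```

Calling fetch_robots with one of your own page URLs should return the file's text. A 404 or 403 error there would point to the wrong directory or wrong permissions.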

LucnetSolutions 04-21-2006 09:55 AM

Go to Google and sign up for their validation. They have step-by-step instructions to help you. They helped me set mine up.

yetanotherfcw 04-21-2006 07:44 PM

Quote:

Originally Posted by LucnetSolutions
Go to Google and sign up for their validation. They have step-by-step instructions to help you. They helped me set mine up.

Is there a link for this?

