04-21-2006, 04:24 AM
|
Member
GB Advanced User
|
|
Join Date: Mar 2006
Posts: 60
|
|
Get robots.txt to work
So, I used the following robots.txt, but it does not seem to work.
Quote:
Disallow: /post-*.html$
Disallow: /updates-topic.html*$
Disallow: /stop-updates-topic.html*$
Disallow: /ptopic*.html$
Disallow: /ntopic*.html$
|
I really want this to work, so I am thinking about making changes like the following:
Quote:
Disallow: /post-*.html
Disallow: /updates-topic.html*
Disallow: /stop-updates-topic.html*
Disallow: /ptopic*.html
Disallow: /ntopic*.html
|
Quote:
Disallow: /post-*.html
Disallow: /updates-topic.html
Disallow: /stop-updates-topic.html
Disallow: /ptopic*.html
Disallow: /ntopic*.html
|
How can I get this robots.txt to work? What (minor) modifications do I need to make?
|

04-21-2006, 05:02 AM
|
 |
Senior Member
GB GEEK
|
|
Join Date: Feb 2006
Posts: 309
|
|
I don't know much about robots.txt, but Nikolas over at http://www.webdigity.com/ could probably answer all of your questions.
Soulwatcher
|

04-21-2006, 05:08 AM
|
 |
Junior Member
GB Beginner
|
|
Join Date: Apr 2006
Posts: 20
|
|
yetanotherfcw, why do you want to turn off your robot search? That's where your traffic will come from eventually. If you don't allow the Google, MSN, and Yahoo search bots to crawl your pages, you limit your presence in their indexes. Liz
|

04-21-2006, 06:53 AM
|
Member
GB Beginner
|
|
Join Date: Apr 2006
Posts: 34
|
|
I'm not an expert on robots.txt either, but I was under the impression that you need a User-agent line for it to work, so try this:
Quote:
User-agent: *
Disallow: /post-*.html
Disallow: /updates-topic.html*
Disallow: /stop-updates-topic.html*
Disallow: /ptopic*.html
Disallow: /ntopic*.html
|
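To check the point about the User-agent line, here is a minimal sketch using Python's standard `urllib.robotparser`. The helper name `can_crawl` and the two sample files are made up for illustration, and the rule is deliberately wildcard-free, since this particular parser only does plain prefix matching as far as I know.

```python
from urllib.robotparser import RobotFileParser

def can_crawl(robots_txt, path, agent="*"):
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)

# Hypothetical sample: a Disallow rule with no User-agent line in front of it.
WITHOUT_AGENT = """\
Disallow: /updates-topic.html
"""

# The same rule, now placed under a User-agent line.
WITH_AGENT = """\
User-agent: *
Disallow: /updates-topic.html
"""

if __name__ == "__main__":
    print("no User-agent line:", can_crawl(WITHOUT_AGENT, "/updates-topic.html"))
    print("with User-agent line:", can_crawl(WITH_AGENT, "/updates-topic.html"))
```

With this parser, the rule without a User-agent line should be ignored (the page stays crawlable), while the second file should block it. Other crawlers may behave differently, so treat this as a sanity check only.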
Also, not all robots understand the wildcards yet, and some completely ignore robots.txt, so if you are having trouble with a particular crawler, you may need to add different sections for each one, or even ban their IP blocks (usually required for the email harvesters etc.).
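For the crawlers that do support wildcards, a rough way to see what a pattern would catch is to translate it into a regular expression. This is only a sketch under my own assumptions: `*` matches any run of characters, a trailing `$` anchors the end of the URL, and everything else is a literal prefix. The function names and the test paths are hypothetical, not taken from any crawler's code.

```python
import re

def robots_pattern_to_regex(pattern):
    """Translate a robots.txt path pattern with '*' and '$' into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with "match anything".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_disallowed(path, patterns):
    """True if any pattern matches the path, starting from its first character."""
    return any(robots_pattern_to_regex(p).match(path) for p in patterns)

# Rules from the thread, without the trailing '$'.
RULES = [
    "/post-*.html",
    "/updates-topic.html",
    "/stop-updates-topic.html",
    "/ptopic*.html",
    "/ntopic*.html",
]

if __name__ == "__main__":
    for path in ["/post-123.html", "/ptopic45.html", "/index.html"]:
        print(path, "blocked" if is_disallowed(path, RULES) else "allowed")
```

Under these assumptions, a rule ending in `$` stops matching as soon as the URL carries a query string, which is one reason to leave the `$` off.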
Coop
|

04-21-2006, 07:22 AM
|
 |
Junior Member
GB Newbie
|
|
Join Date: Mar 2006
Posts: 12
|
|
The best thing to do would be to use a robots.txt generator. Try Googling; there are several. This will make sure that bots can understand it.
Secondly, not every bot follows robots.txt, so if you want to block those bots you will need to use .htaccess.
Make sure that robots.txt is in the root of your site [e.g. /public_html] and chmodded so that the web server can read it (for example, 644).
|

04-21-2006, 09:55 AM
|
 |
Member
GB Beginner
|
|
Join Date: Mar 2006
Posts: 49
|
|
Go to Google and sign up for their validation. They have step-by-step instructions to help you. They helped me to set mine up.
|

04-21-2006, 07:44 PM
|
Member
GB Advanced User
|
|
Join Date: Mar 2006
Posts: 60
|
|
Quote:
Originally Posted by LucnetSolutions
Go to Google and sign up for their validation. They have step-by-step instructions to help you. They helped me to set mine up.
|
Is there a link for this?
|