Robots.txt
laufer
06-27-2008, 06:04 PM
I made a robots.txt and would like you to tell me whether it is correct:
User-Agent: *
Disallow: /lang/
Disallow: /css/
Disallow: /images/
Disallow: /js/
Disallow: /aj_cautare.php/
Disallow: /administrare/
Disallow: /inc/
Allow: /
I want Google or any other bot not to index the directories that are marked with Disallow.
Is it done correctly?
Krumel
06-27-2008, 07:58 PM
A record contains the instructions for a specific search engine. Each record consists of two fields: a User-agent line and one or more Disallow lines. Here's an example:
User-agent: googlebot
Disallow: /cgi-bin/
This robots.txt file would allow "googlebot", which is the search engine spider of Google, to retrieve every page from your site except for files from the "cgi-bin" directory. All files in the "cgi-bin" directory will be ignored by googlebot.
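As a rough way to check this record, the sketch below feeds it to the urllib.robotparser module from Python's standard library. The host example.com, the file names, and the bot name "otherbot" are made up for illustration, and real crawlers may handle edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# The record from the example above.
ROBOTS_TXT = """\
User-agent: googlebot
Disallow: /cgi-bin/
"""

def build_parser(robots_txt):
    """Parse robots.txt text held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser

parser = build_parser(ROBOTS_TXT)

# Hypothetical URLs, used only to exercise the rule.
blocked = parser.can_fetch("googlebot", "http://example.com/cgi-bin/form.pl")
open_page = parser.can_fetch("googlebot", "http://example.com/index.html")
# No record mentions this bot, so the parser applies no restriction to it.
other_bot = parser.can_fetch("otherbot", "http://example.com/cgi-bin/form.pl")

print(blocked, open_page, other_bot)  # False True True
```

Note that the record only restricts googlebot; a bot that matches no record is not restricted at all.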
The Disallow value is matched as a path prefix, so it works like a trailing wildcard. If you enter
User-agent: googlebot
Disallow: /support
both "/support.html" and "/support/index.html", as well as all other files in the "support" directory, would not be crawled by googlebot.
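The prefix behaviour can be sketched in a few lines of Python. The function is_disallowed is a hypothetical helper written for this illustration, not part of any library, and it covers plain prefix matching only. Applied to the file in the first post, it suggests that "Disallow: /aj_cautare.php/" would not block the script itself, because the URL of a file has no trailing slash.

```python
from typing import Iterable

def is_disallowed(path: str, disallow_rules: Iterable[str]) -> bool:
    """Return True if path starts with any non-empty Disallow value.

    Simplified sketch: plain prefix matching only, no Allow lines
    and no pattern extensions.
    """
    return any(rule != "" and path.startswith(rule) for rule in disallow_rules)

# The "/support" rule from the example above.
for path in ["/support.html", "/support/index.html", "/sales.html"]:
    print(path, is_disallowed(path, ["/support"]))
# /support.html True
# /support/index.html True
# /sales.html False

# The rule from the first post: the trailing slash turns the file name
# into a directory-style prefix, which the file's own URL does not match.
print(is_disallowed("/aj_cautare.php", ["/aj_cautare.php/"]))  # False
print(is_disallowed("/aj_cautare.php", ["/aj_cautare.php"]))   # True
```

Under this reading, dropping the trailing slash from the file rule is the safer choice, while directory rules such as /css/ keep theirs.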
If you leave the Disallow line blank, you're telling the search engine that all files may be indexed. In any case, you must enter a Disallow line for every User-agent record.
If you want to give all search engine spiders the same rights, use the following robots.txt content:
User-agent: *
Disallow: /cgi-bin/
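The wildcard record above and the blank Disallow line described earlier can be compared with Python's standard urllib.robotparser. This is only a sketch: the bot names, the host example.com, and the helper name allowed are assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

def allowed(robots_txt, agent, url):
    """Return True if the parser would let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

# Same rights for every spider, as in the record above.
SAME_RIGHTS = "User-agent: *\nDisallow: /cgi-bin/\n"
# Blank Disallow line: nothing is excluded.
ALLOW_ALL = "User-agent: *\nDisallow:\n"

URL = "http://example.com/cgi-bin/form.pl"  # hypothetical URL

# Hypothetical bot names; the "*" record applies to each of them.
results = {
    bot: (allowed(SAME_RIGHTS, bot, URL), allowed(ALLOW_ALL, bot, URL))
    for bot in ["googlebot", "otherbot", "examplebot"]
}
for bot, (with_rule, with_blank) in results.items():
    print(bot, with_rule, with_blank)
# Each line prints: <bot> False True
```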
A resource (http://www.searchenginepromotionhelp.com/m/articles/search-engine-optimization/robots-txt-explained.php) that I learned from myself. I hope it clears things up for you.
Krumel
07-01-2008, 04:08 PM
Managing Robot's Access To Your Website (http://janeandrobot.com/post/Managing-Robots-Access-To-Your-Website.aspx)
Controlling what content is blocked from being found in search engines is crucial for many websites. Fortunately, the major search engines and other well-behaved robots observe the Robots Exclusion Protocol (REP), which has evolved organically since the early 1990s to provide a set of controls over what parts of a web site search engine robots can crawl and index.
Article Sections:
* Capabilities of REP
* Deciding What Should be Public vs. Private
* Implementing the REP
  o Site Level
  o Page Level (Meta Tags)
  o Page Level (HTTP Header)
  o Content Level
* Common Mistakes
* Testing Your Implementation
* Removing Content From Search Engine Indices
* Additional Resources
An excellent article written by Vanessa Fox.
laufer
07-02-2008, 09:08 AM
Thank you very much for the sources. They are excellent.
Krumel
07-14-2008, 09:12 PM
Robots.txt : 4 Things You Should Know (http://www.searchenginejournal.com/robotstxt-4-things-you-should-know/7292/)
Equally useful.