Poor indexation is one of the most common problems webmasters face. Why is your site, or why are some of its pages, still not in the index? To answer this question, you need to do a little analysis. Below is a list of the main causes of poor indexation, which you can use as a checklist for troubleshooting.
There are 5 main reasons why a site or individual documents may not make it into the index (or do so with difficulty):
- The robot does not know about your website or documents
- The website or part of it is not accessible to robots
- The site is blacklisted
- There is a technical error
- Individual pages and sections are of poor quality
Let’s discuss them in detail.
The robot does not know about the site or documents
The robot may not know about your website for several reasons.
Not enough time has passed
For the robot to learn about a new site or page, it needs time to find it. You can speed this process up with the search engines' "Add URL" tools.
If you see that a bot has visited your site but a page is still not in the index, you need to wait for the next index update.
No sources link to your website or document
If a website is rarely updated, bots also visit it rarely. When adding new pages, make sure there are links on your website leading to them; this helps bots find the new pages and add them to the index.
Your website or part of it is not accessible to robots
Even if search engines know about your website, you may consciously or unconsciously deny them access to individual sections and documents.
The domain is not delegated (or has been removed from delegation after a complaint)
Make sure the domain you purchased is delegated and that the site is available not only to you but to other Internet users as well. Ask friends in another city to visit the site and check that it opens.
Also, with the adoption of anti-piracy laws, some sites may be removed from delegation. This is a rare case, but if you publish pirated content (movies, music videos, games, and other intellectual property), it is quite possible that someone will file a complaint against you.
Blocked in the robots.txt file
Open the /robots.txt file in the site root (if it exists) and make sure that everything you want indexed is open for indexation. Some developers add the directive "Disallow: /" (which prohibits indexing the whole site) during testing and forget to remove it.
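As a quick sanity check, you can parse your robots.txt with Python's standard urllib.robotparser module and see whether a given URL is open to a given bot. The rules and URL below are made-up examples showing the forgotten testing directive:

```python
from urllib.robotparser import RobotFileParser

# A leftover testing rule that blocks the whole site (illustrative example)
rules = """
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# With "Disallow: /" in place, no page is open for indexation
print(parser.can_fetch("Googlebot", "https://example.com/article.html"))  # False
```

Run the same check against your real /robots.txt for every section you expect to be indexed.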
Blocked with the "robots" meta tag
The "robots" meta tag is placed inside the "head" tag and is the second way to prohibit indexing of documents. Some CMSs, WordPress for example, let you manage it flexibly, and not every editor remembers this tag exists.
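A minimal sketch of how to find this tag in a page's HTML with Python's standard html.parser; the sample markup is invented for illustration:

```python
from html.parser import HTMLParser

class RobotsMetaFinder(HTMLParser):
    """Collects the content of every <meta name="robots"> tag in a page."""
    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name", "").lower() == "robots":
            self.directives.append(attrs.get("content", ""))

html = '<html><head><meta name="robots" content="noindex, nofollow"></head><body></body></html>'
finder = RobotsMetaFinder()
finder.feed(html)
print(finder.directives)  # ['noindex, nofollow']
```

If "noindex" appears in the collected directives, that page is closed from indexation no matter what robots.txt says.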
Blocked by IP or User-Agent
You need to solve this problem with your hosting company. An IP address gets blacklisted by mistake very rarely, but it happens. If you want to know for sure whether bots have visited your website, you can check by analyzing the server logs (access_log).
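A simple way to scan the access_log for crawler visits is to match known bot names in the User-Agent field. This is a sketch with a made-up log line; the bot list is illustrative, not exhaustive:

```python
import re

# Typical Apache "combined" log line (made-up example data)
log_line = ('66.249.66.1 - - [10/Oct/2013:13:55:36 +0400] "GET /page.html HTTP/1.1" '
            '200 2326 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"')

def is_search_bot(line, bots=("Googlebot", "bingbot", "YandexBot")):
    """Returns the bot name if the log line's User-Agent matches a known crawler."""
    for bot in bots:
        if re.search(bot, line):
            return bot
    return None

print(is_search_bot(log_line))  # 'Googlebot'
```

If no bot names ever appear in the log, the robots may be blocked at the server level.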
Blocked via the X-Robots-Tag HTTP header
Webmasters rarely use this method, but HTTP headers can also prohibit indexing. You can check for it using, for example, the Firebug extension for Firefox.
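Once you have the response headers (for example from `curl -I`), checking them is straightforward. This sketch works on a plain dictionary of headers with illustrative values:

```python
def blocked_by_x_robots(headers):
    """Checks a response-header mapping for an X-Robots-Tag that forbids indexing."""
    value = ""
    for name, v in headers.items():
        if name.lower() == "x-robots-tag":
            value = v.lower()
    return "noindex" in value or "none" in value

# Example headers as a tool like curl -I would show them (illustrative values)
headers = {"Content-Type": "text/html", "X-Robots-Tag": "noindex, nofollow"}
print(blocked_by_x_robots(headers))  # True
```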
Using Flash or Ajax navigation
Search engine bots index Flash and Ajax with difficulty. If these elements hide the navigation, it will be hard for bots to find and index the pages behind it. If you want to see your website the way Googlebot sees it, you can do so in Google Webmaster Tools.
Closed with "noindex"
When closing something "unnecessary" from indexation, you may close something necessary by mistake. Never use "noindex" unless you are sure you can do it right; many webmasters only harm their websites with it.
The site is blacklisted
There are several reasons why a website may get into a search engine's blacklist, and this results in the absence of indexation. The main reasons are:
Search engines have imposed sanctions on your website (it has been penalized)
Sometimes the sanctions are evident, and sometimes we do not even know about them (for example, when we have just purchased the website). Either way, you need to make sure your domain is clean. Usually search engines impose sanctions on a website if it:
- Manipulates the SERP with aggressive optimization techniques (hides SEO content, shows users different content, or promotes the site with spam)
- Was created only for search engines and not for humans (visitors find nothing useful there)
- Is a copy of an already existing website (different domains, one owner)
- Has a bad domain history
Your resource spreads viruses
It happens that hackers break into a site and place malicious code or malware on it. When a search engine finds it, it stops indexing the website until the malware is removed. To catch such a problem in time, you need to monitor your website and its webmaster panel.
There is a technical error
Often the cause of poor indexation is an elementary technical error, and eliminating it quickly solves the problem.
The response status code must be 200 for pages that should be in the index. You can check this in different ways; there are many programs for that purpose (for example, Screaming Frog, Firebug, Xenu, etc.).
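A small helper that interprets the status code a crawler receives, following the point above (200 is indexable, everything else needs attention). The verdict strings are my own wording, not any search engine's terminology:

```python
def indexable_status(code):
    """Maps an HTTP status code to a short verdict for index troubleshooting."""
    if code == 200:
        return "ok: page can be indexed"
    if code in (301, 302):
        return "redirect: check that it points where you intend"
    if 400 <= code < 500:
        return "client error: page is effectively invisible to the robot"
    if code >= 500:
        return "server error: fix hosting/CMS before expecting indexation"
    return "unusual code: investigate manually"

print(indexable_status(200))
print(indexable_status(404))
```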
I have seen cases where additional tags before the DOCTYPE in the HTML code (<?xml or <script>) prevented pages from getting into the index. Make sure the code complies with the HTML standards and that the robot can easily identify the content type and the main blocks of the page.
The first case of incorrect use of redirects is when webmasters use a 302 redirect instead of a 301. In this case the old pages will not be replaced with the new ones in the index, because a temporary redirect is used instead of a permanent one.
The second case of bad indexation caused by redirects is using the rel="canonical" tag with the same canonical page specified for all documents.
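The canonical misconfiguration is easy to spot programmatically: if every page declares the same canonical URL, only that one page will be kept. A sketch using Python's standard html.parser, with invented sample pages:

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Extracts the href of a <link rel="canonical"> tag, if present."""
    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and attrs.get("rel", "").lower() == "canonical":
            self.href = attrs.get("href")

def canonical_of(html):
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.href

# Two different articles both pointing at the homepage (the misconfiguration)
pages = {
    "/article-1": '<head><link rel="canonical" href="https://example.com/"></head>',
    "/article-2": '<head><link rel="canonical" href="https://example.com/"></head>',
}
targets = {canonical_of(doc) for doc in pages.values()}
# Many pages sharing one canonical target is a red flag
print(len(pages) > 1 and len(targets) == 1)  # True
```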
Problems with encoding
There are several ways to inform the robot about the encoding of a document: meta tags, HTTP headers, and the content itself. Normally, determining the encoding is easy as pie for search engines. However, there are rare occasions when the HTTP headers say one thing and the meta tags say another. Then only a set of garbled symbols gets into the index, and it shows poor content quality.
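You can detect such a disagreement by extracting the charset declared in each place and comparing them. A sketch with invented header and markup values:

```python
import re

def declared_encodings(headers, html):
    """Returns (header charset, meta charset) so a mismatch is easy to spot."""
    header_charset = None
    m = re.search(r"charset=([\w-]+)", headers.get("Content-Type", ""), re.I)
    if m:
        header_charset = m.group(1).lower()
    meta_charset = None
    m = re.search(r'<meta[^>]+charset=["\']?([\w-]+)', html, re.I)
    if m:
        meta_charset = m.group(1).lower()
    return header_charset, meta_charset

headers = {"Content-Type": "text/html; charset=iso-8859-1"}  # illustrative values
html = '<head><meta charset="utf-8"></head>'
hdr, meta = declared_encodings(headers, html)
print(hdr != meta)  # True -> the headers and the meta tag disagree
```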
Individual pages and sections are of poor quality
If everything is fine with the site technically and there are no complaints about aggressive optimization methods, the search engine indexes the site gradually. At first it assigns a small quota for the number of pages to be indexed. If, after accumulating statistics, the search engine sees that the pages are of good quality, the quota increases and more pages can get into the index. What signs indicate good or poor document quality?
Content already exists on other sites (not unique or duplicate content)
Before indexation, search engines do not know whether the content is unique. If after indexation you continue to publish non-unique content, your website will disappear from the index, because search engines do not accept it: there is no sense in indexing the same information several times.
Content already exists in the other sections of this site
I mean duplicate pages within one website. A CMS often generates many duplicate pages; you need to close them from indexation. Make sure every page has value for users.
The volume of unique text on the page is less than 500 characters
A small volume of unique text makes it harder for the search algorithm to determine the value of the content for users. Pages containing only 80-100 words frequently fail to get into the index. It is better to write text that is not only unique but also long.
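A rough way to audit pages against the 500-character threshold is to strip the tags and count what is left. This is a crude sketch (a real audit would also exclude navigation and boilerplate); the sample page is invented:

```python
import re

def unique_text_length(html):
    """Rough count of visible text characters after stripping tags and whitespace."""
    text = re.sub(r"<[^>]+>", " ", html)      # drop tags
    text = re.sub(r"\s+", " ", text).strip()  # collapse whitespace
    return len(text)

page = "<html><body><h1>Short page</h1><p>Only a few words here.</p></body></html>"
length = unique_text_length(page)
print(length, "characters -", "risky" if length < 500 else "fine")
```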
Documents in this section have template headlines and texts
Search engines do not like patterns (when only one or two words change between pages and the rest of the content stays the same) and try not to index a lot of template pages. If you want your pages to get into the index, write their titles and meta descriptions yourself.
Pages are nested more than 4 levels deep
The deeper a page is nested, the lower its weight and importance for search engines (and for users as well). Important pages that lie deep within the site should be brought up to the second or third level of nesting with additional internal linking.
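For sites where the URL path mirrors the site structure, the nesting level can be estimated by counting path segments. A sketch with an invented URL (note this measures URL depth, not click depth, which may differ):

```python
from urllib.parse import urlparse

def nesting_level(url):
    """Counts path segments: https://example.com/a/b/c.html -> level 3."""
    path = urlparse(url).path
    return len([seg for seg in path.split("/") if seg])

print(nesting_level("https://example.com/catalog/shoes/men/sport/item.html"))  # 5
```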
A large number of 404 pages
A growing number of not-found errors can cause indexation to stop. Therefore, you should monitor your site for various errors with the help of Google Webmaster Tools.
Slow page load speed
Slow load speed, caused by problems with the hosting service or the CMS, does not allow the robot to index the site quickly. Simple page speed optimization can significantly improve indexing.
Of course, there are other causes of poor indexation. If none of the points above applies to you, contact the search engine's support service or consult specialists.