I received the 'Indexed, though blocked by robots.txt' warning on Google Webmaster Tools
I don't want this page appearing in search results and had ticked the 'Hide this page from search engines' box in Weebly's SEO settings for the page. As I couldn't see anything wrong with it, I asked Google for a revalidation and the same error appeared. I then tried unticking the box and entering <meta name="robots" content="noindex"> in the header for the page, as I read here that Google can't exclude a page from the index until it is unblocked by robots.txt. I asked for revalidation again and the same issue reappeared.
I suspect the robots.txt file is preventing the page from being crawled, which stops the page being removed from the index because Google can't see the noindex directive. Is the fact that the page has a password related?
I have since reticked the box, removed the noindex meta tag from the header, and asked for a further revalidation, but I am expecting the same result as essentially nothing has changed. I don't think waiting for Google to recrawl my site will help, as asking for a revalidation is the same thing, isn't it?
How can I fix this and how did this happen in the first place as I hadn't changed the SEO settings on the page when the error first occurred?
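For anyone who wants to test the suspicion above (that robots.txt stops Googlebot from fetching the page, so it never sees the noindex tag), here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content and the example.com URLs are made up for illustration; they are not the actual rules Weebly generates for this site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
# The real file for the site in this thread is not shown.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private-page.html
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/private-page.html",
        "https://example.com/index.html",
    ):
        print(url, "crawlable:", is_crawlable(ROBOTS_TXT, url))
```

If the page comes back as not crawlable, a crawler that obeys robots.txt will not download the HTML, so a noindex meta tag on that page would go unread.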
Thanks for your question, @Magoo. Google will still know the page exists, but because of your robots.txt it's not actually going to include it with search results. Was the page ever not protected by a password and hidden from search engines?
You can see what I mean if you look through the results of a Google site search:
Thanks Adam, but that doesn't resolve my problem. What do I need to do to my website to get the Google 'Indexed, though blocked by robots.txt' warning to disappear? I'd rather not have a Google warning associated with my website due to the negative repercussions on my Google search ranking - I've just received notice that the 4th attempt at a Google validation has resulted in the same error.
The page has always been password protected and hidden from search engines.
I had success by temporarily disabling the password on the affected page, waiting for Google to recrawl the page, then reenabling the password.
If you don't want to disable the password then you could try making an identical page that is password protected, redirecting the old page to the new one, and then deleting the old page.
I hope this helps.
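If you go the first route and rely on a noindex meta tag like the one quoted earlier in the thread, it may be worth confirming that the page's HTML really contains it before waiting on a recrawl. Below is a rough sketch using Python's standard `html.parser`. The helper names (`RobotsMetaParser`, `has_noindex`) and the sample HTML are hypothetical, and it only looks at meta tags, not HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> / <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. a bare attribute), so default to "".
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for directive in attr.get("content", "").split(","):
                directive = directive.strip().lower()
                if directive:
                    self.directives.add(directive)


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with 'noindex'."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


if __name__ == "__main__":
    # Made-up page source for illustration.
    sample = '<html><head><meta name="robots" content="noindex"></head></html>'
    print("noindex present:", has_noindex(sample))
```

You would paste in the page source you see via "View source" in place of `sample`.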
I am having the same issue. The page is hidden from navigation, password protected, as well as marked hidden from search engines. However, Google is flagging it with an "Indexed, though blocked by robots.txt" error.
The page was never indexed, as I tested with the site:eyetechds.com search function and did not see the url in question come up.
Duplicating the page and having to set redirects to resolve the issue seems like a lot more work and creates the potential for broken links to this page... is there an alternative that a Weebly support specialist can recommend?
I'm not sure why Google would say it's indexed when you can demonstrate the page definitely is not indexed, or at least it isn't ranked or included with any search results. It might just be poor wording on Google's part - in other words, they're saying we know this page exists as part of your site, but we can't read or include it in results.
I'm getting the same issue for my website: georgetowntxparkinson.weebly.com.
No pages are password protected or marked as hidden from search engines.
Google URL inspection says the URL is not available to Google.
robots.txt file looks like this:
It looks like the site is indexed although Google doesn't seem to have a description for the homepage:
You might want to use their fetch-as-google tool to have Google index the site again:
I'm having this same problem with most of the pages on my site. In the example photos, I'm using the below link. Google says it's crawled but then the error messages say it's blocked by robots.txt. How can I fix this?
Also, I'm not sure if this is part of the problem, but I'm including this based on what other people were saying. I have this page hidden from navigation (although I'm having the same problem with pages that aren't hidden), it is not password protected nor has it ever been, and the 'hide from search engines' box keeps getting checked. I know I have unchecked this at least twice. Is there something I'm doing to make the box auto-check itself? Would it happen when I post a blog post?