
"Indexed, though blocked by robots.txt" But these pages are not blocked by the robots.txt.

Hi everyone,

 
I am using the new Search Console, under "Index" -> "Coverage".
It shows 56 pages as indexed but blocked by robots.txt, even though they are NOT blocked at all!
These pages have always been fully open for crawling and indexing, and they are being reported as blocked for NO REASON.
 
Then I removed all the rules from the robots.txt file, and it's still the same. (See the screenshot below.)
I have no clue why this happened and am still struggling to find a solution. Any thoughts would be much appreciated. Thanks so much!
11 Replies
Square

Hi @nuo I'm not seeing the screenshot. Can you please repost? Thanks so much!


Hi Bernadette,

Thanks for the response.

Below are the images:

image

Square

Can you check and see what the URLs are that it is blocking? I suspect they are all backend resources in /ajax, or something like that.
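
For anyone who wants to check this quickly, here is a rough Python sketch that groups a list of flagged URLs by their first path segment, so a cluster under something like /ajax stands out. The URLs below are made-up placeholders (not taken from anyone's actual report), and `group_by_first_segment` is just a hypothetical helper name.

```python
from collections import Counter
from urllib.parse import urlparse


def group_by_first_segment(urls):
    """Count URLs by their first path segment, e.g. '/ajax'."""
    counts = Counter()
    for url in urls:
        # Split the path and drop empty pieces caused by leading/trailing slashes.
        segments = [s for s in urlparse(url).path.split("/") if s]
        prefix = "/" + segments[0] if segments else "/"
        counts[prefix] += 1
    return counts


# Placeholder URLs standing in for a list copied out of the Coverage report.
flagged = [
    "https://www.example.com/ajax/apps/formSubmit.php",
    "https://www.example.com/ajax/api/JsonRPC/Commerce/",
    "https://www.example.com/store/cart",
]

for prefix, count in group_by_first_segment(flagged).most_common():
    print(prefix, count)
```

If most of the flagged URLs share one prefix, that prefix is the first thing to look for in robots.txt.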


I'm having this same problem.

I have 45 pages that are blocked by robots.txt, none of which are set to be hidden.

Here's one example:

https://www.appliedbiochemists.com/shoreklear-plus.html

But at https://www.appliedbiochemists.com/robots.txt you can see that neither this page nor the others Google says are blocked are on the list.

How do I get them to not be blocked?
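
A way to double-check this kind of thing locally is to run the rules through Python's standard `urllib.robotparser`. This is only a sketch: `SAMPLE_ROBOTS` is a made-up rule set, not the real file from any site in this thread, `is_allowed` is a hypothetical helper, and Python's parser does not match Google's handling in every detail (wildcards, for example), so treat the result as a sanity check rather than the final word.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Made-up rules for illustration only (assumption, not a real site's file).
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /ajax/
Disallow: /apps/
"""

print(is_allowed(SAMPLE_ROBOTS, "https://www.example.com/shoreklear-plus.html"))
print(is_allowed(SAMPLE_ROBOTS, "https://www.example.com/ajax/api"))
# An empty robots.txt has no rules, so nothing is disallowed.
print(is_allowed("", "https://www.example.com/shoreklear-plus.html"))
```

If a page comes back as allowed here but Search Console still reports it as blocked, the report may be based on an older copy of robots.txt than the one currently live.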

Square

Hi @pulsar, can you post a screenshot of your Google dashboard that shows the blocked URL? Does it give you any additional information?


Sure, here's the warning we're getting.

imageimage

Square

What happens when you click on "learn more"? 

@whitemonkey Do you have any input on this? :)


Hi. I've had a few blocked before. Normally they disappear eventually.

I ran a "site:" search on https://www.appliedbiochemists.com/shoreklear-plus.html

It's live, but no info is available.

Try submitting your sitemap.xml again.

Also, go to the old Google Search Console, go to Crawl, then robots.txt Tester, and submit your site there...

THEN, back in the new Google Search Console, inspect a BLOCKED URL and request indexing.

Hope Google crawls your site again.

Thanks @Bernadette, this is all I can suggest...


Thanks, I resubmitted the shoreklear-plus.html URL. We'll see if that does anything.
It might just take time; we only launched a week ago.

OK, good luck there. I wrote a post on my travel blog which has a subsection about Weebly.

https://www.nomadicbackpacker.com/blogging-with-weebly/weebly-is-no-longer-supported-by-internet-exp...

Links to my other page about how to index your pages are at the end of the post.

Square

Thanks for the feedback, @whitemonkey!
