Beta Member

My robots.txt is blocking Google from indexing my Square Online store

I am using the Free plan for Square Online, and have created an online store to sell artworks using your platform.

I have recently registered the website with Google Search Console, and done all of the appropriate steps to submit the website's sitemap to Google.

 

However, Google currently has only a very shallow picture of my website, with just 4 of its ~50 pages indexed.

Google Search Console is currently displaying an error message: "Excluded by 'noindex' tag"

When I visit the auto-generated robots.txt for my website, the file does indeed contain this text:

 

User-agent: Googlebot
Disallow:
User-agent: Googlebot-Image
Disallow:

 

Obviously, this is what is preventing Google from crawling my site at the moment. However, I can't see any way to modify this text from the Square Online backend.

Can someone please tell me how to remove this code from my robots.txt file on the Free plan for Square Online?

It is a great shame that this 'disallow Googlebot' text has been added automatically without my consent, despite all of my selected website options asking for the store to be as visible as possible. In my opinion this auto-generated robots.txt is a bad "feature" that should be removed from Square Online for other customers' sake. I bet lots of Square sellers are missing out on customers if the default robots.txt file is hiding them from Google search results.

Message 1 of 4
Square Community Moderator

Hi @stumcm,

 

I've reached out to our Ecom team to get some clarification on the robots.txt for Square Online Sites. They've let me know:

 

There isn't a way to manually edit or remove this; however, an important distinction should be made: this code wouldn't prevent the Seller's website from being indexed by a search engine. They may have an issue integrating with Google Merchant Center, which is a project we're currently working on.

 

I'll let you know once I hear more or there is an update! 

Message 2 of 4
Beta Member

My apologies. Upon closer inspection, it seems that the robots.txt code I pasted does not actually prevent Googlebot from crawling my website, so the robots.txt file is not the cause of my problem, as Arie mentioned.
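
For anyone else double-checking their own file: here's a quick sanity check with Python's standard-library robotparser (the URL is just a placeholder) showing that an empty Disallow line blocks nothing, while "Disallow: /" would block everything:

from urllib import robotparser

# The auto-generated rules from my robots.txt: an empty "Disallow:"
# value blocks nothing, so Googlebot is allowed to crawl everything.
allow_all = robotparser.RobotFileParser()
allow_all.parse([
    "User-agent: Googlebot",
    "Disallow:",
])
print(allow_all.can_fetch("Googlebot", "https://example.com/shop/item"))  # True

# For contrast, a rule that really would block Googlebot:
block_all = robotparser.RobotFileParser()
block_all.parse([
    "User-agent: Googlebot",
    "Disallow: /",
])
print(block_all.can_fetch("Googlebot", "https://example.com/shop/item"))  # False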


Looking at my Google Search Console, 4 of my website's pages are indexed and the remaining 47 are not. These non-indexed pages are listed with the status "Discovered - currently not indexed", meaning (as discussed in this article) that Google has read my sitemap and knows that all of my product pages exist, but it has simply not yet got around to crawling and indexing them. Google has limited resources for scanning the web, and it does not get to every single page that it knows about.
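
If you want to cross-check those numbers yourself, here is a minimal sketch that lists every URL in the sitemap, assuming it lives at the default /sitemap.xml path (swap in your own domain; a sitemap index file would need one extra level of fetching):

import urllib.request
import xml.etree.ElementTree as ET

# Placeholder domain; assumes the sitemap is at the default /sitemap.xml path.
SITEMAP_URL = "https://example.com/sitemap.xml"
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

with urllib.request.urlopen(SITEMAP_URL) as resp:
    root = ET.fromstring(resp.read())

# Each <loc> entry is a page Google has been told about.
urls = [loc.text for loc in root.findall(".//sm:loc", NS)]
print(f"{len(urls)} URLs listed in the sitemap")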

It has been this way for 10 days now, and as the owner of the website, I find it frustrating that it is taking so long for my Square website to get visibility via Google.

I just submitted a 'validation' request via Google Search Console, which should hopefully prompt Googlebot to actually index those pages. I'll let you know what happens.

Message 3 of 4

Hello, did you find a solution to this issue? Our pages hosted by Square Online have the same status and have been like that for over 40 days. I have a feeling it's an issue with the pages' response time, as Google will delay the crawl (over and over) if a page responds slowly. Checking with Google PageSpeed Insights, all of the pages hosted by Square Online are very slow to load.

 

It is my suspicion this is why Google Search Console will not add the pages. 
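
If anyone wants to put a rough number on that, here is a quick sketch (placeholder URLs) that times how long each page takes to start responding:

import time
import urllib.request

# Placeholder URLs; substitute your own store's pages.
PAGES = [
    "https://example.com/",
    "https://example.com/shop/item-1",
]

for url in PAGES:
    start = time.monotonic()
    with urllib.request.urlopen(url) as resp:
        resp.read(1)  # grab just the first byte of the body
    elapsed = time.monotonic() - start
    print(f"{url}: {elapsed:.2f}s until the first byte arrived")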

Message 4 of 4