Forum Moderators: phranque

Need Help: Questions about sitemap & robots.txt

         

CaptainDark

7:53 pm on Nov 3, 2018 (gmt 0)

5+ Year Member



Hi,
I am new to SEO and for the first time, I'm doing SEO myself for my newly launched site. I'm still learning!
The site is a tool site and has only 3 pages (index, privacy policy, terms of service), a contact-us plugin and an img folder.

So now I'm at the stage of creating a sitemap and robots.txt.
But some questions are keeping me from proceeding. I searched everywhere and haven't found any answers, so I came here to ask.
Maybe the questions are too silly, but I still hope someone will answer to clear my doubts.

Questions:
1. Which file should I create first? Sitemap or robots.txt?

2. Do I need to pass protect those 2 files? Is it an important factor? - Though I'm not sure if this option is available or not.

3. As said above, I have only 3 pages and a contact-us plugin, so what rules should I write in robots.txt?
For example, some YouTube videos show cgi-bin being disallowed. But I don't have any file or folder with that name in my root.

I'm extremely sorry for too many questions.

not2easy

8:22 pm on Nov 3, 2018 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Hello CaptainDark and Welcome to WebmasterWorld [webmasterworld.com]

Everyone starts somewhere by asking basic questions, so don't worry about that.

A robots.txt file is not a requirement. It is a file that tells compliant robots which files or folders you do not wish them to visit. If you want robots to access all pages and folders of your site, you do not need a robots.txt file. It is still a good idea to learn what it is and how to use it. Google offers information [developers.google.com] that can help you understand the syntax that their robots use. Not all robots follow the directives you might have in your robots.txt file, and that information from Google only fully applies to their robots. Many robots don't bother to read the file, or else they read it but ignore it. It does not force any behavior.
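
To make the syntax concrete, here is a minimal Python sketch of how a compliant robot might read a robots.txt file, using the standard library's urllib.robotparser. The rules, the /private/ folder, the example.com URLs and the "ExampleBot" name are all made up for illustration. A robot that ignores robots.txt would simply never run a check like this.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: ask every compliant robot to stay out of /private/.
rules = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())


def allowed(url, agent="ExampleBot"):
    """Return True if a compliant robot named `agent` may fetch `url`."""
    return parser.can_fetch(agent, url)


home_ok = allowed("https://example.com/")
private_ok = allowed("https://example.com/private/notes.html")
print(home_ok, private_ok)
```

The home page is reported as fetchable and the page under /private/ is not. The file only expresses a request, so it is no substitute for real access control.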

You should not password protect anything that is not private. You should create a sitemap when you have pages that you'd like to have indexed. Many people do not want their privacy and contact pages indexed, because visitors might then first enter the site on those pages. If you have not set up a Google Search Console account yet, you might want to start there. From there you can submit a sitemap and learn how your site is seen by Google. Bing also offers an account for webmasters to interface with the Bing search engine.

1. Which file should I create first? Sitemap or robots.txt?
First you should have content on your site, such as pages that people might want to visit. That is what sitemaps are for. Start with the site, then add a sitemap to your GSC account. Then, if you see Google's bots visiting pages you don't want visited, use robots.txt for that.

tangor

10:22 pm on Nov 3, 2018 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



@CaptainDark:

Welcome to Webmasterworld!

robots.txt is not essential, though for the bots that respect it, it can be useful.

Sitemaps have size limits (50,000 URLs per file), but they can be chained through a sitemap index to cover some amazing numbers of URLs. Your three-page site doesn't need one, and the limits don't apply to your current content.

When you get 10,000 pages it might make a difference.

Meanwhile, make the site the best it can be and good luck!

justpassing

10:58 pm on Nov 3, 2018 (gmt 0)

5+ Year Member Top Contributors Of The Month



1. Which file should I create first? Sitemap or robots.txt?

It doesn't matter. Considering you have only 3 pages, you can create both files at the same time. The sitemap will list your 3 pages, and the robots.txt can simply be an empty file.
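
For what it's worth, a sitemap for three pages is small enough to write by hand, but here is a rough Python sketch that generates one with the standard library. The three example.com URLs are placeholders (the real page paths are not given in this thread), and only the required loc element is written.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Hypothetical URLs standing in for the index, privacy and terms pages.
PAGES = [
    "https://example.com/",
    "https://example.com/privacy-policy",
    "https://example.com/terms-of-service",
]


def build_sitemap(urls):
    """Return a sitemap XML document (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


sitemap_xml = build_sitemap(PAGES)
print(sitemap_xml)
```

Saving the printed output as sitemap.xml in the site root would be the usual next step.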

2. Do I need to pass protect those 2 files? Is it an important factor? - Though I'm not sure if this option is available or not.

What do you mean by pass protecting those files? If you mean requiring a password to access them, then of course not. These files need to be publicly readable. Otherwise they no longer make sense.

3. As said above, I have only 3 pages and a contact-us plugin, so what rules should I write in robots.txt?

You can leave your robots.txt file empty. Still create the file, to avoid 404 errors in your logs.

Eventually, if you create a sitemap, you can add the following line to your robots.txt:
Sitemap: xxx

Where xxx is the absolute URL to your sitemap file.
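
As a quick check of that idea, the Python sketch below parses a robots.txt that contains nothing but a Sitemap line. The sitemap URL and the "ExampleBot" name are invented for the example, and site_maps() needs Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical absolute URL; replace it with the real location of your sitemap.
SITEMAP_URL = "https://example.com/sitemap.xml"

# A robots.txt with no blocking rules, only the Sitemap line.
robots_txt = f"Sitemap: {SITEMAP_URL}\n"

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Sitemap URLs declared in the file (None if there are none).
sitemaps = parser.site_maps()

# With no Disallow rules, every page stays open to compliant robots.
everything_allowed = parser.can_fetch("ExampleBot", "https://example.com/privacy-policy")

print(sitemaps, everything_allowed)
```

The file advertises the sitemap without blocking anything, which is the same result as an empty robots.txt plus one extra hint for crawlers.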

CaptainDark

10:59 pm on Nov 3, 2018 (gmt 0)

5+ Year Member



@not2easy
Thank you very much for helping me with a descriptive reply.

No, I haven't created any GSC account yet because of the questions I had.
For the last 4 days I was just looking here and there for answers, while my site is completely ready.

I used a free online site reviewer to check my SEO standing. My highest score was 81/100 in seositecheckup without robots.txt and a sitemap, though there are 3 or 4 other minor issues.

At least I got a guideline from you on how I should move ahead.
I will post here again when I reach a conclusion, to check that I'm doing it right.


@tangor
Thanks to you as well for helping me out!

If I don't create a sitemap, then how does my site get indexed, and how can it be found on Google or other search engines?
Every site owner wants to rank in the top 3 to get Google organic search traffic, right? So without a sitemap / GSC, how can I achieve that?

CaptainDark

11:06 pm on Nov 3, 2018 (gmt 0)

5+ Year Member



@justpassing
Thanks for taking the time to post.
It's 5:03am where I am and I'm still studying robots.txt and sitemaps, though it doesn't seem too hard.
Tomorrow I will apply everything according to the guidelines in this thread.

@not2easy, @tangor, @justpassing
Love you guys!

tangor

11:15 pm on Nov 3, 2018 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If I don't create a sitemap, then how does my site get indexed, and how can it be found on Google or other search engines?


I've been on the web since 1996. I have never put up a site map (for my sites ... clients on the other hand seem to think they are worth something).

Your site is too small to justify a site map. As for robots.txt, again, you have no real exposure, so don't worry about it.

Get the site online and start adding content. Build your brand and go from there. Next year you might want to revisit sitemaps and robots.txt, if necessary.

tangor

11:15 pm on Nov 3, 2018 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Meant to add: Google or any other search engine will find your stuff whether you have a sitemap or not.

lucy24

12:08 am on Nov 4, 2018 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If I don't create a sitemap, then how does my site get indexed, and how can it be found on Google or other search engines?
If your site is a dot com or other common TLD within ARIN (North America), it will be found automatically as soon as it goes live. If it has already gone live, search engines have already crawled it. If search engines don't automatically find the domain, a sitemap will not help, because they haven't got that far. Either way ...

You do not need a sitemap.

You do not need a sitemap.

You do not need a sitemap.

How do your human visitors find your TOS and Privacy pages? By following clearly visible links from the front page, right? Robots can and will follow those same links.

A sitemap does not mean “index only these pages”. It means “be sure not to overlook these pages”.

justpassing

10:13 am on Nov 5, 2018 (gmt 0)

5+ Year Member Top Contributors Of The Month



I've been on the web since 1996

To quote MrSavage, that does not make you an expert or trustworthy. Sarcasm (against MrS) aside, tangor is right. A sitemap is not mandatory, especially for a small site. With or without it, there will be no impact on your ranking. If you worry that Google (or others) will not find you, just use Google Search Console, Bing Webmaster Tools, etc. and add your site. That way the search engines will know you exist :) but it's highly possible they already know, and have already crawled your pages.

If you have nothing to block, then just put up an empty robots.txt file. As I said, a robots.txt file is not even mandatory if it would be empty, but I still think it's good practice, since it avoids 404 errors in the logs.

A sitemap can help robots discover pages within a site that they would not discover "naturally" (by following links). But in that case, it means the site navigation is not well done.