Google ignoring noindex META Tag

Rik 8 June 2006 18:16:07
I have recently noticed some of my pages showing up in the Google cache
even though the pages contained a "noindex" META tag. These are private
pages for inter-office use and are not meant for public display.

Is there another META tag that will prevent Google from caching these
pages?

Since the pages are not meant for public view, I have just renamed the
files so anyone who clicks them from Google will just get my "not
found" page.

My problem is that I really have no way to keep track of the pages on which
Google has ignored my noindex META tag. I have now included the noarchive
META tag in the hope that Googlebot will understand that one.

Any suggestions?

Rik

P.S. I posted this question in the public forum but received no
suggestions, so I'm hoping someone in here may have run into this
problem. Please pardon my cross-post.

Paul 8 June 2006 02:03:18
On 7 Jun 2006 14:54:58 -0700, "Rik" <rik@rmcaudio.com> wrote:
> Is there another META tag that will prevent Google from caching these
> pages?

What about password-protected files?
plh
Paul

Big Bill 8 June 2006 03:25:33
On 7 Jun 2006 14:54:58 -0700, "Rik" <rik@rmcaudio.com> wrote:
> My problem is that I really have no way to keep up with pages which
> Google has ignored my noindex META. I have now included the noarchive
> meta in the hopes the Googlebot might understand that one.
>
> Any suggestions?

Use robots.txt to exclude the files. You can't do this with the old
ones but you can with any new ones you make.

You know robots.txt?

Have a read:

http://www.robotstxt.org/wc/robots.html

BB
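
To make the robots.txt suggestion concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The file names are invented for illustration and are not from Rik's site. It shows that a `Disallow` line naming a single file leaves other pages in the same folder fetchable, at least for a crawler that follows the rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths are made up for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /reports/staff-rota.html
Disallow: /reports/price-list-internal.html
"""


def allowed(path, user_agent="Googlebot"):
    """Return True if the sample rules let user_agent fetch path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


print(allowed("/reports/staff-rota.html"))      # False: listed file is excluded
print(allowed("/reports/annual-summary.html"))  # True: same folder, not listed
```

This only models what a well-behaved crawler would do with the file; it says nothing about crawlers that ignore robots.txt.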


--

http://www.kruse.co.uk/seo-services.htm
http://www.here-be-posters.co.uk/lempicka-prints.htm
http://www.crystal-liaison.com/armani/index.html
Paul 8 June 2006 03:34:31
On Wed, 07 Jun 2006 23:25:33 GMT, Big Bill <kruse@cityscape.co.uk> wrote:
> Use robots.txt to exclude the files. You can't do this with the old
> ones but you can with any new ones you make.

Only works with good bots though, BB.

Password protection is far better.
plh
Paul
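
As a rough illustration of why password protection is the stronger option, the sketch below checks an HTTP Basic `Authorization` header before a page would be served. The user name and password are placeholders, and the function name is hypothetical; a real site would normally let the web server handle this and would store hashed passwords.

```python
import base64
import hmac

# Placeholder credentials for illustration only.
EXPECTED_USER = "office"
EXPECTED_PASSWORD = "change-me"


def is_authorized(authorization_header):
    """Check an HTTP Basic Authorization header against the expected pair."""
    if not authorization_header or not authorization_header.startswith("Basic "):
        return False
    try:
        raw = base64.b64decode(authorization_header[6:], validate=True)
        decoded = raw.decode("utf-8")
    except ValueError:  # covers bad base64 and bad UTF-8
        return False
    user, separator, password = decoded.partition(":")
    if not separator:
        return False
    # compare_digest avoids leaking information through comparison timing.
    user_ok = hmac.compare_digest(user.encode(), EXPECTED_USER.encode())
    password_ok = hmac.compare_digest(password.encode(), EXPECTED_PASSWORD.encode())
    return user_ok and password_ok
```

When the check fails, the server answers 401 with a `WWW-Authenticate: Basic` header instead of the page, so a crawler without credentials never receives the content, whether or not it honours robots.txt or META tags. Basic credentials are only base64-encoded, so this belongs behind HTTPS.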

Rik 8 June 2006 03:43:17
Paul wrote:
> What about password protected files ?

The page that contains the links leading to our private pages is
password protected. That navigation page resides in a folder that is
disallowed through my robots.txt file.

The private pages in question reside in folders that also contain public
pages, so I was afraid to disallow anything in those folders using the
robots.txt file for fear of the bot ignoring the whole folder.
That's why I chose to use the noindex meta on the individual pages.

Is it common for Google to ignore META tags like the noindex,noarchive
I am currently using? I have seen Google ignore my robots.txt file
before, but this is the first time I have seen them ignore the noindex
command.

Roy Schestowitz 8 June 2006 06:59:01
__/ [ Rik ] on Thursday 08 June 2006 00:43 \__
> I have recently noticed some of my pages showing up in the Google cache


I will gladly take this 'problem' off your hands. Google Cache has been
problematic in recent months.

> even though the page contained a "noindex" META Tag. These are private
> pages for inter office use and are not meant for public display.
>
> Is there another META tag that will prevent Google from caching these
> pages?


Meta tags are not the most reliable option, as not every crawler/cacher will honour them.
Exclusions using robots.txt are no better and, in a sense, even worse, as
they publicly expose the list of potentially 'sensitive' pages.
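
The exposure point is easy to demonstrate: robots.txt is a plain-text file that anyone can fetch, so every `Disallow` line is readable by people as well as bots. The sketch below pulls those paths out of a sample file; the function name and the sample paths are hypothetical.

```python
def disallowed_paths(robots_txt):
    """Return the path from every non-empty Disallow line in robots_txt."""
    paths = []
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        field, separator, value = line.partition(":")
        if separator and field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


# Hypothetical file; the paths are invented for this example.
SAMPLE = """\
User-agent: *
Disallow: /intranet/      # private folder
Disallow: /reports/staff-rota.html
"""

print(disallowed_paths(SAMPLE))
# ['/intranet/', '/reports/staff-rota.html']
```

Anything listed this way is effectively advertised, which is why the file should not be the only thing standing between the public and a private page.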

> Since the pages are not meant for public view, I have just re-named the
> files so anyone that may click them from Google will just get my not
> found page.
>
> My problem is that I really have no way to keep up with pages which
> Google has ignored my noindex META. I have now included the noarchive
> meta in the hopes the Googlebot might understand that one.
>
> Any suggestions?


Also see the following:

http://www.i18nguy.com/markup/metatags.html

> P.S. I posted this question in the public forum but received no
> suggestions so I'm hoping some-one in here may have run in to this
> problem. Please pardon my cross post.


I notice that Paul (or you) has reduced the distributions

> What about password protected files ?


I would suggest the same. Too many times in the past I have had my 'hidden' pages
indexed. This was a bit embarrassing at times. The bigger issue is with the cache,
as the information is no longer under your control and cannot be removed from the
public eye immediately. If you call Google, however, and follow the correct
route, then you can request that they remove the unwanted cache.

> The page that contains the links leading to our private pages is
> password protected. That navigation page resides in a folder that is
> disallowed through my robots.txt file.
>
> The private pages in question reside in folders that contain public
> pages so I was afraid to disallow anything in those folders using the
> robots.txt file for fear of the bot ignoring the folder.
> That's why I chose to use the noindex meta on the individual pages.
>
> Is it common for Google to ignore META tags like the noindex,noarchive
> I am currently using? I have seen Google ignore my robots.txt file
> before but this is the first time I have seen them ignore the noindex
> command.


I think I have heard similar stories. These tags should never be trusted, and there
is also a certain need for careful testing of the files, for which I know
of no tools.

It's the same situation with "X-No-Archive: Yes" in newsgroups. Too many
ratbots and aggregators ignore these and, once somebody replies to messages,
all protection is stripped off. You can think of this as the equivalent of
someone scraping your 'noindex' pages, putting them in public space
elsewhere.

Best wishes,

Roy

--
Roy S. Schestowitz | Open Source Othello: http://othellomaster.com
http://Schestowitz.com | SuSE GNU/Linux ¦ PGP-Key: 0x74572E8E
3:45am up 41 days 9:18, 11 users, load average: 0.30, 0.70, 0.87
http://iuron.com - help build a non-profit search engine
Paul 8 June 2006 10:33:31
On Thu, 08 Jun 2006 03:59:01 +0100, Roy Schestowitz
<newsgroups@schestowitz.com> wrote:
> I notice that Paul (or you) has reduced the distributions

Eh? In English?

plh
Paul

Big Bill 8 June 2006 11:15:39
On 7 Jun 2006 16:43:17 -0700, "Rik" <rik@rmcaudio.com> wrote:
> Is it common for Google to ignore META tags like the noindex,noarchive
> I am currently using? I have seen Google ignore my robots.txt file
> before but this is the first time I have seen them ignore the noindex
> command.

I don't think anything takes any notice of meta commands.

BB
Eric Johnston 8 June 2006 12:17:41
"Paul" <lamewolf2004[REMOVE]@yahoo.com> wrote in message
news:ejoe821oq38jo6oatjs6162ikda8h70jgh@4ax.com...
> Only works with good bots though BB.
>
> password protected is far better.

See this: http://www.google.co.uk/intl/en/webmasters/remove.html

Also, it is a good idea to have a default home page in every subdirectory
(e.g. called index.html) to prevent the server revealing listings of all
files present.

Best regards, Eric.


Eric Johnston 8 June 2006 12:34:29
"Eric Johnston" <nospam@redyonder.co.uk> wrote in message
news:F2Rhg.309135$tc.194961@fe2.news.blueyonder.co.uk...
> Also, it is a good idea to have a default home page in every subdirectory
> (e.g. called index.html) to prevent the server revealing listings of all
> files present.

Further to this, it is also a good idea to make sure the Google Toolbar PR
(PageRank) display is turned off whenever you or your colleagues view your
private documents; otherwise you are telling Google the file names. (Read
about the privacy implications of the PR display:
http://www.google.com/support/toolbar/?quick=privacy&hl=en&v=3.0 )

Ideally, of course, all your private documents should be simply deleted from
the public area of your server.

Best regards, Eric.


Borek 8 June 2006 12:55:30
On Thu, 08 Jun 2006 10:17:41 +0200, Eric Johnston <nospam@redyonder.co.uk>
wrote:
> Also, it is a good idea to have a default home page in every subdirectory
> (e.g. called index.html) to prevent the server revealing listings of all
> files present.

You don't need such tricks if the server is properly configured, i.e. you
can deny directory listings in .htaccess.

Best,
Borek
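
On Apache the usual .htaccess line for this is `Options -Indexes`. To see which folders would still show a listing, a site owner could run something like the sketch below over a local copy of the web root. It is a simplification under stated assumptions: it treats `index.html` and `index.htm` as the only index names, assumes an `Options -Indexes` line applies to all subfolders, and ignores the main server configuration entirely. The function names are hypothetical.

```python
import os

INDEX_NAMES = ("index.html", "index.htm")  # assumed index file names


def htaccess_disables_listing(directory):
    """True if directory/.htaccess has an Options line containing -Indexes."""
    path = os.path.join(directory, ".htaccess")
    try:
        with open(path, encoding="utf-8", errors="replace") as handle:
            for line in handle:
                if line.lstrip().startswith("#"):
                    continue  # comment line
                words = [word.lower() for word in line.split()]
                if words and words[0] == "options" and "-indexes" in words[1:]:
                    return True
    except FileNotFoundError:
        pass
    return False


def listable_directories(web_root):
    """Return folders (relative to web_root) with no index file and no
    -Indexes rule in their own or an ancestor's .htaccess."""
    web_root = os.path.abspath(web_root)
    exposed = []
    protected = set()  # folders covered by a -Indexes rule
    for current, _subdirs, files in os.walk(web_root):  # top-down walk
        parent = os.path.dirname(current)
        if parent in protected or htaccess_disables_listing(current):
            protected.add(current)
            continue
        if not any(name in files for name in INDEX_NAMES):
            exposed.append(os.path.relpath(current, web_root))
    return sorted(exposed)
```

Calling `listable_directories("/path/to/site")` returns the folders worth checking by hand; an empty list means every folder has either an index file or an inherited `-Indexes` rule under these assumptions.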
--
http://www.chembuddy.com
http://www.ph-meter.info/pH-Nernst-equation
http://www.terapia-kregoslupa.waw.pl
Rik 8 June 2006 15:32:08
Borek wrote:
> You don't need such tricks if the server is properly configured - ie you
> can deny directory listings in .htaccess.

Thanks for all the suggestions, gentlemen. I now have my research cut
out for me, and you guys have given me ideas on what to look for.

I don't allow any of my staff to load the Google toolbar on company
machines. I figured there is no reason to support Google when it comes
to compiling demographics info that they will sell later.

Thanks again.
Rik

Davidof 8 June 2006 16:52:44
Eric Johnston wrote:
> Further to this it is also a good idea to make sure the Google toolbar PR
> display is turned off whenever you or your colleagues view your private
> documents otherwise you are telling Google the file names.

Yes, I suspect this is how Google found these pages and how it finds a
lot of "private" stuff that is in its index.

----------------------
http://www.abcseo.com/
Big Bill 8 June 2006 16:56:16
On 8 Jun 2006 04:32:08 -0700, "Rik" <rik@rmcaudio.com> wrote:
> I don't allow any of my staff to load the Google toolbar on company
> machines. I figured there is no reason to support Google when it comes
> to compiling demographics info that they will sell later.

I don't think they sell what they know. They figure out ways to use it
in-house, which is maybe scarier. I suspect they look at the knowledge
they accumulate and create ways to profit from it. They pursue
independence at many levels.

BB
Davidof 8 June 2006 16:58:38
Roy Schestowitz wrote:
> If you call Google, however, and follow the correct
> route, then you can request that they remove unwanted cache.

Roy (and Rik),

It is usually much easier. Do as Rik did and delete (or rename) the
pages (so that they give a 404 Not Found server response), then go to the
URL console below, sign up if you don't have an account, and give the URL(s)
of the pages to remove.

http://services.google.com:8882/urlconsole/controller?cmd=reload&lastcmd=login

Within a couple of days Google will delete them from the index (and
cache). This doesn't work for image files, though (IME).


The meta tags should work with Googlebot; see O'Reilly's site for
details:

http://hacks.oreilly.com/pub/h/220

Sometimes people mistype the tags.
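
A mistyped directive is easy to miss by eye, so a small check can help. The sketch below compares the comma-separated values of a robots META `content` attribute against a short list of names. That list (the directives mentioned in this thread plus the usual index/follow family) is an assumption for illustration and not a statement of everything a given crawler supports.

```python
# Assumed set of directive names; extend it to match the crawlers you care about.
KNOWN_DIRECTIVES = {
    "index", "noindex", "follow", "nofollow", "noarchive", "all", "none",
}


def unknown_directives(content):
    """Return the directives in a robots META content value that are not
    in KNOWN_DIRECTIVES, in the order they appear."""
    tokens = [token.strip().lower() for token in content.split(",")]
    return [token for token in tokens if token and token not in KNOWN_DIRECTIVES]


print(unknown_directives("noindex,noarchive"))    # []
print(unknown_directives("no-index, noarchvie"))  # ['no-index', 'noarchvie']
```

An empty result does not prove a crawler will obey the tag; it only rules out one common cause of a tag being silently ignored.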

However the basic message is that no private document should be
accessible via a public URL.

Rik 8 June 2006 17:59:33
davidof wrote:
> It is usually much easier. Do as Rik did and delete (or rename) the
> pages (so that they give a 404 not found server response) then go to URL
> controller, sign up if you don't have an account, then give the URL(s)
> of the pages to remove.
>
> The meta tags should work with the Googlebot, see O'Reilly's site for
> details,

David,
It seems logical to me that delivering a 404 to the bot would
eventually get the page removed from cache. Does anyone know if this is
the case?

Roy,
The meta tag link you provided showed the examples of the tags in all
upper case. Do you know if it makes a difference whether the tags are
upper or lower case?

BB,
I don't think Google would sell the info they collect directly. It's my
bet that they are farming the information to offer advertisers more
targeted clients (at a premium price, of course).
What truly baffles me is that all Google has to do is offer some wacky
feature and people flock to sign up for the latest spyware Google is
offering. But I digress. I'm not a big fan of Google, but since they
hold the cards right now, I have to play the game.

Rik

Paul 8 June 2006 18:16:07
On 8 Jun 2006 06:59:33 -0700, "Rik" <rik@rmcaudio.com> wrote:
> Do you know if it makes a difference whether the tags are
> upper or lower case?

Only for validation (e.g. XHTML only validates in lowercase).
hth
plh
Paul
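
For anyone who wants to check their own pages, the following sketch uses Python's standard `html.parser` to read the robots META tag from a page and shows that an upper-case and a lower-case version yield the same directives once normalised. The class and function names are hypothetical, and lower-casing the attribute values is this sketch's own choice; it does not prove how any particular crawler treats case.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lower-cases tag names and attribute names.
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        self.directives.update(
            part.strip().lower() for part in content.split(",") if part.strip()
        )


def robots_directives(html):
    """Return the set of robots directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


upper = '<HTML><HEAD><META NAME="ROBOTS" CONTENT="NOINDEX,NOARCHIVE"></HEAD></HTML>'
lower = '<html><head><meta name="robots" content="noindex,noarchive"></head></html>'
print(robots_directives(upper) == robots_directives(lower))  # True
```

Running it over the private pages is a quick way to confirm the tag is actually present and spelled as intended on each one.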
