Forum Moderators: Robert Charlton & goodroi
If it is exactly www.domain.com/some.folder/some.page.html again, then you have no problems.
If it is something like www.domain.com/some.folder/some.page.html?action=next then you have created a ghost URL - "duplicate content" - for that page, and that will be a big problem for the site.
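A rough way to check a list of indexed URLs for this kind of ghost is to strip the query string and compare what is left. This is only a sketch: the function names are made up for illustration, and it assumes the query string never changes the page content, which is not true on every site.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url: str) -> str:
    """Return the URL with its query string and fragment removed."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def is_ghost_of(candidate: str, original: str) -> bool:
    """True when candidate is a different URL that differs from original
    only by its query string or fragment (hypothetical helper)."""
    return candidate != original and strip_query(candidate) == strip_query(original)


page = "http://www.domain.com/some.folder/some.page.html"
print(is_ghost_of(page + "?action=next", page))  # the ghost URL case
print(is_ghost_of(page, page))                   # the exact same URL is not a ghost
```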
Again, check what I wrote in the other thread [webmasterworld.com].
And when I perform a site: query on some of our sites I get RUSSIAN and CHINESE URLs here and there... What the hell is that?
For one of our sites it even returns a Google Groups URL... totally unrelated and super weird!
I'm still wondering if my store's directory structure hierarchy is not straightforward and logical enough. I've been thinking that maybe I should just try to simplify everything. My intent was to try to keep visitors moving through the store, but maybe I've made it too confusing.
Can any of the affected sites try this command in Google and give feedback on the results you see? Search Google for
"www.yourdomain.com */"
Look especially for domains starting with numeric subdomains within the results.
I'd be interested in any feedback.
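If you paste the result URLs into a text file, a few lines of Python can pick out the numeric subdomains for you. A sketch only: it assumes "numeric" means the first label of the hostname is all digits, and the function names and sample hosts are placeholders.

```python
from urllib.parse import urlsplit


def first_label(url: str) -> str:
    """Return the leftmost label of the hostname, e.g. '12345' for 12345.example.net."""
    # urlsplit only recognises the host when a scheme is present, so add one if missing.
    if "://" not in url:
        url = "http://" + url
    host = urlsplit(url).hostname or ""
    return host.split(".")[0]


def numeric_subdomain_urls(urls):
    """Keep only the URLs whose first hostname label consists entirely of digits."""
    return [u for u in urls if first_label(u).isdigit()]


results = [
    "http://12345.spammersdomain.net/page",
    "http://shredded-widgets.spammersdomain.net/",
    "www.yourdomain.com/",
]
for url in numeric_subdomain_urls(results):
    print(url)
```

Note that a bare IP address such as 64.233.167.104 would also be flagged, since its first label is all digits.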
[edited by: Pirates at 12:37 am (utc) on Sep. 20, 2006]
[edited by: Marval at 1:23 am (utc) on Sep. 20, 2006]
Re
Can any of the affected sites try this command in Google and give feedback on the results you see? Search Google for
"www.yourdomain.com */"
Look especially for domains starting with numeric subdomains within the results.
I'd be interested in any feedback.
I did that on a number of my sites and noticed some really pathetic MFA scrapers. I found two different domains after checking about 5 of my sites. The subdomains were not numeric; rather, they were the topic of that page, i.e. shredded-widgets.spammersdomain.net
They all followed the same pattern
The keywords in H1
A large Google ad block
- The keyword and ad fill everything above the fold.
Below that is text ripped off from about a dozen websites. The url to our site (and the others) is NOT a link. It is plain text.
The keywords are repeated at the bottom in H1 again. However, they are all run together as one word.
When you view the source, you see that the alternate ad is another website. The scary thing here is that the alternate ad address is the root of a website, i.e. www.anotherspammer.com. If AdSense doesn't have an ad, you get directed to that site, and that could be anything, including a drive-by software installer.
At the bottom of the page, if you look in the source, is a bunch of links to other spam-mfa pages in the same domain on completely unrelated topics.
ie a class=style1 href=http:Adifferent-mfa.spammerdomain.net
WARNING - SPEW ALERT.. SWALLOW YOUR DRINK BEFORE PROCEEDING.
Here is the kicker.. If you take the keyword phrase used on the page and plug it into google and add the domain name, I get about a half dozen results.
NOT ONE OF THEM IS TAGGED AS SUPPLEMENTAL.
I'm working my silly posterior off with tech articles, taking trips and photos and writing original reviews, and lots of my pages languish in supplemental land. These jokers use an auto program, break just about every rule, steal our content, possibly cause our original content to get hit with a dupe penalty, have no original content, and they are making money.
AAAAAAARRRRRRRGGGGGHHHHhHHHH!
The footprint of this algo exploit looks obvious enough -- they simply MUST be aware of it at the 'Plex. But there must be some bigger challenge in catching it that we're not seeing.
I have also noticed the last few days that if you do a search for <keyword removed> in Google, it only shows 13 pages deep (Results 121 - 129 of about 73,000,000 are all that are shown). You can then click to add omitted results... This makes me think that the update is not complete, and with Yahoo's stock dropping so quickly, I doubt Google is anxious to make a public statement about a problem with its current live results.
If this is not an infrastructure problem, then I would say it is a poorly rolled out "update" that makes the current, online, live Google we are using a crippled, fake Google. Google has advertised and flaunted (as have other search engines) that it has indexed billions of pages (or whatever high number has been used to awe people), and that is a big part of why so many people use, or began using and talking about, Google. I think there should be a disclosure on the google.com page telling users that Google is currently not giving search results from the billions of pages that it used to advertise.
I applaud Google for trying to make search better, and it must be a daunting task, since even Google can't get it right (all the bad pages showing up); however, this was a very poor rollout.
From some of the remarks here, it only makes me laugh more about the problems Google has. If indeed the billion-dollar Google cannot figure out that a next link is a next link and not duplicate content, then Google should also be putting a beta symbol on its home page.
At first glimpse of the new results I saw spam and 404 pages, and I actually began filling out a spam report form. Then it dawned upon me: if my site is on page 10 of the search results, and many of the results above mine are 404s and not-so-relevant content, then it is really in my best interest not to notify Google about the spam, and to leave the crap there, so people will click there and then possibly get to my result because of it.
I'd like to rant about some other things Google can't seem to figure out, but I have sites to make user friendly. As Google has said, make them for the users, so I will. They will be in Flash, so I expect that Google won't much like them, even if users do find my site more enjoyable to use.
<Sorry, no specific searches.
See Forum Charter [webmasterworld.com]>
[edited by: tedster at 2:38 am (utc) on Sep. 20, 2006]
cmendla, I found a few sites exactly as you described too.
How can we combat that crap?
I have no real idea. Right now I'm going to spend about an hour or so adding content to my sites. I'm hoping someone a lot smarter than me will come up with an answer or google will find a fix - - until the next exploit.
Can any of the affected sites try this command in Google and give feedback on the results you see? Search Google for
"www.yourdomain.com */"
Look especially for domains starting with numeric subdomains within the results.
I'd be interested in any feedback.
I tried this as well, and found several sites in the Google index that show one of my URLs in their Google description, but when I click on the sites, it redirects to another site entirely, so I have no way to view the page source and see how they are using my URL...
I had found another site recently that had my URL with a bad description, no link to my site, and they even had a "sponsored result" linked off of the text of my URL! Isn't this illegal somehow?
What exactly is the function of this "*/" search operator? Does this show all bad things in google's eyes? I have like 100 somehow! How is this being done to me! Oh the agony!
[edited by: Pirates at 2:49 am (utc) on Sep. 20, 2006]
Can any of the affected sites try this command in Google and give feedback on the results you see? Search Google for
"www.yourdomain.com */"
Look especially for domains starting with numeric subdomains within the results.
I'd be interested in any feedback.
Very interesting. On my site which lost traffic on 9/15 I see a lot of numeric subdomain listings which go to redirects/PPC/MFA pages. My sites which were unaffected on 9/15 and my competition don't show any of these numeric subdomain listings.
Looks like Google still has the "5 billion URL spammer" issue.
On 64.233.167.104 it's got all my pages supplemental, which is bad enough, but the really strange thing is that there is a URL in the results that is not on my site, not related to my site, not even in my language.
rden17, Ok, I just went to the 64.233.167.104 datacenter. I have 904 pages showing, but upon checking for supplemental results only the first three pages are NOT supplemental.
When I go to www.google.com without entering a datacenter ip, I get nearly 900 results with probably about half in supplemental results.
If you just go straight to your browser and type in www.google.com without putting in the ip address, how does one find out what datacenter that information comes from?
About datacenters, anyway, is one more important than the others? Since the September 15 data push, or update, or refresh, or whatever the heck that crap was, I have lost over half my traffic. I'm still number one for some keyword phrases but not for others I used to be number one for.
It's very difficult to know what someone has suddenly done "wrong" with their sites when G is so mysterious. It is hard to correct something when you are not sure of what you have done "wrong". I'd gladly correct it if I knew what was the right thing last month but is now the wrong thing this month.
Also, I have a question about the 301 redirects which I've never done before.
I know how to do it, but not sure which way it is supposed to be done.
Do I go redirect FROM www.mysite.com TO mysite.com
OR do I redirect FROM mysite.com TO www.mysite.com
This part is confusing to me and I really do not know why it matters. Thanks for your help.
I know how to do it, but not sure which way it is supposed to be done.
Do I go redirect FROM www.mysite.com TO mysite.com
OR do I redirect FROM mysite.com TO www.mysite.com
Whichever you prefer.
I use mysite.com; others prefer www.mysite.com. Take your pick.
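Either way, the logic is the same: pick one hostname as the preferred one and send a 301 from the other to the matching URL on it. Here is a minimal Python sketch of that decision, not server config. The function names are hypothetical, and it ignores ports and assumes the two hostnames differ only by the leading "www.".

```python
from urllib.parse import urlsplit, urlunsplit


def strip_www(host: str) -> str:
    """Drop a leading 'www.' so both host variants compare equal."""
    return host[4:] if host.startswith("www.") else host


def redirect_target(url: str, preferred_host: str):
    """Return the URL a 301 should point to, or None when no redirect is needed.

    preferred_host is whichever variant you picked (an assumption of this sketch).
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    preferred = preferred_host.lower()
    # Already on the preferred host, or a different site altogether: do nothing.
    if host == preferred or strip_www(host) != strip_www(preferred):
        return None
    # Same path and query, only the hostname changes.
    return urlunsplit((parts.scheme, preferred, parts.path, parts.query, parts.fragment))


print(redirect_target("http://www.mysite.com/page.html", "mysite.com"))
print(redirect_target("http://mysite.com/page.html", "mysite.com"))
```

The important part is that the path and query string carry over unchanged, so every old URL maps to exactly one new URL.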
One of the sites that came up was an Armenian forum that I translated so I could see what it was about and they were talking about being in an alcohol coma and parties and such. Trust me, nothing at all to do with my site.
I am really hoping these horrible search results are temporary. Whatever they did this time was the worst that I've ever seen!
You can find a website with multiple subdomains, with the duplicate content throughout all the subdomains, occupying a full page of Google's SERPs. Yes, that is from 21 to 24, 26 to 29.
1. subdomain1.-------------.com/news/------------
2. subdomain2.-------------.com/news/------------
3. subdomain3.-------------.com/news/------------
-----
This shows how Google's SERPs are severely affected by SPAM websites, and genuine website owners are getting hurt.
Google should rethink their changes after June 27th.
<Sorry, no specific search terms.
See Forum Charter [webmasterworld.com]>
[edited by: tedster at 9:44 am (utc) on Sep. 20, 2006]
Every page has a unique title
Every page has a unique description
Every page has unique meta keywords
Every page has unique content
Every page is static html
All WWW is redirected to non-www ( I do this for most of my sites)
Every page even has a unique menu
Every page has the same H1 tag at the top with the company name?!?!?!?
Mmmmm... any comments? I seem to be losing a page every day. The pages lost are not going supplemental; they are just not showing up for a site:mydomain.com search.
Do I go redirect FROM www.mysite.com TO mysite.com
OR do I redirect FROM mysite.com TO www.mysite.com
Cut out the middle man and go straight to Matt Cutts
[mattcutts.com...]
Can any of the affected sites try this command in Google and give feedback on the results you see? Search Google for
"www.yourdomain.com */"
Look especially for domains starting with numeric subdomains within the results.
I'd be interested in any feedback.
Found a lot of scrapers of course (%*!@#%!)
But I also found LOTS of links to us that do not show in the link: command.
-- added
I just don't understand why Google cannot identify these scrapers.. most of the ones I've seen stealing from our site are all using the same template.
[edited by: Bewenched at 3:54 pm (utc) on Sep. 20, 2006]
Hmm, difficult choice.
- Which root index page is listed with higher PR; www or non-www?
- Is your root index page listed at just ....domain.com/index.html or just ....domain.com/ only, or at both?
- Which has more pages in the Normal Index; www or non-www URLs?
- Which domain has the most incoming links; www or non-www site?
.
You want to go with the one that gives Google the least work to correct their listings.
As well as setting up the redirect, you need to make sure that internal links also specify the correct domain. There is no point in redirecting non-www to www if internal links continue to point to non-www URLs! Those internal links need to point to www URLs too. When linking to index pages, omit the index page filename from the link; end the URL with the / only.
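To illustrate the internal link clean-up, here is a small Python sketch that rewrites an absolute link to the preferred host and drops the index filename. The function name and the list of index filenames are assumptions for the example; adjust them to whatever your server actually serves as the default document.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed default-document names; not taken from any particular server setup.
INDEX_NAMES = ("index.html", "index.htm", "index.php")


def normalize_internal_link(href: str, preferred_host: str) -> str:
    """Point an absolute internal link at preferred_host and end index URLs with '/'."""
    parts = urlsplit(href)
    host = (parts.hostname or "").lower()
    preferred = preferred_host.lower()
    # The "other" variant of the site's hostname (www vs non-www).
    other = preferred[4:] if preferred.startswith("www.") else "www." + preferred
    if host not in (preferred, other):
        return href  # external or relative link: leave it alone
    path = parts.path
    last = path.rsplit("/", 1)[-1]
    if last in INDEX_NAMES:
        path = path[: -len(last)]  # keep the trailing slash, drop the filename
    if path == "":
        path = "/"
    return urlunsplit((parts.scheme, preferred, path, parts.query, parts.fragment))


print(normalize_internal_link("http://domain.com/folder/index.html", "www.domain.com"))
print(normalize_internal_link("http://www.domain.com/page.html", "www.domain.com"))
```

Running something like this over the links in your templates gives one consistent form for every internal URL, which is what the redirect is trying to achieve from the outside.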
<continued here: http://www.webmasterworld.com/google/3092370.htm [webmasterworld.com]>
[edited by: tedster at 12:49 am (utc) on Sep. 22, 2006]