Forum Moderators: bakedjake

Whole Yandex Git repository leaked

brotherhood of LAN

12:18 pm on Jan 26, 2023 (gmt 0)

martinibuster

2:22 am on Jan 27, 2023 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



What is the significance of that? I read the thread but I didn't really get a sense of exactly what kind of data is involved and what the significance to people may be.

phranque

3:00 am on Jan 27, 2023 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



the leak includes a snapshot of the repository of source code for all major Yandex services.
the significance is that it is (was) proprietary code.

it is the russian equivalent of a leak of the codebase for Google Search, googlebot, Gmail, Google Maps, Google Analytics, Google Ads, Adwords, etc.

not2easy

3:04 am on Jan 27, 2023 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Ouch!

lucy24

4:19 am on Jan 27, 2023 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



:: immediate mental picture of other search engines scrambling to snoop into the code, followed by vague mental association with mid-century engineers studying a defector’s plane: “Why on earth did they do it like that?” and “Oh willya look at that, they used aluminum!” and “Darn, now that’s ingenious” ::

mack

5:44 pm on Jan 27, 2023 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



One major problem with the leaking of closed source code is that the bad guys can now get a peek behind the UI to see what's going on in the background. It's way easier to find exploits if you have the codebase available.

Mack.

Sgt_Kickaxe

2:44 am on Jan 28, 2023 (gmt 0)



What is the significance of that? I read the thread but I didn't really get a sense of exactly what kind of data is involved and what the significance to people may be. ~ martinibuster

[searchengineland.com...]

The article emphasizes that Yandex is not Google but, personally, I don't care which successful search engine the list of 1,922 ranking factors belongs to. Each one of them should be checked to see if A) you can measure this on a website and B) it is a plausible factor.

Google makes us test everything else, so why not try something new, lol. From a quick look at the list, most of it is nothing new, but some of it looks interesting.

There is a huge discussion on Twitter tonight about the factors themselves, link in the searchengineland article above, if you're interested.

Sgt_Kickaxe

3:21 am on Jan 28, 2023 (gmt 0)



Also, just an observation. FEW seem interested in using the leaked code to get better Yandex rankings, and almost all of the code discussion, in the places you'd expect to find it, is about whether it works on Google.

A bit of a backfire if the point of the leak was to take down Yandex, lol. These are just signals; the ranking comes from the data collected, so I'd expect Yandex to simply pause collecting site metrics from most sites until they have a game plan.

This is the type of thing you'd assume is a bad thing but that turns out to be a good one, as it may force the engine to change too.

phranque

11:40 pm on Jan 30, 2023 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Michael King has published an excellent initial analysis of the leaked codebase and it is required reading for anyone interested in learning details:
So, while Yandex is certainly not Google, it’s also not some random research project that we’re talking about here. There is a lot we can learn about how a modern search engine is built from reviewing this codebase.

At the very least, we can disabuse ourselves of some obsolete notions that still permeate SEO tools like text-to-code ratios and W3C compliance or the general belief that Google’s 200 signals are simply 200 individual on and off-page features rather than classes of composite factors that potentially use thousands of individual measures.

Yandex scrapes Google and other SEO learnings from the source code leak [searchengineland.com]