Forum Moderators: phranque

Search Engine Research

Some Interesting Results

JohnKing1

10:19 am on Jan 21, 2009 (gmt 0)

10+ Year Member



The following thesis "Search Engine Content Analysis" is the result of my PhD studies. I hope that you find it interesting.

[fileshare.qut.edu.au...]

Note: I do not get any hits from the above URL; it points directly to my university fileshare application.

[edited by: JohnKing1 at 11:06 am (utc) on Jan. 21, 2009]

Brett_Tabke

4:07 pm on Jan 21, 2009 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



Very interesting work that I don't think anyone has ever published before.

Can you do an "executive summary" of it for those that are not inclined to dig that far into it?

Yoshimi

4:08 pm on Jan 21, 2009 (gmt 0)

10+ Year Member



Oh that would be good, I got lost somewhere around the abstract :)

JohnKing1

5:49 am on Jan 22, 2009 (gmt 0)

10+ Year Member



I will try to explain it in simple English.

Note that if you want to skip the detailed explanation and go straight to the interesting section of the thesis, read pages 115 to 132. These contain the results of the research in graphical form.

*) Search engines allow information on almost any subject to be quickly and easily retrieved.

*) As increasingly more material becomes available electronically the influence of search engines on our lives will continue to grow.

*) There are many thousands of specialist search engines on the web.

*) It is difficult to know what information each search engine contains, what bias each search engine has and how to select the best search engine for an information need.

*) This research introduces a new method for classifying search engines. A large knowledge base is created and then mined in order to find search engine classification rules.

*) These rules are used to perform an extensive analysis of the content of the largest general purpose Internet search engines in use today. These search engines include Google, Yahoo and Microsoft Live. Several specialist search engines such as PubMed, the U.S. Department of Commerce and the U.S. Department of Agriculture were included as well.

(There is a lot more to the research, but you will have to read the entire thesis to find out more.)
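
As a rough illustration of what "classification rules" for a search engine could look like (a toy sketch only, not the method used in the thesis): assume each subject has a handful of probe terms, and that we can obtain a hit count per term from an engine. The rule table, the term lists, and the hit counts below are all made up for the example.

```python
from typing import Dict, List

# Hypothetical classification rules: subject -> probe terms.
RULES: Dict[str, List[str]] = {
    "medicine": ["aspirin", "diabetes"],
    "agriculture": ["wheat", "irrigation"],
}


def classify_engine(
    hit_counts: Dict[str, int],
    rules: Dict[str, List[str]] = RULES,
) -> Dict[str, float]:
    """Return each subject's share of the hits summed over all probe terms."""
    totals = {
        subject: sum(hit_counts.get(term, 0) for term in terms)
        for subject, terms in rules.items()
    }
    grand_total = sum(totals.values())
    if grand_total == 0:
        # No hits for any probe term: the engine cannot be profiled.
        return {subject: 0.0 for subject in totals}
    return {subject: count / grand_total for subject, count in totals.items()}


# Made-up hit counts standing in for the responses of one engine.
medical_engine = {"aspirin": 900, "diabetes": 700, "wheat": 30, "irrigation": 10}
profile = classify_engine(medical_engine)
best_subject = max(profile, key=profile.get)
```

With these invented numbers the engine's profile leans heavily towards "medicine". A real system would need far more terms per subject and some way of normalising for the overall size of each engine.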

Yoshimi

8:20 am on Jan 22, 2009 (gmt 0)

10+ Year Member



OK, that makes sense, but what is the benefit going to be of doing the classification, the "what's in it for me" factor? That's probably why I got lost in the abstract, I still can't see the purpose or the application of the research.

phranque

8:26 am on Jan 22, 2009 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



in even simpler terms, you could say that this research forms the basis for developing a search engine search engine.

Death of the Man

7:42 pm on Jan 22, 2009 (gmt 0)

10+ Year Member



But who will search the search engine search engines?

2clean

8:18 am on Jan 23, 2009 (gmt 0)

10+ Year Member



a search engine?

Yoshimi

8:27 am on Jan 23, 2009 (gmt 0)

10+ Year Member



"in even simpler terms, you could say that this research forms the basis for developing a search engine search engine."

Yep I got that, but why, what is the benefit, what use would it be put to?

Death of the Man

9:03 am on Jan 23, 2009 (gmt 0)

10+ Year Member



"But who will search the search engine search engines?"

"a search engine?"

And where does it end?

Death of the Man

9:06 am on Jan 23, 2009 (gmt 0)

10+ Year Member




"But who will search the search engine search engines?"
"a search engine?"

And where does it end?

Actually, I think Individualized Search Results is already answering this question, though it be in its infancy.

phranque

11:13 am on Jan 23, 2009 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



"what is the benefit, what use would it be put to?"

it could eventually level the playing field for smaller search engines.

suppose you were looking for a video describing a certain technique for dealing with the dough for creating a particular regional dish:
New Search Paradigm - Shifting from Text to Video [webmasterworld.com]

a search engine search engine could tell you if the answer was best found on a general purpose search engine or perhaps a video search engine, which are both obvious choices to most of us.

but it could also tell you about a cooking or food-specific search engine you had no idea existed that has terabytes of information specific to dough handling techniques.
and it might also tell you about a culturally specific search engine of which you were also unaware that could tell you why this technique was developed given the technology available or how the local agriculture affected the wheat used for this particular type of dough.

or you could try googling "i need dough" with your individualized search results turned on...

daveVk

11:20 am on Jan 23, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Meta search engines such as dogpile should be interested in this. If they can make an intelligent choice of where to search, and how to weight results, their results should improve.
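
To make that concrete, here is a minimal sketch of how a meta search engine might pick where to search and how to weight what comes back. It assumes each engine already has a per-subject score between 0 and 1 (for example from a classification like the one discussed above) and that each engine returns results with its own relevance score. The engine names, scores, and result identifiers are invented for the example.

```python
from typing import Dict, List, Tuple


def select_engines(
    profiles: Dict[str, Dict[str, float]], subject: str, k: int = 2
) -> List[Tuple[str, float]]:
    """Pick the k engines with the highest score for the given subject."""
    scored = [(name, profile.get(subject, 0.0)) for name, profile in profiles.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def merge_results(
    results: Dict[str, List[Tuple[str, float]]], weights: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Sum each result's relevance scaled by the weight of the engine it came from."""
    combined: Dict[str, float] = {}
    for engine, hits in results.items():
        weight = weights.get(engine, 0.0)
        for url, relevance in hits:
            combined[url] = combined.get(url, 0.0) + weight * relevance
    return sorted(combined.items(), key=lambda pair: pair[1], reverse=True)


# Invented subject scores for three hypothetical engines.
profiles = {
    "general": {"cooking": 0.2},
    "video": {"cooking": 0.3},
    "food": {"cooking": 0.9},
}
chosen = select_engines(profiles, "cooking", k=2)

# Invented results from the two chosen engines: (result id, relevance).
results = {
    "food": [("recipe-a", 0.8), ("recipe-b", 0.4)],
    "video": [("recipe-b", 0.9), ("recipe-c", 0.4)],
}
merged = merge_results(results, dict(chosen))
```

Here the food-specific engine gets the most weight, so its top result ranks first, while a result found by both engines is boosted above one found only by the weaker engine.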

Yoshimi

11:28 am on Jan 23, 2009 (gmt 0)

10+ Year Member



Thanks for that Phranque.

HugeNerd

2:53 pm on Jan 23, 2009 (gmt 0)

10+ Year Member



What is the working definition of 'search engine' (I understand that traditional search engines and spidering search engines are defined in the paper; I am looking for an overarching, blanket definition...a point where we say, this is a search engine vs. this is a search function)?
Does my site's internal search qualify? Does everyone running a [Mini] Google Appliance count as a part of Google? Has Google superseded 'search engine' and moved to 'search network'?

JohnKing1

5:19 pm on Jan 23, 2009 (gmt 0)

10+ Year Member



That is difficult to define.

A motivation for my research is that there are large portions of the web that are only accessible through a search interface. For example, many academic publication providers are not crawled by Google as they do not provide hyperlinks into their content that can easily be followed by a spider. The results returned by these search forms are rich in high quality content but they are effectively invisible to the average Google searcher. I don't know if they should be called search engines or search functions but they do provide access to information that would otherwise not be found. (This portion of the web that is not accessible by spiders is called the "Deep Web", and it is estimated to be many times larger than the "Surface Web".)

[edited by: JohnKing1 at 5:21 pm (utc) on Jan. 23, 2009]

HugeNerd

7:34 pm on Jan 23, 2009 (gmt 0)

10+ Year Member



Thank you! I'll just keep reading and drilling down!