On the Evaluation Methods of Search Engines
A long time ago, search engines were not the field of a hundred flowers in bloom that they are today. People's requirements were low: as long as an engine could find related websites on the Internet, users were satisfied, whether it found a few more sites or a few fewer. So at that time, people evaluated search engines by taking a few keywords and comparing search speed, the number of results and the number of unrelated websites. In short, that covered everything. Search engine technology did not differ much from one engine to the next, so this evaluation method was feasible.
Since then, distinctive search engine technologies have emerged one after another, and the field is now clearly in its Warring States period. Yet people's evaluation methods have hardly changed: the common evaluation still simply takes several keywords and compares search speed, the number of search results and the accuracy of each engine's results.
To take a recent example: after the Ask Jeeves upgrade in the first quarter of 2001, you can pick up any phone and dial the Ask Jeeves phone number, or click a link on the page to enter online voice chat and talk through your computer's microphone and speakers. You simply speak your request. It converts your voice into text, analyzes the request, looks for the answer among 7 million standard answers, another 2 million items in its multimedia repository and the Internet, and then converts what it finds back into voice to answer you.
Imagine you ask, "The recent American election is still undecided; what do Americans think?" After a while, the computer or telephone answers: "According to the latest survey, if Bush is finally elected, 80% of Americans will accept him as the legitimate president; if Gore is finally elected, 79% of Americans will accept him as the legitimate president." If you ask, "Who scored in the last World Cup final?", it answers with the names and plays audio and video clips of the goals for you to enjoy (provided, of course, that you are using a computer and not a phone). Although Ask Jeeves believes its speech handling and search speed have reached a commercially usable level, it will surely still be immature in many ways. If you take a few keywords and test its search speed, precision and recall against many ordinary search engines, where would it rank? If it ranks near the bottom, is it a lousy search engine?
Evaluating Internet search engines is a very difficult thing to begin with. Many evaluation results are meant for ordinary Internet users, so they are bound to include portals such as Yahoo and Sina. For these portals, Internet search is only one part of what they offer, so what should be done about their other kinds of search? If you leave them out, you ignore something many netizens use; if you count them, the comparison becomes a mess, and where would you even start?
Here, let's analyze the flaws of several important evaluation elements:
One: recall
Since it is a search engine, it must first of all be able to find things; if it cannot, there hardly seems any need to evaluate the rest. The number of indexed pages that each search engine announces cannot be fully trusted, whereas the number of results returned for a keyword search is plain to see, so evaluations generally include this item.
But this item still has many problems. Even for the most modest search engine, I can find a set of keywords that proves its results are the most complete. Although the engines differ in the number of pages they index, their robot and spider programs, index scope and indexing standards also differ, so even the biggest search engine will miss things that a much smaller engine can find.
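The point about keyword choice can be made concrete with a toy comparison. The following is only a sketch with invented data: the two index dictionaries, the page IDs and the function names are all hypothetical, and real engines obviously do not expose their indexes this way.

```python
# Hypothetical toy indexes: keyword -> set of page IDs each engine has indexed.
big_engine = {
    "weather": {1, 2, 3, 4, 5, 6},
    "football": {7, 8, 9, 10},
    "paganini": {11},
}
small_engine = {
    "weather": {1, 2},
    "football": {7},
    "paganini": {11, 12, 13},
}


def result_counts(keyword):
    """Return (big, small) result counts for one keyword."""
    return len(big_engine.get(keyword, ())), len(small_engine.get(keyword, ()))


def winner(keywords):
    """Name the engine with the larger total result count over the chosen keywords."""
    big = sum(result_counts(k)[0] for k in keywords)
    small = sum(result_counts(k)[1] for k in keywords)
    if big == small:
        return "tie"
    return "big" if big > small else "small"


print(winner(["weather", "football"]))  # the larger index wins on these words
print(winner(["paganini"]))             # the smaller index wins on this one
```

Whoever picks the keyword list effectively picks the winner, which is why result counts on a handful of keywords say little about overall recall.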
Some search engines support searching for words such as "about", "of" and "ah". Which evaluation has ever mentioned that?
Besides the content of the keywords being hard to choose, their length is hard to settle as well.
Some search engines do not support searching for a single Chinese character; how do you count that? Evaluations generally test only single-keyword searches, so what about multi-keyword searches? What about searches for long sentences, and how long should they be? Some search engines even accept a whole article or fragment as the query, so their results cannot be compared with those of a keyword search, to say nothing of features that other engines simply lack. Then there are semantic search engines like Excite and engines that support fuzzy search: for keywords where other engines find very few results or none at all, they return a whole pile of results. How do you compare those?
Finally, search engines can optimize their results for specific keywords, so who can guarantee the fairness of an evaluation? If one of the evaluated engines knows the keywords in advance, it can easily optimize for them and take first place.
Two: search speed
After recall comes search speed. If a search engine indexes more pages but each search takes five or six seconds or longer, it should simply be eliminated; there is no point in comparing any further.
The first problem with speed is again the keywords: being fast on single-keyword searches does not mean being fast on multi-keyword searches.
Then there is the problem of traffic: comparing a search engine that receives more than one hundred million visits a day with one that receives tens of thousands is unfair. There is also the number of indexed pages: if one search engine indexes 1 billion pages and another indexes ten million, and each searches its own database for the same keywords, how can comparing their speed convince anyone?
There is also the optimization problem. Some search engines keep search results in memory to speed up repeated keywords: even if the first search for a word took 10 seconds, the second may take 2 seconds, and by the third or fourth time, when you come to test it, it always takes 0.0001 seconds. So if you test with a common word, the speed is amazing; if you test with a rare word, you may wait a long time for anything to come out. Which keywords should you choose? How many of them should be common ones? This is really a muddled account.
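The effect of such result caching on a speed test can be sketched as follows. This is a deliberately simplified model, not any real engine's design: the class name, the two-level cold/warm timing (the 10-second and 0.0001-second figures are the ones from the text, with the intermediate 2-second step left out) and the keyword lists are all assumptions for illustration.

```python
class CachedEngine:
    """Toy engine: the first search for a keyword is slow, repeats come from memory."""

    def __init__(self, cold_seconds=10.0, warm_seconds=0.0001):
        self.cold_seconds = cold_seconds
        self.warm_seconds = warm_seconds
        self._seen = set()  # keywords whose results are already cached

    def search_time(self, keyword):
        """Return the simulated time in seconds for one search."""
        if keyword in self._seen:
            return self.warm_seconds
        self._seen.add(keyword)
        return self.cold_seconds


def average_time(engine, keywords):
    """Average simulated search time over a list of test keywords."""
    times = [engine.search_time(k) for k in keywords]
    return sum(times) / len(times)


engine = CachedEngine()
# Other users have already searched for the popular words before the test.
for popular in ["news", "music"]:
    engine.search_time(popular)

common_avg = average_time(engine, ["news", "music"])       # all cache hits
rare_avg = average_time(engine, ["paganini", "caprice"])   # all cache misses
print(common_avg, rare_avg)
```

The same engine looks thousands of times faster or slower depending only on which test words the evaluator happened to pick.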
Search engines do not run on a local machine in a lab; they serve ordinary users, so search time should also include the time needed to load the search interface and to transmit the search results. Suppose one search engine takes 0.0001 seconds to search but 3 seconds to deliver the page, while another takes 0.5 seconds to search but one second to deliver the page. Which search engine would you say is faster? In real use, would you rather see the search results after 3.0001 seconds or after 1.5 seconds?
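The arithmetic in this example is easy to check. In the sketch below, the engine labels "A" and "B" and the function name are invented for illustration; the timings are the ones given in the text.

```python
def total_time(search_seconds, transfer_seconds):
    """Time the user actually waits: server-side search plus page delivery."""
    return search_seconds + transfer_seconds


# (search time, page delivery time) in seconds, taken from the example in the text.
engines = {"A": (0.0001, 3.0), "B": (0.5, 1.0)}

fastest_by_search = min(engines, key=lambda name: engines[name][0])
fastest_for_user = min(engines, key=lambda name: total_time(*engines[name]))

print(fastest_by_search)  # ranking by the reported search time only
print(fastest_for_user)   # ranking by what the user experiences
```

Ranking by the reported search time alone picks engine A, while ranking by the end-to-end wait picks engine B.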
Three: precision
This is very important. A search may be complete and fast, but if the result you want is buried who knows how many pages down, what use is it? Such a search engine is only useful when searching for rare things, but to search for rare things you should go to a meta search engine, so why use this one? The criteria for precision are hard to determine and depend on what you are looking for: finding one specific website is not the same task as finding a class of similar websites. The key to precision is what to search for and which keywords to choose. The evaluator can decide both arbitrarily, which in turn affects the reliability of the evaluation results.
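For reference, here is a minimal sketch of how precision and recall are usually computed, together with the rank of the first relevant result. The result list and the two judges' relevance sets are invented; the point is only that identical search results score very differently depending on what the evaluator decides counts as relevant.

```python
def precision(retrieved, relevant):
    """Fraction of the returned results that are relevant."""
    returned = set(retrieved)
    if not returned:
        return 0.0
    return len(returned & set(relevant)) / len(returned)


def recall(retrieved, relevant):
    """Fraction of the relevant pages that were returned."""
    wanted = set(relevant)
    if not wanted:
        return 0.0
    return len(set(retrieved) & wanted) / len(wanted)


def first_relevant_rank(retrieved, relevant):
    """1-based position of the first relevant result, or None if there is none."""
    for rank, page in enumerate(retrieved, start=1):
        if page in relevant:
            return rank
    return None


retrieved = ["a", "b", "c", "d"]        # one engine's ranked results (hypothetical)
judge_strict = {"d"}                    # "I wanted one specific website"
judge_loose = {"a", "b", "d", "e"}      # "any similar website will do"

for judge in (judge_strict, judge_loose):
    print(precision(retrieved, judge),
          recall(retrieved, judge),
          first_relevant_rank(retrieved, judge))
```

Under the strict judge the wanted page is the last of the four results; under the loose judge the very first result already counts as a hit.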
Four: dead link
Search engines in general return some results that lead nowhere, from as little as one or two percent to as much as eight or nine percent, and this is often used as one of the evaluation criteria. But because Google keeps web snapshots, it has almost no dead link problem: even if a site in the search results has closed, you can still see the copy of the page that Google stores itself. How do you count dead links of this kind?
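The ambiguity can be shown with a small calculation. The result records, field names and URLs below are hypothetical; the sketch only shows that the dead-link rate changes depending on whether a stored snapshot is allowed to rescue a dead link.

```python
# Hypothetical result records: is the target site reachable, and does the
# engine keep its own stored copy (snapshot) of the page?
results = [
    {"url": "http://example.com/a", "live": True, "has_snapshot": True},
    {"url": "http://example.com/b", "live": True, "has_snapshot": False},
    {"url": "http://example.com/c", "live": False, "has_snapshot": True},
    {"url": "http://example.com/d", "live": False, "has_snapshot": False},
]


def dead_link_rate(results, snapshot_counts_as_alive):
    """Share of results the user cannot read under the chosen counting rule."""
    dead = 0
    for r in results:
        readable = r["live"] or (snapshot_counts_as_alive and r["has_snapshot"])
        if not readable:
            dead += 1
    return dead / len(results)


print(dead_link_rate(results, snapshot_counts_as_alive=False))
print(dead_link_rate(results, snapshot_counts_as_alive=True))
```

With these four records the rate is one half under the strict rule and one quarter when snapshots count, so the evaluator's counting rule matters as much as the engine.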
Five: user burden
I haven't seen anyone in China use this criterion to evaluate search engines, but it is an important factor in judging their pros and cons, and it covers many aspects. Search engines are for human use, so they must be comfortable, convenient and quick for people; anything that hinders or delays the user's access to the final search results counts as user burden.
The first is the search interface: a pure search engine page with nothing but a search box and a portal page crowded with ads and a large amount of other content place very different burdens on the user.
The second is the description of the search results: whether the text describing each result is long or short; whether that text is taken from the part of the indexed page that contains the keyword, from the first few lines of the page, or from its main content; what color the keywords are highlighted in; whether the page address is displayed; and how the results page is laid out. All of these make a big difference to the user's search burden.
Another factor is the steps the user has to take: whether the search can be started with the mouse, whether the results page shows only 10 results, whether paging is convenient, whether there are two search boxes or one, whether the box is at the top or the bottom, and whether the keyword is still displayed in the search box after a search. Every one of these affects search efficiency.
Six: others
Whether directory search is supported, how often the index database is updated, the stability of the search engine, and the ability to support advanced search should also be evaluated.
One person cannot think of everything. There may be other important evaluation elements that I have not mentioned; if you think of any, I hope you will let me know. Having read this far, everyone should understand the limitations of the search engine evaluation methods in common use today. Of course, the most ridiculous thing is that some Chinese search engine evaluations this year did not even include Google, whether out of ignorance, trickery or some special selection criteria I cannot tell. It is like drawing up a long list of famous violinists and leaving out Paganini.
It's really hard to evaluate a search engine.