Cuil: the next Google?


Can any start-up search engine “be the next Google”? Many have wondered, and today's launch of Cuil (pronounced “cool”) may be the best test case since Google itself overtook the more established search engines. Cuil offers what appears to be a comprehensive index of the web and a unique presentation of results, and it arrives at a time when people may be willing to consider an “underdog”. The key questions now: how will the relevancy hold up? And is word of mouth still really an important part of building a service? [Note: The Cuil website was supposed to go live for searching at 9:01 pm Pacific time on July 27, but as of this writing I'm still seeing only a holding page. I expect that to change fairly quickly.]
Why Care About Cuil?

There is no end of start-ups that have tried to take on Google as a search destination. Earlier this year, my Google Challengers: 2008 Edition article covered some of these, such as Hakia, Mahalo and Wikia Search. You can add to this list other companies such as Gigablast and Exalead. None of them has made a dent in Google.

For that matter, the established players Yahoo, Microsoft and Ask.com, all of them comparable in search quality, have not dented Google either. So why does Cuil deserve special attention?

For one, Cuil has an impressive pedigree in its three founders: Tom Costello, from IBM's WebFountain project, and Anna Patterson and Russell Power, from Google's TeraGoogle project, Google's massive search index. Cuil also has former AltaVista founder Louis Monier - later of eBay and then Google - as part of the team.

These people know search. In particular, they know hard-core, heavy-duty, industrial-strength search. Not only that, they are launching what seems to be a comprehensive service that everyone can use. Google already published a blog post on Friday in response to Cuil and its size claims, even before Cuil had launched or made those claims publicly. If Google is paying that much attention, you may want to as well.

What Cuil Does

There are four main areas in which Cuil says it differs from other services. These are:

* Biggest index of the web
* Unique relevancy algorithm
* Unique display of results
* Privacy

I'm going to dive into each of these areas in depth, explaining their significance and dissecting some of the misunderstandings and public relations spin around them.

Size Wars?

Cuil claims to have the largest index of the web: 120 billion pages indexed (out of a total of 186 billion that it crawled, with spam and duplicate content, among other things, reducing what is indexed). In an interview, they said Cuil was three times the size of Google. Sounds pretty cool, doesn't it?

Sigh. Yes, size matters. You want a comprehensive collection of documents from across the web. However, having a large number of documents does not mean you are the most relevant. As I wrote back in September 2005, when Google famously dropped the document count it used to list:

In the last century, in December 1995 to be precise, AltaVista burst onto the search engine scene with what was at that time a giant index of 21 million pages, well above competitors that were in the 1 to 2 million range. The web was growing fast, and the more pages you had, the more likely you really were to find that needle in the haystack. Bigger did, to a certain extent, mean better.

This fact was not lost on the PR people. The size wars began in earnest. Lycos would speak of the number of pages that it “knew” about, even if they were not indexed or in any way accessible through its search engine. This irritated the search engine Excite so much that it posted a page on how to count URLs, as you can see archived here.

At first, bigger DID mean better, but that soon disappeared as the scale of the indexes grew from millions of pages to tens of millions. Bigger was no longer better because, for many queries, you could be overwhelmed with matches.

I have long played with the needle-in-the-haystack metaphor to explain this. You want to find the needle? You need the whole haystack, size proponents say. But if I dump the whole haystack on your head, can you find the needle then? Simply being the biggest is not enough.

That is why I and others have been saying not to fixate on size, as long ago as 1997 and 1998. Bigger no longer means better, regardless of the many size wars that continue to erupt. Remember, Google - when it rose to popularity in 1998 and 1999 - was one of the smaller search engines, in the range of 20 to 85 million pages. Despite this supposed lack of comprehensiveness, it grew and grew on the quality of its results.

Why do size wars keep erupting? Search engines have seen index size as a quick, effective way to give the impression that they are more relevant. In the absence of a relevancy figure, size figures can be trotted out, and the search engine with the biggest bar on the chart wins!

Given this history, seeing Cuil trot out size figures is very discouraging; it is a step backward, not forward. Time is better spent on other things (such as measuring the relevance of results) than on trying to count pages. Even without running queries and trying to make comparisons, I already have problems with Cuil's claims. For example:

* Cuil told us that Google is at 40 billion documents. According to whom? According to what Cuil has heard that journalists say they have heard from Google. OK, I speak with Google and with journalists regularly. I have never heard this number mentioned. After we first spoke with them, Cuil said that comparison tests it has run lead it to believe that Google has not grown.

* Yahoo, they say, is at 20 billion. Cuil said this is based on what Yahoo said in 2005, with the assumption that if Yahoo had more, it would have proclaimed it. That is a bad assumption, given that since the 2005 size wars both Google and Yahoo have given up talking about size figures.

* Microsoft, they say, is at 12 billion. In fact, Microsoft said it was at 20 billion last September - and if Cuil could not get this easily found figure right, you start to doubt the others it cites. In a follow-up, Cuil said it believes, based on its own tests, that Microsoft has moved to a smaller index of 12 billion.

We can also begin short-term testing, however. Simply run a query, see what count Google reports for it, and then do the same at Cuil. If Cuil regularly reports more, it wins. Or not. That is what people began doing en masse during the last size battle between Google and Yahoo, and then questions about duplicate content and spam started coming up.

Assuming you get past all that, any size advantage Cuil has will be short-lived if it becomes a serious challenger. Google will simply crawl more documents and ensure that whatever Cuil's number is, Google is at +1.

We asked Cuil why Google would not simply catch up.

“If they wanted to triple the size of their index, they would have to triple the size of every server and cluster. It is not easy or fast,” said Patterson.

In a follow-up, Cuil added that it believes Google's current architecture is largely the work of Patterson from her time at Google, and that since they are no longer there, increasing the index size would be a “non-trivial” exercise.

Maybe. And perhaps the infrastructure Cuil has built makes it easier for them to index documents on the web at lower cost than Google. But Google has a lot of money and engineering expertise of its own. It is delusional to believe that it would not close what could be seen as a weakness. It responded to Yahoo in 2005; it will do the same with Cuil. And for what? Even if Cuil is bigger than Google, that does not mean Cuil is more relevant. Nor does it mean that adding more documents in an “I am bigger than you” game would improve the state of search in general.

Unfortunately, Google began responding to Cuil's claim before Cuil even made it. In a post on Friday, Google made a point of saying that it “knew” of 1 trillion items on the web. That confused some people into thinking Google had indexed 1 trillion documents, even though it did not say this. What Google did say, though it was easy to miss:

We don't index every one of those trillion pages - many of them are similar to each other, or represent auto-generated content similar to the calendar example that isn't very useful to searchers. But we're proud to have the most comprehensive index of any search engine, and our goal always has been to index all the world's data.

My answer to Google - and to Cuil, and to any search engine that wants to fight the size battle - is what I said on Friday:

There is no precise answer to what counts as a useful page - and in turn there is no precise answer to who has collected the “most” of them. Tell me that you have a good chunk of the web, and I'm fine. But when Google or any other search engine starts making size claims, my hackles go up. There are better things to do.

As a side point, one issue with a large index is keeping it fresh. Cuil said it is crawling 1 to 1.5 billion pages per day, which means it takes about 3 months to refresh everything it currently has spidered. However, some important sites are revisited on a weekly basis, they said. That is good - but Google has pages that it can add almost in real time, thanks to near-instant updates.
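The refresh estimate can be checked with some quick arithmetic. This is only a back-of-the-envelope sketch in Python: it assumes the crawler revisits the 120 billion indexed pages at a constant rate and never favors any site, which Cuil says is not strictly the case. The function name is mine, not Cuil's.

```python
# Figures quoted in the article; the constant-rate assumption is a simplification.
INDEXED_PAGES = 120e9                 # pages Cuil says it has indexed
CRAWL_RATES_PER_DAY = (1.0e9, 1.5e9)  # low and high ends of the quoted crawl rate


def days_to_recrawl(pages, pages_per_day):
    """Days needed to revisit every page once at a constant crawl rate."""
    return pages / pages_per_day


for rate in CRAWL_RATES_PER_DAY:
    days = days_to_recrawl(INDEXED_PAGES, rate)
    print(f"{rate / 1e9:.1f} billion pages/day: {days:.0f} days (about {days / 30:.1f} months)")
```

At the quoted rates, a full pass works out to somewhere between 80 and 120 days, which is in line with the "about 3 months" figure.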

So Long, PageRank?

Cuil makes a big deal of ranking pages by content rather than popularity. The idea is to take a swipe at the way Google is popularly believed to reward the pages that are most popular, as measured by PageRank value.

The problem is that PageRank is only part of how Google ranks pages. It looks at a large number of other factors, so that ranking is not a popularity contest (see “What is Google PageRank? A guide for searchers and webmasters” to learn more about this topic).

The other issue is that, despite the PR spin, Cuil does in fact use popularity in its results, as far as I can tell.

For example, a search for [harry potter] on Cuil brings up the Harry Potter and the Order of the Phoenix movie website first. That is out of thousands of pages. How on earth can Cuil know, purely from the content of the page itself, that the movie website should be among the top results, particularly in a web environment where people can (and do) tailor content to what they think search algorithms want?

The answer is link analysis - counting links and seeing what they point to. The twist is that the links are used to measure how relevant pages are to what someone is searching for.

Return to [harry potter]. When you search on Cuil, it finds all the pages that it believes are related to these two words. That means pages that contain these words, as well as pages that contain related words such as “gryffindor.” It figures out these relationships by seeing what kinds of words appear together across the pages that are out there. Since “gryffindor” often appears on pages that also say “harry potter”, it can tell that those two words (well, three words - but two query terms) are related.
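To make the word-relationship idea concrete, here is a minimal sketch of co-occurrence counting in Python. It is not Cuil's method, which has not been published in this detail; the toy pages, the function names and the `min_pages` threshold are all made up for illustration, and the text is split on whitespace, so "harry potter" is treated as two separate terms.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(pages):
    """Count on how many pages each unordered pair of terms appears together."""
    counts = Counter()
    for text in pages:
        terms = sorted(set(text.lower().split()))  # unique terms, stable order
        counts.update(combinations(terms, 2))      # every pair (a, b) with a < b
    return counts


def related_terms(term, counts, min_pages=2):
    """Terms that share at least `min_pages` pages with `term`."""
    related = {}
    for (a, b), n in counts.items():
        if n >= min_pages and term in (a, b):
            related[b if a == term else a] = n
    return related


# Hypothetical toy corpus, one string per page.
toy_pages = [
    "harry potter gryffindor quidditch",
    "potter gryffindor house points",
    "harry potter movie trailer",
    "pottery wheel clay",
]
print(related_terms("potter", cooccurrence_counts(toy_pages)))
```

On this toy corpus, "gryffindor" and "harry" each share two pages with "potter", so both come back as related, while the pottery page contributes nothing. A real engine would need weighting to keep very common words from looking related to everything.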

Cuil then examines the whole set to see which pages link to one another. Those that have many important links pointing at them may be more important, or better. Since the Harry Potter movie site has many links pointing at it, it comes to the top of the results. Cuil even has a name for this - Idea panel.
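A rough sketch of that step, again in Python and again only an illustration: it counts links among the pages already judged on-topic and sorts by that count. The page names and link graph are hypothetical, and every link is weighted equally, whereas the description above implies that some links count as more "important" than others.

```python
def rank_by_inlinks(candidates, links):
    """Order candidate pages by how many other candidates link to them."""
    candidates = set(candidates)
    inlinks = {page: 0 for page in candidates}
    for source, targets in links.items():
        if source not in candidates:
            continue  # ignore links from pages outside the topic set
        for target in set(targets):  # count each source at most once per target
            if target in candidates and target != source:
                inlinks[target] += 1
    # Most linked-to first; ties broken alphabetically for a stable order.
    return sorted(inlinks, key=lambda page: (-inlinks[page], page))


# Hypothetical link graph: page -> pages it links to.
toy_links = {
    "fan-site": ["movie-site", "book-shop"],
    "book-shop": ["movie-site"],
    "news-item": ["movie-site", "fan-site"],
    "cooking-blog": ["movie-site", "fan-site"],  # off-topic, so its links are ignored
    "movie-site": [],
}
on_topic = ["movie-site", "fan-site", "book-shop", "news-item"]
print(rank_by_inlinks(on_topic, toy_links))
```

The movie site comes out on top because three on-topic pages point at it. The off-topic blog's links are skipped, which is the difference between this and a plain popularity count across the whole web.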

If that sounds familiar to some, it is because this kind of topic-based link analysis was pioneered by Teoma, which was later acquired by Ask. Teoma tried to distinguish itself from Google by saying that only it did “subject-specific” ranking. Ask still plays this up today:

Our ExpertRank algorithm goes beyond mere link popularity (which ranks pages based on the sheer volume of links pointing to a particular page) to determine popularity among pages considered to be experts on the topic of your search. This is known as subject-specific popularity. Identifying topics (also known as “clusters”), the experts on those topics, and the popularity of millions of pages amongst those experts - at the exact moment your search query is conducted - requires many additional calculations that other search engines do not perform. The result is world-class relevance that often offers a unique editorial flavor compared to other search engines.

Notably, despite the sales pitch of better analysis, Ask has never trounced Google. In addition, there are many who assume - myself included - that Google itself does topic-specific link analysis.

So, is ranking based on content Cuil's twist? As I have already said, the twist is in the link analysis. But content analysis is used in other ways, as I will also cover elsewhere.
