- Categorize paginated
documents. [launch codename "Xirtam3",
project codename "CategorizePaginatedDocuments"] Sometimes,
search results can be dominated by documents from a paginated series. This
change helps surface more diverse results in such cases.
- More language-relevant
navigational results. [launch codename "Raquel"] For
navigational searches when the user types in a web address, such as
[bol.com], we generally try to rank that web address at the top. However,
this isn’t always the best answer. For example, bol.com is a Dutch page,
but many users are actually searching in Portuguese and are looking for
the Brazilian email service, http://www.bol.uol.com.br/. This change takes
into account language to help return the most relevant navigational
results.
- Country identification for
webpages. [launch codename "sudoku"] Location
is an important signal we use to surface content more relevant to a
particular country. For a while we’ve had systems designed to detect when
a website, subdomain, or directory is relevant to a set of countries. This
change extends the granularity of those systems to the page level for
sites that host user generated content, meaning that some pages on a
particular site can be considered relevant to France, while others might
be considered relevant to Spain.
- Anchors bug fix. [launch codename
"Organochloride", project codename "Anchors"] This
change fixed a bug related to our handling of anchors.
- More domain diversity. [launch codename
"Horde", project codename "Domain Crowding"] Sometimes
search returns too many results from the same domain. This change helps
surface content from a more diverse set of domains.
- More local sites from organizations. [project codename
"ImpOrgMap2"] This change makes it more likely you’ll find an
organization website from your country (e.g. mexico.cnn.com for Mexico
rather than cnn.com).
- Improvements to local
navigational searches. [launch codename "onebar-l"] For
searches that include location terms, e.g. [dunston mint seattle] or [Vaso Azzurro Restaurant 94043], we are more
likely to rank the local navigational homepages in the top position, even
in cases where the navigational page does not mention the location.
- Improvements to how search
terms are scored in ranking. [launch codename "Bi02sw41"]
One of the most fundamental signals used in search is whether and how your
search terms appear on the pages you’re searching. This change improves
the way those terms are scored.
- Disable salience in
snippets. [launch
codename "DSS", project codename "Snippets"] This
change updates our system for generating snippets to keep it consistent
with other infrastructure improvements. It also simplifies and increases
consistency in the snippet generation process.
- More text from the beginning
of the page in snippets. [launch codename "solar", project
codename "Snippets"] This change makes it more likely we’ll show
text from the beginning of a page in snippets when that text is
particularly relevant.
- Smoother ranking changes for
fresh results. [launch codename "sep", project
codename "Freshness"] We want to help you find the freshest
results, particularly for searches with important new web content, such as
breaking news topics. We try to promote content that appears to be fresh.
This change applies a more granular classifier, leading to more nuanced
changes in ranking based on freshness.
- Improvement in a freshness
signal. [launch codename "citron", project
codename "Freshness"] This change is a minor improvement to one
of the freshness signals which helps to better identify fresh documents.
- No freshness boost for
low-quality content. [launch codename "NoRot", project codename
"Freshness"] We have modified a classifier we use to promote fresh content
to exclude fresh content identified as particularly low-quality.
- Tweak to trigger behavior
for Instant Previews. This change narrows the trigger area for
Instant Previews so that you won’t see
a preview until you hover and pause over the icon to the right of each
search result. In the past the feature would trigger if you moused into a
larger button area.
- Sunrise and sunset search
feature internationalization. [project codename
"sunrise-i18n"] We’ve internationalized the sunrise and sunset search feature to 33
new languages, so now you can more easily plan an evening jog before dusk
or set your alarm clock to watch the sunrise with a friend.
- Improvements to currency
conversion search feature in Turkish. [launch codename
"kur", project codename "kur"] We launched
improvements to the currency conversion search feature in Turkish. Try
searching for [dolar kuru], [euro ne kadar], or [avro kaç para].
- Improvements to news
clustering for Serbian. [launch codename "serbian-5"] For
news results, we generally try to cluster articles about the same story
into groups. This change improves clustering in Serbian by better grouping
articles written in Cyrillic and Latin. We also improved our use of
“stemming” — a technique that relies on the “stem” or root of a word.
- Better query interpretation. This launch helps us
better interpret the likely intention of your search query as suggested by
your last few searches.
- News universal results
serving improvements. [launch codename "inhale"]
This change streamlines the serving of news results on Google by shifting
to a more unified system architecture.
- UI improvements for breaking
news topics. [launch codename "Smoothie", project
codename "Smoothie"] We’ve improved the user interface for news
results when you’re searching for a breaking news topic. You’ll often see
a large image thumbnail alongside two fresh news results.
- More comprehensive
predictions for local queries. [project codename
"Autocomplete"] This change improves the comprehensiveness of
autocomplete predictions by expanding coverage for long-tail U.S. local
search queries such as addresses or small businesses.
- Improvements to triggering
of public data search feature. [launch codename
"Plunge_Local", project codename "DIVE"] This launch
improves triggering for the public data search feature, broadening the
range of queries that will return helpful population and unemployment
data.
- Adding Japanese and Korean
to error page classifier. [launch codename "maniac4jars",
project codename "Soft404"] We have signals designed to detect
crypto 404 pages (also known as “soft 404s”), pages that return valid text
to a browser, but the text only contains error messages, such as “Page not
found.” It’s rare that a user will be looking for such a page, so it’s
important we be able to detect them. This change extends a particular
classifier to Japanese and Korean.
- More efficient generation of
alternative titles. [launch codename "HalfMarathon"] We
use a variety of signals to generate titles in search results. This change
makes the process more efficient, saving tremendous CPU resources without
degrading quality.
- More concise and/or
informative titles. [launch codename "kebmo"] We look at
a number of factors when deciding what to show for the title of a search
result. This change means you’ll find more informative titles and/or more
concise titles with the same information.
- Fewer bad spell corrections
internationally. [launch codename "Potage",
project codename "Spelling"] When you search for [mango tea], we
don’t want to show spelling predictions like “Did you mean ‘mint tea’?” We
have algorithms designed to prevent these “bad spell corrections” and this
change internationalizes one of those algorithms.
- More spelling corrections
globally and in more languages. [launch codename "pita",
project codename "Autocomplete"] Sometimes autocomplete will
correct your spelling before you’ve finished typing. We’ve been offering
advanced spelling corrections in English, and recently we extended the
comprehensiveness of this feature to cover more than 60 languages.
- More spell corrections for
long queries. [launch codename
"caterpillar_new", project codename "Spelling"] We
rolled out a change making it more likely that your query will get a spell
correction even if it’s longer than ten terms. You can watch uncut footage of when we decided to
launch this from our past blog post.
- More comprehensive
triggering of “showing results for” goes international. [launch codename
"ifprdym", project codename "Spelling"] In some cases
when you’ve misspelled a search, say [pnumatic], the results you find will
actually be results for the corrected query, “pneumatic.” In the past, we
haven’t always provided the explicit user interface to say, “Showing
results for pneumatic” and the option to “Search instead for pnumatic.” We
recently started showing the explicit “Showing results for” interface more
often in these cases in English, and now we’re expanding that to new languages.
- “Did you mean” suppression
goes international. [launch codename "idymsup",
project codename "Spelling"] Sometimes the “Did you mean?”
spelling feature predicts spelling corrections that are accurate, but
wouldn’t actually be helpful if clicked. For example, the results for the
predicted correction of your search may be nearly identical to the results
for your original search. In these cases, inviting you to refine your
search isn’t helpful. This change first checks a spell prediction to see
if it’s useful before presenting it to the user. This algorithm was
already rolled out in English, but now we’ve expanded to new languages.
- Spelling model refresh and
quality improvements. We’ve refreshed spelling models and launched
quality improvements in 27 languages.
- Fewer autocomplete
predictions leading to low-quality results. [launch codename
"Queens5", project codename "Autocomplete"] We’ve
rolled out a change designed to show fewer autocomplete predictions
leading to low-quality results.
- Improvements to SafeSearch
for videos and images. [project codename
"SafeSearch"] We’ve made improvements to our SafeSearch signals
in videos and images mode, making it less likely you’ll see adult content
when you aren’t looking for it.
- Improved SafeSearch models. [launch codename
"Squeezie", project codename "SafeSearch"] This change
improves our classifier used to categorize pages for SafeSearch in 40+
languages.
- Improvements to SafeSearch
signals in Russian. [project codename
"SafeSearch"] This change makes it less likely that you’ll see
adult content in Russian when you aren’t looking for it.
- Increase base index size by
15%. [project codename "Indexing"] The
base search index is our main index for serving search results and every
query that comes into Google is matched against this index. This change
increases the number of documents served by that index by 15%. Note:
We’re constantly tuning the size of our different indexes and changes may
not always appear in these blog posts.
- New index tier. [launch codename
"cantina", project codename "Indexing"] We keep our
index in “tiers” where different documents are indexed at different rates
depending on how relevant they are likely to be to users. This month we
introduced an additional indexing tier to support continued comprehensiveness
in search results.
- Backend improvements in
serving. [launch codename "Hedges", project
codename "Benson"] We’ve rolled out some
improvements to our serving systems making them less computationally
expensive and massively simplifying code.
- “Sub-sitelinks” in expanded
sitelinks. [launch codename "thanksgiving"]
This improvement digs deeper into megasitelinks by showing sub-sitelinks
instead of the normal snippet.
- Better ranking of expanded
sitelinks. [project codename
"Megasitelinks"] This change improves the ranking of
megasitelinks by providing a minimum score for the sitelink based on a
score for the same URL used in general ranking.
- Sitelinks data
refresh. [launch codename "Saralee-76"]
Sitelinks (the links that appear beneath some search results and link
deeper into the site) are generated in part by an offline process that
analyzes site structure and other data to determine the most relevant
links to show users. We’ve recently updated the data through our offline
process. These updates happen frequently (on the order of weeks).
- Less snippet duplication in
expanded sitelinks. [project codename
"Megasitelinks"] We’ve adopted a new technique to reduce
duplication in the snippets of expanded sitelinks.
- Movie showtimes search
feature for mobile in China, Korea and Japan. We’ve expanded our
movie showtimes feature for mobile to China, Korea and Japan.
- No freshness boost for
low-quality sites. [launch codename "NoRot", project codename
"Freshness"] We’ve modified a classifier we use to promote fresh content
to exclude sites identified as particularly low-quality.
- MLB search feature. [launch codename
"BallFour", project codename "Live Results"] As the
MLB season began, we rolled out a new MLB search feature. Try searching
for [sf giants score] or [mlb
scores].
- Spanish football (La Liga)
search feature. This feature provides scores and information
about teams playing in La Liga. Try searching for [barcelona fc] or [la
liga].
- Formula 1 racing search
feature. [launch codename "CheckeredFlag"]
This month we introduced a new search feature to help you find Formula 1
leaderboards and results. Try searching [formula
1] or [mark webber].
- Tweaks to NHL search
feature. We’ve improved the NHL search feature so it’s
more likely to appear when relevant. Try searching for [nhl
scores] or [capitals score].
- Keyword stuffing classifier
improvement. [project codename "Spam"] We have
classifiers designed to detect when a website is keyword stuffing. This change made the
keyword stuffing classifier better.
- More authoritative
results. We’ve tweaked a signal we use to surface more
authoritative content.
- Better HTML5 resource
caching for mobile. We’ve improved caching of different
components of the search results page, dramatically reducing latency in a
number of cases.
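
The “Did you mean” suppression item in the list above describes checking whether a predicted spelling correction would actually change what the user sees. Google hasn’t published how it does this, so what follows is only a rough sketch of the idea, assuming results are compared by URL and assuming a made-up overlap threshold. The function names (`result_overlap`, `should_show_did_you_mean`) and the `max_overlap` value are hypothetical and not taken from Google.

```python
def result_overlap(original_results, corrected_results):
    """Jaccard overlap between two result lists, compared by URL.

    Assumption: each result is represented by its URL string.
    """
    a, b = set(original_results), set(corrected_results)
    if not a and not b:
        # Two empty result sets are treated as identical.
        return 1.0
    return len(a & b) / len(a | b)


def should_show_did_you_mean(original_results, corrected_results, max_overlap=0.8):
    """Return True only if the correction leads to noticeably different results.

    max_overlap is an illustrative threshold, not a known Google value.
    """
    return result_overlap(original_results, corrected_results) < max_overlap


if __name__ == "__main__":
    original = ["a.example", "b.example", "c.example", "d.example"]
    nearly_same = ["a.example", "b.example", "c.example", "d.example"]
    different = ["a.example", "x.example", "y.example", "z.example"]

    # Same results either way: suggesting the correction adds nothing.
    print(should_show_did_you_mean(original, nearly_same))
    # Mostly different results: the suggestion may be worth showing.
    print(should_show_did_you_mean(original, different))
```

A set-based overlap ignores ranking order, which keeps the sketch short. A real system would presumably weigh the top positions more heavily.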
Friday, May 4, 2012
Google Algorithm Changes for April: Big List Released
As expected, Google has finally released its big list of algorithm changes for
the month of April. It’s been an interesting month, to say the least, with not
only the Penguin update but also a couple of Panda updates sprinkled in. There’s
not a whole lot about either of those on this list, however, which is a
testament to just how many things Google is always doing to change its
algorithm: signals (some of them, at least) that could help or hurt you in
ways beyond the hugely publicized updates.
We’ll certainly be digging a bit more into some of these in forthcoming articles. At
a quick glance, I noticed a few more freshness-related tweaks. Google has also
expanded its base index by 15%, which is interesting. As far as Penguin goes,
Google does mention: “Keyword stuffing classifier improvement. [project
codename "Spam"] We have classifiers designed to detect when a
website is keyword stuffing. This change made the keyword stuffing classifier
better.”
Keyword stuffing is against Google’s quality guidelines, and was one of the specific
things Matt Cutts mentioned in his announcement of the update.
Interestingly, unlike previous lists, there is no mention of Panda whatsoever on this list,
though there were two known Panda data refreshes during April.
Here’s the list in its entirety: