Scale: How Large Amounts of Information Are Changing the World
By: Mike • Essay • 1,691 Words • December 14, 2008 • 1,980 Views
Since many people know that I work for Google, I get a lot of mail with odd questions, complaints about Google policies, or questions about how Google does some particular thing. Obviously, I can't answer questions about Google. And even if I could, I wouldn't. This is not a Google blog; it is my personal blog, a hobby I pursue in my free time.
But my work and my hobby do intersect. One idea underlies many of the questions I receive and also touches both my work and my hobby: the idea of scale. Scale is about how things change as they get bigger. In particular, I am talking about the scale of information: the amount of information we deal with every day has grown dramatically and, surprisingly, that growth has gone unnoticed by most people.
What is scale?
Take some body of information and think about how you would analyze it. Suppose you have a list of music tracks. I have 4,426 of them on my computer. Now suppose I want to go through them and compile a list. If all I need is simple information such as song title, artist, length, and genre, that is easy. Any modern computer can do it in a few seconds. Hell, many modern phones can do it in a few seconds.
Now suppose that instead of 4,000 tracks you have 4 million. That is still not much. I can analyze the metadata for 4 million tracks in various ways within a few minutes. For a modern computer, that is nothing.
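To make the cheap case concrete, here is a minimal sketch of a metadata-only summary. The `Track` record, its field names, and the sample entries are hypothetical and only for illustration. The point is that the work grows linearly with the number of tracks and touches only a few short fields per track.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    # Hypothetical record: field names are assumptions made for this sketch.
    title: str
    artist: str
    seconds: int
    genre: str


def summarize(tracks):
    """Linear-time summary over metadata: tracks per genre and total playing time."""
    per_genre = Counter(t.genre for t in tracks)
    total_seconds = sum(t.seconds for t in tracks)
    return per_genre, total_seconds


# Made-up sample data standing in for a real library listing.
tracks = [
    Track("Song A", "Artist 1", 215, "rock"),
    Track("Song B", "Artist 2", 187, "jazz"),
    Track("Song C", "Artist 1", 242, "rock"),
]

per_genre, total_seconds = summarize(tracks)
print(per_genre.most_common(), total_seconds)
```

Nothing here reads the audio itself, which is why the same approach still works when the list grows from thousands of entries to millions.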
But what if I want to do something more interesting, such as looking at the files themselves and trying to compute the tempo? Instead of processing 4 million sets of strings, computing the tempo means churning through the full content of every song. The files in my music collection average a few megabytes each; for simplicity, take one song to be one megabyte. Analyzing 4,000 songs of one megabyte each is quite feasible on a modern desktop PC, though it will take a fair amount of time. Doing the same with 4 million songs is not even worth attempting. I can buy a disk that will hold that much data, but I cannot analyze it in any reasonable time.
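A back-of-envelope calculation shows how the same task changes character. The sketch below uses the one-megabyte-per-song simplification from the text. The throughput figure of 100 MB/s is purely an assumption for illustration, not a measurement, and the function name is made up. It counts only the time to read the bytes, so real tempo analysis, which does far more work per byte, would take considerably longer.

```python
MB = 10**6  # decimal megabyte, in bytes


def scan_time_seconds(num_tracks, bytes_per_track, bytes_per_second):
    """Lower bound on processing time: the time needed just to read every byte once."""
    return num_tracks * bytes_per_track / bytes_per_second


# Assumption: a sustained read rate of 100 MB/s on a single machine.
ASSUMED_THROUGHPUT = 100 * MB

small = scan_time_seconds(4_000, 1 * MB, ASSUMED_THROUGHPUT)
large = scan_time_seconds(4_000_000, 1 * MB, ASSUMED_THROUGHPUT)

print(f"4,000 tracks (4 GB):     {small:.0f} s just to read")
print(f"4,000,000 tracks (4 TB): {large / 3600:.1f} h just to read")
```

Under these assumptions the small collection is read in well under a minute, while the large one needs many hours before any analysis has even started. The factor of 1,000 in data size turns a desktop task into one that calls for many machines working in parallel.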
Music services such as LastFM, Pandora, and MusicBrainz store information on tens of millions of tracks, and many of us use them every day without thinking about what goes on behind the scenes. LastFM and Pandora analyze millions of tracks so that they can suggest the songs most similar to what you are listening to now, based on the structure of the music itself. (Translator's note: I am not sure about Pandora, but MusicBrainz does analyze the structure.) Ten years ago, doing this on your own computer was impossible. Now it is commonplace.
That is scale: how things change as the size of the task grows. The volume of data keeps growing, and sooner or later the task crosses a line beyond which it changes not only quantitatively but also qualitatively.
The effects of scale sometimes run counter to intuition. In some cases, making a problem bigger makes it easier. Some things are easier to do with a very large body of data than with a small sample, and some things are easy with a small amount of data but not with a large one.
Let's start with the first kind. Scale lets you do remarkable things that were previously impossible. For example, you can record the entire genomes of large numbers of organisms. Last week my blogging friend Tara Smith of Aetiology wrote about a recent study of the Ebola virus in which doctors studying its spread wanted to know which strain they were dealing with. They isolated its genetic material, sequenced it, compared it with the existing library of Ebola genomes, and identified it as a new strain, though one related to a known strain. Scientists now routinely sequence viral genomes and compare them with known variants. In fact, they did much more. To quote one specialist site:
"Thanks to explosive growth in genetics, a number of databases containing the fully sequenced genomes of many viruses have been created and made available on the Internet. Using the new Ebola genome data, the scientists compared it with other viruses. When the decoded genome and its parts were checked against the database, interesting details emerged. Matches were found between immunosuppressive sequences in oncogenic retroviruses, the murine and feline leukemia viruses, and part of the new Ebola strain."
In plain language: they did not just compare the new variant with the already known Ebola strains. That is not so hard, because the Ebola genome is only about 19 kilobases long. They compared it with the sequenced genomes of all known viruses, and found nearly identical regions in the Ebola virus and in the feline