May 18, 2007

Let’s guess the secret digg algorithm

Posted in: digg

It’s becoming harder and harder to reach the Digg homepage, and of course Digg’s designers don’t tell you much about how it works. In the beginning, a story was promoted to the homepage once it collected a minimum number of positive votes. But they quickly discovered that it was quite easy for a spammer to vote several times with different accounts or from different IPs. As I’m a curious person, I wanted to guess how the promotion process works, so I’ll attempt a logical analysis based on examples.

1 - Let’s first gather some examples

At the time of writing, here are two stories that just became “newly popular”:

[Screenshots of the two Digg stories]

As you can see, the first story, in the category “PC Games”, got 38 diggs and 11 comments before being promoted. The second one, in the category “Nintendo Wii”, got 34 diggs and 9 comments.

2 - The parameters that will get you to the frontpage

We know that many factors must be taken into account to reach the homepage. Promotion is not based only on how many diggs your story has collected. Here is what I think is important (listed in priority order):

  • The number of votes (diggs) you get, but more importantly the speed at which you get them. If a story is interesting, Digg’s designers presumably expect it to be dugg quickly. If you get 30 votes in 30 minutes you will probably reach the homepage whatever the other parameters are. From this we can calculate the diggs/hour ratio.
  • The category in which you submitted the story. Obviously it will be harder to reach the homepage from a hot category, where many more stories are submitted. Some categories also get far more visits than others, so story visibility is not the same. The number of votes needed is therefore probably adapted accordingly.
  • The “rank” of the users who voted for the story. The problem is that this rank is no longer exposed. It is probably based on the user’s activity on Digg: number of diggs, “submitted to promoted” ratio, number of comments posted and their votes, date of registration on Digg, etc.
  • The number of comments the story got before being promoted. Without comments, you can’t get to the homepage. But the comments could be saying “this is spam!”, so the formula probably also takes the comment votes into account. Of course, if the sum of your votes is negative, you won’t get to the homepage.
  • The number of “buries” you get. There is probably a maximum number of buries allowed for a story to reach the homepage.
  • The time of day (and day of the week) the story was submitted. Obviously a story submitted during the night (or when Digg traffic is at its lowest) should need fewer votes than a story submitted when the site is crowded. This can also be filed under the visibility of the story.
  • The weight of each vote. Some votes count more in the formula than others: typically, a user with a high “submitted to promoted” ratio will have a vote with a bigger impact than a newbie’s. This parameter gives a sort of user quality ranking, even if I think it is highly open to criticism. You can’t judge the quality of a user on this, because we all know that great or interesting stories are not the only ones that get to the homepage…
  • The number of friends the submitter has. We can easily imagine that when you have lots of friends you can ask them to digg your story, so it’s easier for you than for a newbie.
  • The “submitted to promoted” ratio of the story submitter.
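As a rough illustration of the diggs/hour ratio and of the vote weighting mentioned in the list above, here is a minimal Python sketch. It assumes the ratio is simply the number of diggs divided by the hours elapsed since submission, and that each vote could count as 1 plus the digger’s “submitted to promoted” ratio. The function names, that weighting rule and the sample values are all hypothetical; the real Digg formula is not public.

```python
from datetime import datetime, timedelta


def diggs_per_hour(digg_count, submitted_at, observed_at):
    """Average diggs per hour between submission and the observation time."""
    elapsed_hours = (observed_at - submitted_at).total_seconds() / 3600
    if elapsed_hours <= 0:
        raise ValueError("observed_at must be later than submitted_at")
    return digg_count / elapsed_hours


def weighted_diggs(digger_sp_ratios):
    """Sum of votes where each vote counts 1 plus the digger's S/P ratio.

    This weighting is an assumption made for illustration only.
    """
    return sum(1 + ratio for ratio in digger_sp_ratios)


# The example from the list above: 30 votes in 30 minutes.
submitted = datetime(2007, 5, 18, 19, 0)  # arbitrary submission time
rate = diggs_per_hour(30, submitted, submitted + timedelta(minutes=30))
print(rate)  # 60.0

# Three hypothetical diggers with S/P ratios of 0%, 50% and 25%.
print(weighted_diggs([0.0, 0.5, 0.25]))  # 3.75
```

With such a weighting, three votes from experienced users would count for more than three votes from brand new accounts, which is the effect guessed at above.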

Summing up the example-based analysis in a table, this is what we obtain:

Digg analysis

Category     | Diggs before promotion | Diggs/hour | Comments at promotion | S/P ratio of diggers | Average rating | Submission time (UTC) | Submitter S/P
PC Games     | 37                     | 2.8        | 11                    | ?                    | +7             | 7:20 PM               | 50%
Nintendo Wii | 34                     | 2.25       | 9                     | ?                    | +41            | 9:20 PM               | 38%
Linux/Unix   | 43                     | 2.04       | 7                     | ?                    | +5             | 11:00 PM              |
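One quick check the table allows: dividing the diggs before promotion by the diggs/hour ratio gives the approximate time each story needed to be promoted. The sketch below assumes the ratio was computed as diggs divided by hours since submission, which is not stated explicitly here; the variable names are only illustrative.

```python
# (category, diggs before promotion, diggs/hour) taken from the table above
rows = [
    ("PC Games", 37, 2.8),
    ("Nintendo Wii", 34, 2.25),
    ("Linux/Unix", 43, 2.04),
]

# Implied hours between submission and promotion, rounded to one decimal.
hours_to_promotion = {
    category: round(diggs / rate, 1) for category, diggs, rate in rows
}

for category, hours in hours_to_promotion.items():
    print(f"{category}: about {hours} hours")
# PC Games: about 13.2 hours
# Nintendo Wii: about 15.1 hours
# Linux/Unix: about 21.1 hours
```

Under that assumption, the three stories took roughly 13 to 21 hours to be promoted.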

Something very difficult to obtain is the S/P ratio of the users who dugg the story, because you have to know the date and time of each digg in order to keep only the ones that contributed to the promotion. We don’t want to analyze the diggs made after the story was promoted.
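If the digg timestamps were available, the filtering step could be sketched like this in Python. The data layout (a list of user/timestamp pairs and a dictionary of S/P ratios), the function names and all the sample values are made up for the example; they do not come from Digg.

```python
from datetime import datetime


def promoting_diggers(diggs, promoted_at):
    """Return the users whose digg was cast before the story was promoted."""
    return [user for user, dugg_at in diggs if dugg_at < promoted_at]


def average_sp_ratio(users, sp_ratios):
    """Average S/P ratio of the given users, skipping users with no known ratio."""
    known = [sp_ratios[user] for user in users if user in sp_ratios]
    return sum(known) / len(known) if known else None


# Hypothetical data: three diggs, one of them cast after promotion.
promoted_at = datetime(2007, 5, 18, 12, 0)
diggs = [
    ("alice", datetime(2007, 5, 18, 9, 30)),
    ("bob", datetime(2007, 5, 18, 11, 45)),
    ("carol", datetime(2007, 5, 18, 13, 10)),  # after promotion, ignored
]
sp_ratios = {"alice": 0.5, "bob": 0.25, "carol": 0.75}

early = promoting_diggers(diggs, promoted_at)
print(early)  # ['alice', 'bob']
print(average_sp_ratio(early, sp_ratios))  # 0.375
```

The result would fill the “S/P ratio of diggers” column, which is still a question mark for every story.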

Building such a table is hard work and needs a lot of patience, so I wonder whether someone would like to continue it. Other parameters can be added to the table, but that requires close monitoring of what happens on Digg. One method could be to concentrate on one category first and analyze its stories to quickly identify the ones most likely to be promoted. Then, for each story, you have to look at the profiles of those who dugg it, the comments, the votes…

I will then update and extend the table based on comments and contributions. What do you think?

I wonder if someone has already thought about designing software to automate these tasks. Actually, I’m sure this has been done. I know it is forbidden, but it would give some very interesting information on the Digg algorithm. And we know that getting to the Digg homepage can bring thousands of visitors to your website.

It’s like being in the top Google search results for a keyword. Some know how and others don’t…

