This is a continuation of my experiments with the PageRank algorithm for small network systems. The figures below are from the Google PageRank explorer web app, which is here:
The figures show the surprising impact of adding a single link to the system. Page H, and even pages C, D, and E, have their PageRank cut by about 30%, with the difference taken up by pages I and J. The only change is a link added from page I to page J. And even though we add a hyperlink out of page I, its PageRank goes up.
Figure: Adding Just One Hyperlink to the System - from I to J - Has an Impact on the "Distant" H, and Increases the PageRank for I
Why would the PageRank for page I go up when we add a hyperlink out of it? Prior to adding the link, page I has no outlinks - it is a "dangling node" - and so it is forced to send an equal amount to every page in the system (the webapp above shows the details on all of the matrices involved). Once it has the link to page J in the bottom network, it sends much less to the rest of the pages in the system, because most of what it sends goes to page J, which in turn passes it right back to I. In conjunction with this, there is only a constant amount of PageRank to go around, so if one page gets more, the other pages must get less. Results can certainly be counterintuitive at first glance. And, in thinking about it, the very smallness of these networks may lead to results not seen when there are billions of pages. Only Google knows for sure.
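To make the dangling-node effect concrete, here is a minimal Python sketch of a PageRank power iteration. A few caveats: the four-page network is a made-up example, not the network in the figures; the damping factor of 0.85 and the even spreading of a dangling page's rank over every page are assumptions about how the web app handles things; and the name `pagerank` is just for illustration.

```python
def pagerank(links, damping=0.85, tol=1e-12, max_iter=1000):
    """Power iteration over a dict mapping each page to its list of outlinks.

    Assumption: a page with no outlinks (a dangling node) spreads its rank
    evenly over every page in the system, itself included.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(max_iter):
        # Every page starts each round with the "teleport" share.
        new = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links[p] or pages  # dangling node: send to all pages
            share = damping * rank[p] / len(targets)
            for t in targets:
                new[t] += share
        if sum(abs(new[p] - rank[p]) for p in pages) < tol:
            return new
        rank = new
    return rank


# Hypothetical toy network (NOT the one in the figures): I starts out dangling.
before = {"H": ["C"], "C": ["H", "I"], "I": [], "J": ["I"]}
# The only change: add a link from I to J.
after = dict(before, I=["J"])

rank_before = pagerank(before)
rank_after = pagerank(after)

for page in sorted(before):
    print(page, round(rank_before[page], 4), round(rank_after[page], 4))
```

In this toy network, the rank for I rises and the rank for H falls once the link from I to J is added, while the total across all pages stays at 1.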
Please feel free to let me know if you find otherwise with the examples above.