Tuesday 17 December 2013

Wikipedia - Socially constructed knowledge


Social networks turn up in all sorts of places. For the Coursera course on Social Network Analysis I had to investigate a social network. I chose Wikipedia because an article page includes links to other article pages. Those links are a social construct: in theory anybody can edit any page, so the links reflect a social view of the structure of the knowledge in the encyclopedia.

I used a Python program (based on one from the book Mining the Social Web) to do a breadth-first search of pages, following links from a named starting page. The resulting graph was displayed using Gephi.
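The crawl described above can be sketched roughly as follows. This is not the program I ran, just a minimal illustration of a depth-limited breadth-first search. The names crawl_links, get_links, TOY_LINKS and edges_to_csv are made up for this sketch, and the toy link table stands in for real Wikipedia pages; a real version would replace toy_get_links with something that fetches the links of a page. The CSV output assumes Gephi is given an edge list with Source and Target columns.

```python
import csv
import io
from collections import deque


def crawl_links(root, get_links, max_depth=2):
    """Breadth-first search over page links.

    Returns a list of (source, target) edges found by expanding
    every page that lies closer to the root than max_depth.
    """
    edges = []
    depth = {root: 0}          # pages seen so far and their distance from the root
    queue = deque([root])
    while queue:
        page = queue.popleft()
        if depth[page] >= max_depth:
            continue           # pages at the depth limit are not expanded
        for target in get_links(page):
            edges.append((page, target))
            if target not in depth:
                depth[target] = depth[page] + 1
                queue.append(target)
    return edges


def edges_to_csv(edges):
    """Write the edges as CSV text with Source,Target headers (assumed Gephi edge list)."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    writer.writerows(edges)
    return out.getvalue()


# Hypothetical toy link table standing in for real Wikipedia pages.
TOY_LINKS = {
    "A": ["B", "C"],
    "B": ["C", "D"],
    "C": ["A"],
    "D": ["E"],
}


def toy_get_links(page):
    """Return the outgoing links of a page in the toy table."""
    return TOY_LINKS.get(page, [])


edges = crawl_links("A", toy_get_links, max_depth=2)
csv_text = edges_to_csv(edges)
```

With a depth of 2, page D is reached but not expanded, so its link to E never appears in the edge list. That is why so many pages at the outer edge of a crawl like this end up with a degree of 1.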

Social Network Analysis

Fig. 1 shows the graph obtained by following linked pages to a depth of 2 starting from the Social Network Analysis page. Each node represents a Wikipedia page. There is a root to this graph, the node corresponding to the title 'Social Network Analysis'. This has been coloured in red. It has 49 edges. There is clustering around some topics and some nodes that link these clusters.

Fig. 1 Graph of linked pages in Wikipedia starting from Social Network Analysis (in red)

Most nodes had a degree of 1, but some had much higher degrees. The nodes with a higher degree than Social Network Analysis, and the links between them, have been selected in Fig. 2. There are 9 heavily linked nodes, which are also linked to each other. There is a common and surprising thread linking 5 of these: Computer Surveillance, Mass Surveillance, Terrorism, Espionage and National Security Agency (NSA). All 5 are concerned with aspects of security. It seems that when people contribute to Wikipedia about SNA they are linking to pages about security. The page on the NSA is particularly notable, with the highest degree here of 251.

Fig. 2 Wikipedia pages linked to the Social Network Analysis page with the largest number of links
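I made the selection for Fig. 2 in Gephi, but the same idea can be sketched in a few lines of Python. This is only an illustration under assumptions: degree is treated here as undirected (each distinct pair of linked pages counts once for both pages), which may not match how the degree was counted for the figure, and the function names and toy edge list are invented for the example.

```python
from collections import Counter


def degrees(edges):
    """Undirected degree: each distinct linked pair adds one to both endpoints."""
    pairs = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    counts = Counter()
    for pair in pairs:
        for node in pair:
            counts[node] += 1
    return counts


def hubs_above_root(edges, root):
    """Return nodes with a higher degree than the root, and the edges among them."""
    deg = degrees(edges)
    hubs = {node for node, d in deg.items() if d > deg[root]}
    hub_edges = [(s, t) for s, t in edges if s in hubs and t in hubs]
    return hubs, hub_edges


# Hypothetical toy edge list; "root" plays the part of the starting page.
toy_edges = [
    ("root", "x"), ("root", "y"),
    ("x", "y"), ("x", "a"), ("x", "b"),
    ("y", "c"), ("y", "d"), ("y", "e"),
]

hubs, hub_edges = hubs_above_root(toy_edges, "root")
```

In the toy data the root has degree 2, while x and y have degrees 4 and 5, so they are the only nodes kept, along with the single link between them.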

Art Deco Ceramics

Just to show how useful these techniques can be, I applied them to a very different topic. I have an interest in a type of 1930s pottery produced in England at the Shelley Potteries. So I ran the same process starting at the Shelley Potteries Wikipedia page and got Fig. 3. This shows clearly how Shelley pottery was influenced by two main movements: Art Nouveau and Art Deco.

Fig. 3 Links from the Shelley Potteries Wikipedia page

Conclusion

With relatively simple techniques it's possible to mine the structures built into the Wikipedia links and reveal graphically the relationships between topics.