I’m just starting to have discussions with people on developing visualisations of scholarly digital footprints, and exploring the existing space to work with. There’s a huge range of systems and work in progress, and a lot of APIs to look at. This post is really just a collection of resources and ideas that could be important, categorised to make things a bit more coherent.
Potential Ideas for Representations:
Individual Network: A person’s individual network of followers and how they interact with it is a second topic of particular interest. In this case, there are obvious aspects like the number of followers, accounts followed, tweets made etc. that can be analysed and represented. This can, in a very basic way, identify those who are more prolific digital scholars using one service or another. Content analysis similar to that done by TweetPsych could provide more detail as to what purpose the person’s tweets have, while retweets or a proliferation of terms between a user and their followers could suggest forms of impact. Combining an individual perspective and that of a community (defined by shared followers, terms used in personal descriptions etc.), it would be possible to present a person with a set of clusters representing the communities they appear to be a part of on Twitter or other services.
Communities: It would be ideal to get an idea of what kinds of ‘communities’ there are on digital networks, or what this concept might mean in terms of these systems. As I’ve considered this issue, the question of what a community is and why it matters has come to mind. A dictionary definition talks about locality, a group with a common interest or forming a sub-section of society, and finally, “sharing, participation, and fellowship”.
It is possible to use follower and followed connections to see how two or more people are linked, but this doesn’t necessarily identify a ‘community’ in the normal sense of the word: I might follow someone on Twitter in a very passive way, or a colleague and I might both follow guardiantech or Stephen Fry, but such a connection doesn’t really qualify as being part of a ‘community’. In terms of the above definition, you could argue that there is a shared common interest, but unless there is two-way interaction, it does not fit with the concepts of sharing, participation and fellowship. So two-way connections linking multiple people (following and being followed by each other, and sharing connections with other users in the same way) are probably a better starting point for distinguishing people who may form a shared community.
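As a rough illustration of the two-way connection idea, here is a minimal Python sketch that keeps only reciprocal follows. The `following` dictionary and the account names in it are made up for the example; in practice the data would come from whichever service API is being used.

```python
def mutual_ties(following):
    """Return sorted pairs (a, b) where a follows b AND b follows a.

    `following` maps each account name to the set of accounts it follows.
    """
    ties = set()
    for user, followed in following.items():
        for other in followed:
            # Keep the link only if it is reciprocated.
            if other != user and user in following.get(other, set()):
                ties.add(tuple(sorted((user, other))))
    return sorted(ties)


# Hypothetical data: everyone follows guardiantech, but it follows nobody back.
following = {
    "alice": {"bob", "guardiantech"},
    "bob": {"alice", "carol", "guardiantech"},
    "carol": {"bob", "guardiantech"},
    "guardiantech": set(),
}

ties = mutual_ties(following)
```

Here the shared, one-way follows of guardiantech drop out, and only the alice/bob and bob/carol pairs remain as candidate community links.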
In terms of shared interests, it is also possible to search for a particular term, and see those who mention it, either in their tweets or in their biographical description. It could be assumed that people who mention the Open University in their personal description have some relationship with it, but this could be anything from joining as a student, to watching a TV show, to being a visiting academic or an employee. It’s a start towards getting a data set of an existing community though.
Beyond using Twitter as an example, it would be possible to compare activity with other systems, perhaps comparing it to something like Yammer, which works as a closed network for an organisation like the OU. Of course, the systems have different functionality and constraints, but some interesting patterns could emerge. Using Yammer certainly feels closer to the ideal of a community, as there is a pre-existing bond and shared interests as members of the OU, plus in many cases a proximity.
A Collection of Scholars’ Accounts: There is no reason why it wouldn’t be possible to collect a set of scholars’ accounts on services such as Twitter, or use biographical information to identify them automatically. This could then be used for averaging or comparing tasks (e.g. as an individual scholar using Twitter, how do you compare or contrast to others or the average). It would be important to try and get a good range of scholars through some method, otherwise we could just end up with scholars in our own disciplines. Another issue is authentication: in some cases this gives us a great deal more data.
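To make the averaging and comparing idea a little more concrete, here is a small Python sketch. The account names, the metrics and all the numbers are invented for illustration; a real collection would need to be gathered from the services themselves.

```python
from statistics import mean


def compare_to_average(accounts, name, metric):
    """Compare one account's value for `metric` with the mean over all accounts."""
    average = mean(stats[metric] for stats in accounts.values())
    own = accounts[name][metric]
    return {"own": own, "average": average, "difference": own - average}


# Hypothetical collection of scholars' accounts.
accounts = {
    "scholar_a": {"followers": 120, "tweets": 900},
    "scholar_b": {"followers": 300, "tweets": 150},
    "scholar_c": {"followers": 180, "tweets": 450},
}

result = compare_to_average(accounts, "scholar_a", "followers")
```

A mean is only one possible baseline; with a skewed sample (a few very heavily followed accounts, say) a median or a per-discipline comparison might be more informative.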
Existing Inspirations and related Blogs:
There is a mass of relevant systems and blog posts out there, including:
Tony Hirst’s blog, particularly posts such as this one on network analysis of hashtags in Twitter and this one on Gephi visualisations and filters.
Rebecca Ferguson’s blog on learning analytics.
Data Sources and APIs:
Data Crunching and Visualisation Platforms:
Yahoo Pipes is a consistent feature in a lot of work to use data collected from the various networks.
Gephi is an application for producing visualisations from data sets, particularly for network analysis. Here is a useful page about the GDF format Gephi uses for node and edge data. It looks fairly straightforward to produce these files from code (e.g. see link in comment from Tony Hirst below). Also, there is the Gephi Toolkit, a Java library that could be used in combination with Twitter4J or other tools. The Java Universal Network/Graph Framework (JUNG) looks like a useful tool for putting interactive visualisations into web browsers through an applet or servlet.
NodeXL is another system for this purpose that acts as a plug-in to MS Excel.
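As a sketch of how GDF output for Gephi might be produced from code, the Python below writes a node section followed by an edge section. It uses only the basic `name`, `label`, `node1`, `node2` and `directed` columns as I understand them from the GDF description, assumes names and labels contain no commas or quotes, and the accounts are made up; check the format page before relying on it.

```python
def to_gdf(nodes, edges):
    """Build GDF text from nodes (name -> label) and directed edges (source, target).

    Assumes names and labels contain no commas or quotes, so no escaping is done.
    """
    lines = ["nodedef>name VARCHAR,label VARCHAR"]
    for name, label in nodes.items():
        lines.append(f"{name},{label}")
    lines.append("edgedef>node1 VARCHAR,node2 VARCHAR,directed BOOLEAN")
    for source, target in edges:
        # Each follow is a directed edge from follower to followed.
        lines.append(f"{source},{target},true")
    return "\n".join(lines) + "\n"


# Hypothetical follower data.
nodes = {"alice": "Alice", "bob": "Bob"}
edges = [("alice", "bob"), ("bob", "alice")]

gdf_text = to_gdf(nodes, edges)

# Writing it out gives a file that can be opened in Gephi.
with open("network.gdf", "w", encoding="utf-8") as handle:
    handle.write(gdf_text)
```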