NodeXL is an open-source template for Excel that generates network graphs. The exercises below use the free version, NodeXL Basic, but there is also a Pro version that gathers more rows of data from Twitter and can import from other social media platforms. We'll first discuss using it to import data from Twitter, since Twitter can be an interesting source for network analysis, but you can also use it to graph data that you have gathered in other ways. It lets you customize the color and shape of the nodes, and it includes analysis tools that group nodes based on the patterns of connection they exhibit. If you want a graph that requires more customization than Palladio offers, but doesn't need the level of complexity that a program like Gephi offers, then NodeXL can help you create that graph.
A common use for network graphs is to analyze social networks, or how groups of people interact and are connected. Twitter is sometimes mined for these connections between people: who is retweeting whom, who is using a given hashtag, and how those users are connected to each other, directly or indirectly.
With this exercise you'll find out how to pull data from Twitter on a previous viral trend, "Laurel/Yanni", the audio clip where some hear a voice saying "Laurel" and others hear "Yanni". By importing Twitter data, you'll see how that trend spread, in a smaller section of Twitter at least, by searching for tweets that use those words and link to a video. It's important to note that you won't get exactly the same data as me, because of the limitations of NodeXL's basic Twitter import functions, but you'll learn the principles of importing and visualizing Twitter data so that you can use them for your own projects.
NodeXL's menu shows a number of different sources you can import from, but some of them are only available to Pro users. In this instance you'll be using the import function for Twitter. It's crucial to note that you can gather only about 2,000 tweets at one time, and that Twitter's API limits you to about one query every 15 minutes, so choose your search terms carefully to get the information you want on your first try.
You'll be using this import function to gather tweets containing the words laurel and yanni that link to a video in an attempt to see how that viral challenge spread.
What you get after importing the tweets amounts to a table. To turn it into something you can look at and analyze, you'll want to graph it. Because the graph will represent a lot of individual tweets and users, you'll want the different properties of those users and connections to be indicated visually. With the instructions below you'll calculate which nodes are connected to the most other nodes (meaning they tweeted to the most other users or were retweeted by the most other users), and color them accordingly.
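To make the idea of "connected to the most other nodes" concrete, here is a minimal Python sketch that counts connections in a small directed edge list. It is not how NodeXL computes its metrics internally; the user names, the `edges` list, and the `degree_counts` helper are all made up for illustration.

```python
from collections import Counter

# Hypothetical directed edges: (source user, target user).
# An edge means the source mentioned, replied to, or retweeted the target.
edges = [
    ("alice", "bob"),
    ("carol", "bob"),
    ("dave", "bob"),
    ("bob", "alice"),
    ("carol", "alice"),
]

def degree_counts(edge_list):
    """Return (in_degree, out_degree) Counters for a directed edge list."""
    in_degree = Counter()
    out_degree = Counter()
    for source, target in edge_list:
        out_degree[source] += 1  # the source reached out to someone
        in_degree[target] += 1   # the target was referenced by someone
    return in_degree, out_degree

in_degree, out_degree = degree_counts(edges)

# The user referenced by the most edges, with that count.
most_referenced = in_degree.most_common(1)[0]
print(most_referenced)
```

In this toy data, the user with the highest in-degree is the one a graph would highlight as the most retweeted or mentioned.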
Now that you've set up some symbology indicating which nodes belong to which category, you can create the graph. At the top of the sheet is a window called Document Actions, which by default shows information about NodeXL. This is where your graph will be displayed.
If the graph doesn't make it clear which tweets are most densely related to each other, you can use NodeXL's analysis mechanisms to figure out which nodes share commonalities in who they are connected to.
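One simple way to think about "commonalities in who they are connected to" is to compare the neighbor sets of two nodes. The sketch below uses Jaccard similarity (shared neighbors divided by all neighbors of either node). This is only an illustration of the idea, not the grouping algorithm NodeXL uses, and the user names and helper functions are hypothetical.

```python
from collections import defaultdict

# Hypothetical undirected edges between users.
edges = [
    ("alice", "bob"),
    ("alice", "carol"),
    ("dave", "bob"),
    ("dave", "carol"),
    ("erin", "frank"),
]

def neighbor_sets(edge_list):
    """Map each node to the set of nodes it is directly connected to."""
    neighbors = defaultdict(set)
    for a, b in edge_list:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors

def jaccard(neighbors, a, b):
    """Share of neighbors that a and b have in common (0.0 to 1.0)."""
    union = neighbors[a] | neighbors[b]
    if not union:
        return 0.0
    return len(neighbors[a] & neighbors[b]) / len(union)

neighbors = neighbor_sets(edges)
print(jaccard(neighbors, "alice", "dave"))  # both connect to bob and carol
print(jaccard(neighbors, "alice", "erin"))  # no neighbors in common
```

Nodes with a high score are connected to much the same accounts, which is the kind of pattern a grouping tool looks for.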
NodeXL is a relatively simple and free way to play around with analyzing Twitter networks.
NodeXL can be used for more than Twitter analysis. It's able to make graphs out of data you input manually and it allows you to do a lot of customization. In this lesson you'll create a NodeXL graph from previously created data about the interactions in the Shakespeare play A Midsummer Night's Dream. You'll customize the graph to display the information about the characters, the time scale and other information by using different colors and sizes for different categories.
If you've already created the A Midsummer Night's Dream dataset in the previous tutorial for Palladio, you can use it here. Otherwise, the link to that data is below.
The sheet you created that contains the links, or connections, between the different characters is what you'll be adding to the Edges section of the sheet.
Since you've entered the edges manually, the vertices won't automatically transfer over the way they did when you imported your Twitter network. You'll have to add them in, but fortunately, you've already got them in the Nodes sheet in your Midsummer Night's Dream data.
With this amount of information, you can actually already make a graph. At the top of your sheet is a Document Actions window that is currently displaying information about NodeXL.
With a well-made graph though, you'll want your viewer to be able to look at your graph and figure out what it is indicating without having to resort to looking at the accompanying table. You can assist in this by adding labels and colors that represent different information about the nodes and the links for them.
From this new graph you can see which nodes seem to have a lot of connections to other nodes. But because edges between the same two nodes are drawn on top of each other, it's difficult to convey just how many connections go into each vertex. Whether Oberon is on stage with Puck in all acts or just once in the whole play, there is only one line between them, which makes it harder to see how strongly a node is connected. This is somewhat resolvable in the next step, where you'll calculate each node's degree (how many connections are made to it), which is another thing you'll be able to visualize.
Not all of a graph's metrics can be calculated in NodeXL Basic, but one of the more basic and useful ones is degree. A vertex's (or node's) degree is determined by how many links are connected to it. You'll be using NodeXL's metrics function to calculate the degree of each of your nodes.
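As a rough illustration of what degree measures, the Python sketch below counts how many distinct characters each character is linked to. The rows are invented sample data rather than the actual dataset, and counting a repeated pair only once is an assumption made for this example; NodeXL's own handling of duplicate edges may differ.

```python
from collections import defaultdict

# Hypothetical edge rows: (character, character, act they share).
edges = [
    ("Oberon", "Puck", 2),
    ("Oberon", "Puck", 3),     # same pair again, in a different act
    ("Oberon", "Titania", 2),
    ("Titania", "Bottom", 3),
    ("Oberon", "Bottom", 4),
]

def degrees(edge_rows):
    """Count distinct neighbors per node, ignoring repeated pairs."""
    neighbors = defaultdict(set)
    for a, b, _act in edge_rows:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {name: len(others) for name, others in neighbors.items()}

degree = degrees(edges)
for name in sorted(degree):
    print(name, degree[name])
```

Because the neighbors are stored in sets, the two Oberon and Puck rows contribute a single connection, mirroring the single line drawn between them on the graph.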
If your graph looks like mine, the Royals group is rather snarled and clustered so it's difficult to read the labels for the different edges, and harder to tell how many acts any two characters appear in together. You can change this by manually moving different vertices.
NodeXL doesn't have a function for creating multiple parallel edges. In other words, whether Bottom and Oberon are onstage together in five scenes or just one, there will be one connecting edge between them. Any attempt to color, say, Act I edges blue and Act III edges red will just mean that the edge appears red, with the blue edge hidden underneath it. If you want to display connections between nodes differently depending on some attribute, you'll need to create a series of different graphs. Fortunately, you can do this rather easily using filters, just as you would filter any other Microsoft Excel document.
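The filtering idea can be sketched outside of Excel as well. The Python example below splits a small edge table into one edge list per act, which is the same result that filtering the sheet by act and redrawing the graph would give you. The rows and the column names `source`, `target`, and `act` are assumptions for illustration and may not match the headers in your sheet.

```python
from collections import defaultdict

# Hypothetical edge rows, one per pair of characters per act.
edges = [
    {"source": "Theseus", "target": "Hippolyta", "act": 1},
    {"source": "Oberon", "target": "Puck", "act": 2},
    {"source": "Oberon", "target": "Titania", "act": 2},
    {"source": "Oberon", "target": "Puck", "act": 3},
    {"source": "Titania", "target": "Bottom", "act": 3},
]

def edges_by_act(edge_rows):
    """Group (source, target) pairs under the act they belong to."""
    grouped = defaultdict(list)
    for row in edge_rows:
        grouped[row["act"]].append((row["source"], row["target"]))
    return dict(grouped)

per_act = edges_by_act(edges)
for act in sorted(per_act):
    print(f"Act {act}: {per_act[act]}")
```

Each list in `per_act` corresponds to one filtered view of the sheet, and so to one of the separate graphs you would create.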
While you can save the spreadsheet you've created, along with the settings used on the graph, if you want to use the graph as an illustration, you'll need to take a screenshot of it, since there isn't a way to save the graph as an image file.
If you want to see how my graph was composed, please see the sheet below, which contains the graph, and the JPEGs that show the different graphs for each act of the play.