20:06 PM

Visualizing CEPA Education Data, Part 2

A couple of months back, I wrote about investigating CEPA academic achievement data, provided through the CEPA project at Stanford University (Sean F. Reardon, Demetra Kalogrides, Andrew Ho, Ben Shear, Kenneth Shores, Erin Fahle. (2016). Stanford Education Data Archive. http://purl.stanford.edu/db586ns4974). Finally, I've got part two, wherein I use Exploratory, the powerful R-based tool for data wrangling, analysis, and visualization. Exploratory opens up R to folks like me who love its power but have not been immersed in the oftentimes complex world of R coding. The Exploratory front end makes using R a pleasure, as I hope this post will help illustrate.

Exploratory takes the essential R dataframe as a starting point, and then allows you to easily manipulate your data using a multitude of R packages included with the base Exploratory install. This post will walk through some simple analysis using CEPA data within Exploratory. We'll begin by setting the stage with a screenshot of the Exploratory workspace.

On the left side are the dataframes belonging to this project, while the center is dedicated to the primary workspace where all tables and charts will be viewed. Off to the right is where any actions are stored - filters, functions, and so on that act upon the dataframe. In this case, we're viewing the Summary tab for a particular dataframe. We can also look at a tabular view of the data by clicking the Table icon:

Finally, we have the option to see our data in visual form by selecting the Viz tab icon:

Read More

13:57 PM

2016 Detroit Jazz Fest Moods Network

One of the great events of the summer in Detroit is the annual Detroit Jazz Festival, an epic event in the jazz world, with many of the world's foremost musicians convening in Detroit for an entirely free set of performances. It is in fact the world's largest free jazz festival, and may well be the best jazz weekend regardless of price.

For 2016, the festival plays host to the likes of the legendary bassist Ron Carter, Brad Mehldau, John Scofield, Randy Weston, Billy Harper, and a host of other musicians both international and local. So I thought it fitting to blend my love of jazz with my affection for network graphs, by using user tags from the All Music website. These tags are labels given to each musician based on listener perceptions of their work, and provide interesting information to use in building a graph.

The initial graph creation was done in Gephi, followed by deployment using sigma.js, which allows us to use the web to probe and explore the graph to find interesting patterns in the data. The Force Atlas 2 algorithm was used to create the layout, with the nodes colored based on their modularity class, a form of clustering that groups nodes more densely connected to one another than to the rest of the network. When clustering works very well, nodes of the same color will stand apart from other color groups; in this instance, we are partially successful in this regard. We'll learn more about this shortly.
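For readers curious about what a modularity class measures, here is a minimal Python sketch of the modularity score for a given grouping of nodes. It is not Gephi's implementation, and the function name, node labels, and edges are all hypothetical; the toy graph is two triangles joined by a single edge.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity Q of an undirected, unweighted graph for a given grouping.

    edges: list of (u, v) pairs (hypothetical toy data, not festival data)
    community: dict mapping each node to its group label
    """
    m = len(edges)
    intra = defaultdict(int)       # edges with both ends in the same group
    degree_sum = defaultdict(int)  # total degree of the nodes in each group
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            intra[community[u]] += 1
    # Q = sum over groups of (share of edges inside the group)
    #     minus (expected share if edges were placed at random)
    return sum(intra[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Two triangles (a, b, c) and (d, e, f) joined by the edge c-d.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
merged = {node: 0 for node in split}

q_split = modularity(edges, split)
q_merged = modularity(edges, merged)
```

Splitting the toy graph into its two triangles scores higher than lumping every node into one group, which is the sense in which well-separated color groups indicate a successful clustering.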

For those who want to interact with the network and draw their own conclusions, here you are:


Let's start with a view of the entire network:

Jazz Moods Network

Read More

14:45 PM

Visualizing CEPA Education Data, Part 1

In this post, we'll begin walking through the massive educational achievement dataset provided through the CEPA project at Stanford University (Sean F. Reardon, Demetra Kalogrides, Andrew Ho, Ben Shear, Kenneth Shores, Erin Fahle. (2016). Stanford Education Data Archive. http://purl.stanford.edu/db586ns4974). This archive provides a wealth of data on educational achievement across grade levels and academic years, and is supported with a vast array of socioeconomic indicators that can be used for deeper analysis.

Our initial steps to analyze and visualize the data will begin with Microsoft Excel for data prep (use your tool of choice) before moving on to Exploratory and Trelliscope for visual analysis of the data. Both of these tools are built on R, the open source statistical framework that will facilitate multiple analytical paths. Exploratory has its own GUI that uses many of R's most powerful analytic packages, while Trelliscope will be employed from within RStudio.

Here's a link to Exploratory:

and Trelliscope:

Read More

15:06 PM

Police Killings: A Fact-Based Analysis

In the wake of another unnecessary death of a black man at the hands of a white cop, and the equally gutless killings carried out by snipers at a peaceful demonstration, it's time to take a step back and look at some facts. Emotions are certainly running high on all sides of this issue, as one would expect. However, by their very nature, emotional responses invariably focus on symptoms, rather than on the root causes that create the longer-term issues. If we fail to look at the underlying facts, all we'll get are knee-jerk political responses that never address the real problem. This is the very nature of political activities; if voters were to examine the root causes we would find politics and politicians as the source of virtually all societal ills, including this especially inflammatory one.

Read More

20:45 PM

Who Finances the Candidates? Part 1

Revisiting a recurring theme, it's time to examine some more data from the Federal Election Commission (FEC), specifically revolving around the 2016 US presidential campaign cycle. Our goal is to shine a light on which political committees are donating to the campaigns of the various candidates, and to gain a better understanding of the dynamics of campaign finance. Using Gephi and Sigma.js as my platforms, I've built a highly interactive network to facilitate further exploration of the contribution patterns covering a period from January 2015 through February 2016. This type of network is commonly known as bipartite, wherein there are two main categories that connect to each other, but not to their own type. Here we will have committees and candidates connected, but not committee to committee or candidate to candidate.
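To make the bipartite idea concrete, here is a minimal Python sketch that checks whether every edge joins a committee to a candidate. The function name, node names, and edges are placeholders invented for illustration, not FEC data.

```python
def is_bipartite_by_type(edges, node_type):
    """Return True when every edge connects nodes of two different types."""
    return all(node_type[u] != node_type[v] for u, v in edges)


# Hypothetical nodes: two committees and two candidates.
node_type = {
    "committee_1": "committee",
    "committee_2": "committee",
    "candidate_A": "candidate",
    "candidate_B": "candidate",
}

# Contributions run from committees to candidates only.
contributions = [
    ("committee_1", "candidate_A"),
    ("committee_1", "candidate_B"),
    ("committee_2", "candidate_A"),
]

# Adding a committee-to-committee link breaks the bipartite structure.
with_transfer = contributions + [("committee_1", "committee_2")]

ok = is_bipartite_by_type(contributions, node_type)
broken = is_bipartite_by_type(with_transfer, node_type)
```

The first edge list passes the check, while the second fails because it links two nodes of the same type.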

In this piece, we'll view selected patterns within the network that I find of particular interest, leaving the rest for you the reader to explore further. This article's focus will be on the Democratic contenders, Hillary Clinton and Bernie Sanders.

Let's start with a snapshot of the entire network, with the candidates depicted in blue (Democrat) or red (Republican) shades:

full network

Read More
