13:57 PM

2016 Detroit Jazz Fest Moods Network

One of the great events of the summer in Detroit is the annual Detroit Jazz Festival, an epic event in the jazz world, with many of the world's foremost musicians convening in Detroit for an entirely free set of performances. It is in fact the world's largest free jazz festival, and may well be the best jazz weekend regardless of price.

For 2016, the festival plays host to the likes of the legendary bassist Ron Carter, Brad Mehldau, John Scofield, Randy Weston, Billy Harper, and a host of other musicians both international and local. So I thought it fitting to blend my love of jazz with my affection for network graphs, by using user tags from the All Music website. These tags are labels given to each musician based on listener perceptions of their work, and provide interesting information to use in building a graph.
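One way tags could be turned into graph edges is sketched below. This is only an illustration under stated assumptions: the musician names and tags are hypothetical placeholders, not real All Music data, and the choice to link two musicians whenever they share a tag is an assumption, since the post does not say how the edges were defined.

```python
from itertools import combinations

# Hypothetical placeholder data: these are NOT real All Music tags.
musician_tags = {
    "Musician A": {"Sophisticated", "Warm", "Reflective"},
    "Musician B": {"Reflective", "Intimate", "Warm"},
    "Musician C": {"Energetic", "Playful"},
    "Musician D": {"Playful", "Warm"},
}


def shared_tag_edges(tags_by_musician):
    """Map each pair of musicians sharing at least one tag to the shared-tag count."""
    edges = {}
    for a, b in combinations(sorted(tags_by_musician), 2):
        shared = tags_by_musician[a] & tags_by_musician[b]
        if shared:
            # The edge weight is the number of tags the two musicians have in common.
            edges[(a, b)] = len(shared)
    return edges


edges = shared_tag_edges(musician_tags)
for (a, b), weight in sorted(edges.items()):
    print(f"{a} -- {b} (weight {weight})")
```

An edge list of this kind can be saved as a CSV file and imported into Gephi, where the weight column can drive edge thickness.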

The initial graph creation was done in Gephi, followed by deployment using sigma.js, which allows us to use the web to probe and explore the graph to find interesting patterns in the data. The Force Atlas 2 algorithm was used to create the layout, with the nodes colored based on their modularity class, a form of clustering that groups nodes that are more densely connected to each other than to the rest of the graph. When clustering works very well, nodes of the same color will stand apart from other color groups; in this instance, we are partially successful in this regard. We'll learn more about this shortly.
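To make the idea of a modularity score concrete, here is a minimal sketch that computes the standard modularity value Q for a given grouping of nodes. It only scores a partition that is handed to it; it does not search for communities the way Gephi does. The toy graph (two triangles joined by a single edge) and all names in it are made up for illustration.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs; community_of: dict mapping node -> community label.
    Q = sum over communities of (internal_edges / m) - (community_degree / (2 * m)) ** 2
    """
    m = len(edges)
    degree = defaultdict(int)
    internal = defaultdict(int)  # edges with both endpoints in the same community
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1
    degree_sum = defaultdict(int)  # total degree of the nodes in each community
    for node, k in degree.items():
        degree_sum[community_of[node]] += k
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum
    )


# Toy graph: two triangles (a, b, c) and (d, e, f) joined by the edge c-d.
toy_edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
by_triangle = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
mixed = {"a": 0, "d": 0, "e": 0, "b": 1, "c": 1, "f": 1}

q_triangles = modularity(toy_edges, by_triangle)
q_mixed = modularity(toy_edges, mixed)
print(f"grouped by triangle: {q_triangles:.3f}")
print(f"mixed grouping:      {q_mixed:.3f}")
```

Grouping the nodes by triangle scores higher than the mixed grouping, which is the numerical counterpart of color groups standing apart in the layout.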

For those of you who want to interact with the network and draw your own conclusions, here you are:


Let's start with a view of the entire network:

[Figure: Jazz Moods Network]


14:45 PM

Visualizing CEPA Education Data, Part 1

In this post, we'll begin walking through the massive educational achievement dataset provided through the CEPA project at Stanford University (Sean F. Reardon, Demetra Kalogrides, Andrew Ho, Ben Shear, Kenneth Shores, Erin Fahle. (2016). Stanford Education Data Archive. http://purl.stanford.edu/db586ns4974). This archive provides a wealth of data on educational achievement across grade levels and academic years, and is supported with a vast array of socioeconomic indicators that can be used for deeper analysis.

Our initial steps to analyze and visualize the data will begin with Microsoft Excel for data prep (use your tool of choice) before moving on to Exploratory and Trelliscope for visual analysis of the data. Each of these tools is based on R, the powerful open source statistical framework that will facilitate multiple analytical paths. Exploratory has its own GUI that uses many of R's most powerful analytic packages, while Trelliscope will be employed from within RStudio.

Here's a link to Exploratory:

and Trelliscope:


15:06 PM

Police Killings: A Fact-Based Analysis

In the wake of another unnecessary death of a black man at the hands of a white cop, and the equally gutless killings carried out by snipers at a peaceful demonstration, it's time to take a step back and look at some facts. Emotions are certainly running high on all sides of this issue, as one would expect. However, by their very nature, emotional responses invariably focus on symptoms, rather than on the root causes that create the longer-term issues. If we fail to look at the underlying facts, all we'll get are knee-jerk political responses that never address the real problem. This is the very nature of political activities; if voters were to examine the root causes we would find politics and politicians as the source of virtually all societal ills, including this especially inflammatory one.


20:45 PM

Who Finances the Candidates? Part 1

Revisiting a recurring theme, it's time to examine some more data from the Federal Election Commission (FEC), specifically revolving around the 2016 US presidential campaign cycle. Our goal is to shine a light on which political committees are donating to the campaigns of the various candidates, and to gain a better understanding of the dynamics of campaign finance. Using Gephi and Sigma.js as my platforms, I've built a highly interactive network to facilitate further exploration of the contribution patterns covering a period from January 2015 through February 2016. This type of network is commonly known as bipartite: there are two categories of nodes, and connections run only between the categories, never within one. Here we will have committees and candidates connected, but not committee to committee or candidate to candidate.
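The bipartite property described above can be checked with a few lines of code. The following is a minimal sketch, assuming contributions arrive as (committee, candidate, amount) records; the committee names, candidate names, and amounts are hypothetical placeholders rather than FEC data.

```python
# Hypothetical placeholder records: (committee, candidate, amount in USD).
contributions = [
    ("Committee A", "Candidate X", 5000),
    ("Committee B", "Candidate X", 2500),
    ("Committee B", "Candidate Y", 1000),
    ("Committee C", "Candidate Y", 2700),
]

committees = {committee for committee, _, _ in contributions}
candidates = {candidate for _, candidate, _ in contributions}
edges = [(committee, candidate) for committee, candidate, _ in contributions]


def is_bipartite_edge_list(edge_list, left, right):
    """True if left and right are disjoint and every edge joins one side to the other."""
    if left & right:
        return False
    return all(
        (u in left and v in right) or (u in right and v in left)
        for u, v in edge_list
    )


print(is_bipartite_edge_list(edges, committees, candidates))

# A committee-to-committee edge breaks the bipartite structure.
with_bad_edge = edges + [("Committee A", "Committee B")]
print(is_bipartite_edge_list(with_bad_edge, committees, candidates))
```

A check like this is a cheap sanity test to run before loading an edge list into Gephi, since a stray same-type edge usually points to a data-prep mistake.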

In this piece, we'll view selected patterns within the network that I find of particular interest, leaving the rest for you, the reader, to explore further. This article's focus will be on the Democratic contenders, Hillary Clinton and Bernie Sanders.

Let's start with a snapshot of the entire network, with the candidates depicted in blue (Democrat) or red (Republican) shades:

[Figure: full network]


16:49 PM

Candidate Contribution Patterns

As the 2016 election season trudges inexorably toward a November climax, it might be instructive to learn more about all of the candidates, both those who have withdrawn and the remaining hopefuls. An interesting way to do this is to ignore all the debates, talking points, and public pronouncements, and instead focus on the campaign contribution patterns of each candidate. Using data from the Federal Election Commission (FEC, http://www.fec.gov/disclosurep/pnational.do), we can observe and analyze patterns within the contribution filings. This will enable deeper insight into who is funding the campaigns, at least at the visible, public level, if not in the somewhat murkier world of PACs and other organizational entities.

To provide insight into this data, we'll work with a candidate dashboard using Tableau Public. With this approach, not only can I begin to draw some conclusions about the candidates and their supporters, but others can also dive into the data and detect underlying patterns. In this article, I will first provide a link to this dashboard, allowing readers to investigate the data on their own, and then we will look at a variety of excerpts from the dashboard that call out some of the important patterns in the data.

