Watts, Duncan – The organizational spectroscope – Medium 20160401

For several decades sociologists have speculated that the performance of firms and other organizations depends as much on the networks of information flow between employees as on the formal structure of the organization [1, 2].

This argument makes intuitive sense, but until recently it has been extremely difficult to test using data.

Historically, employee data has been collected mostly in the form of surveys, which are still the gold standard for assessing opinions but reveal little about behavior such as who talks to whom. Surveys are also expensive and time-consuming to conduct, so they are unsuitable for frequent and comprehensive snapshots of the state of a large organization.

Thanks to the growing ubiquity of productivity software, however, this picture is beginning to change. Email logs, web-based calendars, and co-authorship of online documents all generate digital traces that can be used as proxies for social networks and their associated information flows. In turn, these network and activity data have the potential to shed new light on old questions about the performance of teams, divisions, and even entire organizations.

Recognizing this opportunity, my colleagues Jake Hofman, Christian Perez, Justin Rao, Amit Sharma, Hanna Wallach, and I — in collaboration with Office 365 and Microsoft’s HR Business Insights unit — have embarked on a long-term project: the Organizational Spectroscope.

The Organizational Spectroscope combines digital communication data, such as email metadata (e.g., time stamps and headers), with more traditional data sources, such as job titles, office locations, and employee satisfaction surveys. These data sources are combined only in ways that respect privacy and ethical considerations. We then use a variety of statistical modeling techniques to predict and explain outcomes of interest to employees, HR, and management.

Predicting team satisfaction

To illustrate the potential of these new data and methods, we analyzed the aggregate email activity patterns of teams of US-based Microsoft employees to predict their responses to an annual employee satisfaction survey. To protect individual employee privacy, only email metadata was used (i.e., no content) and all identifiers were encrypted. Email activity and survey responses were aggregated to the manager level, including only managers with at least five direct reports, and only these aggregated results were analyzed. Our predictions therefore apply only to teams of employees who share the same manager, not to individuals.
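The aggregation step described above can be sketched in a few lines of Python. This is only an illustration, not the project's actual pipeline: the record layout `(manager_id, employee_id, value)`, the function name `aggregate_by_manager`, and the use of a simple mean are all assumptions. Only the threshold of five direct reports comes from the text.

```python
from collections import defaultdict
from statistics import mean

MIN_DIRECT_REPORTS = 5  # threshold stated in the text


def aggregate_by_manager(records, min_reports=MIN_DIRECT_REPORTS):
    """Average per-employee values within each manager's team.

    `records` is an iterable of (manager_id, employee_id, value) tuples
    (a hypothetical layout); identifiers are assumed to be encrypted
    upstream. Teams with fewer than `min_reports` distinct employees
    are dropped, so no small team is ever reported on its own.
    """
    teams = defaultdict(dict)
    for manager_id, employee_id, value in records:
        teams[manager_id][employee_id] = value
    return {
        manager_id: mean(members.values())
        for manager_id, members in teams.items()
        if len(members) >= min_reports
    }


# Hypothetical toy data: manager "m1" has five reports, "m2" only two.
records = [("m1", f"e{i}", score) for i, score in enumerate([4, 5, 3, 4, 4])]
records += [("m2", "e5", 2), ("m2", "e6", 1)]
team_scores = aggregate_by_manager(records)
```

In this toy data only "m1" survives the filter, so individual responses from the two-person team never appear in the output.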

We focused on three survey questions: did teams have confidence in the overall effectiveness of their managers, did they think that different groups across the company collaborated effectively, and were they satisfied with their own work-life balance?

We started by examining the data and found that the vast majority of teams were pretty happy. Although this result is encouraging, as a practical matter HR managers are less interested in the large majority of happy teams than in identifying the small minority of unhappy teams. After all, it is the latter group on which HR needs to focus its resources. Rather than trying to predict the satisfaction level of every team, therefore, we focused on predicting just the teams in the bottom 15%, i.e., the least satisfied teams.
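Turning "predict the bottom 15%" into a classification target might look like the sketch below. The function name `bottom_fraction_labels`, the dictionary of team scores, and the handling of ties (simply by sort order) are assumptions made for illustration. The article does not say how the cut-off was computed.

```python
def bottom_fraction_labels(scores, fraction=0.15):
    """Mark the lowest-scoring `fraction` of teams as positives (True).

    `scores` maps a team id to its aggregated satisfaction score.
    At least one team is always flagged; ties at the cut-off are
    broken by sort order, which is an arbitrary choice.
    """
    n_positive = max(1, round(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get)  # least satisfied first
    flagged = set(ranked[:n_positive])
    return {team: team in flagged for team in scores}


# Hypothetical scores for 20 teams; 15% of 20 is 3 teams.
scores = {f"team{i:02d}": 50 + i for i in range(20)}
labels = bottom_fraction_labels(scores)
```

Framing the task this way makes it a binary classification problem with a rare positive class, which is why precision on the flagged teams is a natural measure of success.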

We considered two statistical models: first, a simple linear logistic regression model of the type that is widely used in quantitative social science; and second, a more complicated model from machine learning called a random forest [3]. Although random forests are generally less interpretable than standard regression models, making them less well suited to the explanation tasks typically found in the social sciences, they can capture nonlinearities and heterogeneous effects that linear models ignore and therefore often perform better at prediction tasks.

In our case the random forest performed much better: if it predicted that a team was in the bottom 15%, it was correct (across all three questions) between 80% and 93% of the time; in contrast, the linear model was correct at best 27% of the time. Critically, a “baseline” model that used only data on respondents’ position and level in the company (i.e., no email activity features) performed between 20 and 40 percentage points worse. In other words, the email activity data added large and significant value over and above the kind of data that HR managers already have (see Table 1).

Table 1. Precision of the random forest model (column 3) compared with a standard logistic model (column 1). Column 2 is for a random forest model that does not include email activity, but does include other features such as job category (engineer, product manager, sales, etc.) and level.
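The quantity reported in Table 1 is precision: of the teams a model flags as being in the bottom 15%, the fraction that really are. A minimal sketch of that calculation follows. The function name and the dictionaries of hypothetical team labels are illustrative only and do not reproduce any of the study's data.

```python
def precision(predicted, actual):
    """Fraction of predicted positives that are actual positives.

    Both arguments map a team id to True (bottom 15%) or False.
    Returns None when the model flags no team at all, since precision
    is undefined in that case.
    """
    flagged = [team for team, is_flagged in predicted.items() if is_flagged]
    if not flagged:
        return None
    hits = sum(1 for team in flagged if actual[team])
    return hits / len(flagged)


# Hypothetical labels: the model flags three teams, two of them correctly.
actual = {"a": True, "b": False, "c": True, "d": False, "e": False}
predicted = {"a": True, "b": True, "c": True, "d": False, "e": False}
p = precision(predicted, actual)  # 2 correct out of 3 flagged
```

Precision says nothing about unhappy teams the model fails to flag, so a fuller evaluation would typically report recall alongside it.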

Table 1 also shows the particular features of email activity that were most predictive of low satisfaction. For work-life balance, it was the fraction of emails sent out of working hours: more is worse. For managerial satisfaction, it was manager response time: slower is worse. And for perceptions of company-wide collaboration, it was the size of the manager’s email network: smaller is worse.
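As an illustration of how one such feature could be derived from email metadata, the sketch below computes the fraction of messages sent out of working hours from a list of send timestamps. The definition of working hours used here (Monday to Friday, 09:00 to 17:00 local time) is an assumption, as are the function names. The article does not state how working hours were defined.

```python
from datetime import datetime


def is_out_of_hours(ts, start_hour=9, end_hour=17):
    """True if `ts` falls on a weekend or outside [start_hour, end_hour).

    The Monday-to-Friday, 09:00-17:00 window is an assumed definition.
    """
    return ts.weekday() >= 5 or not (start_hour <= ts.hour < end_hour)


def out_of_hours_fraction(timestamps):
    """Share of messages sent out of hours; 0.0 for an empty list."""
    if not timestamps:
        return 0.0
    return sum(is_out_of_hours(ts) for ts in timestamps) / len(timestamps)


# Hypothetical send times, already converted to the sender's local time.
sent = [
    datetime(2016, 3, 28, 10, 15),  # Monday morning
    datetime(2016, 3, 28, 22, 5),   # Monday late evening
    datetime(2016, 3, 30, 14, 0),   # Wednesday afternoon
    datetime(2016, 4, 2, 11, 30),   # Saturday
]
fraction = out_of_hours_fraction(sent)
```

Because only timestamps are needed, a feature like this can be computed from headers alone, without reading any message content.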

At first glance these findings may seem unsurprising, but this reaction misses the point. To see why, consider work-life balance. Although it makes sense that sending an unusual volume of email outside of normal working hours would correspond to low satisfaction, it would have made equal sense that satisfaction was related to the overall volume of email sent or received, or to the relative distribution of email over days of the week. But none of these other factors were useful for predicting low satisfaction. The point, therefore, is not so much about finding results that are surprising and counterintuitive, but about ruling out all the plausible, intuitive explanations that are not in fact correct.

Another non-obvious finding is that different types of teams had different thresholds for what counted as a “bad” volume of out-of-hours email. The number of out-of-hours emails that predicted an unhappy sales team, for example, was different from that of an unhappy engineering team. Again, this result isn’t surprising (once you know it), but it would have been difficult to guess in advance. This result also highlights the advantages of using a complicated model over a simple one: although in general we believe that, all else equal, simple models are better, when effects are highly context-dependent, complex models can shine.

Lessons for managers and for science

Insights like these are of immediate interest to both employees and managers. In particular, because predictions based on email sending behavior can be made in real time, HR can obtain more timely feedback than surveys allow. Moreover, modern statistical modeling approaches such as ours can help managers in complex situations where many different factors could be at play, e.g., by showing which of many plausible explanations are supported by the evidence, and by cautioning against “one size fits all” solutions. Finally, employees could also benefit from tools that help them quantify their work activity in the same way that personal fitness trackers help them quantify physical activity.

More generally, our results show how the combination of novel sources of digital data and modern machine learning methods — that is, computational social science — can yield insights that would not be available with traditional data sources and methods. Over time, we hope to expand this approach from the specific case of predicting team satisfaction to a much wider range of questions regarding teams, divisions, and even entire organizations.

Finally, it is worth emphasizing that deriving these kinds of insights requires a lot of care. To perform our analysis, we combined three datasets (email activity, the org chart, and survey results) that were collected in different ways at different times by different people. Joining these datasets in a manner that respected privacy and ethical concerns required significant effort and cooperation across teams, which in turn required us to clearly specify, and justify, our substantive research questions and goals. Likewise, the realization that we needed to focus on only the least satisfied teams required us to think carefully about the structure of the data and about our research questions. For all the excitement about “big data,” in other words, computational social science works well only when powerful computation is matched with careful social science.

1. Burns, T. and G.M. Stalker, The management of innovation. 1961, London: Tavistock Publications. v, 269.
2. Lawrence, P.R. and J.W. Lorsch, Organization and environment; managing differentiation and integration. 1967, Boston: Division of Research Graduate School of Business Administration Harvard University. xv, 279.
3. Breiman, L., Random forests. Machine learning, 2001. 45(1): p. 5–32.

Science of networks

… at least enough to get us started!

A network or graph is a set of items, which we call vertices, with connections between them, called edges. You may also see a vertex being called a site (physics), a node (computer science), or an actor (sociology). An edge may also be called a bond (physics), a link (computer science), or a tie (sociology).

Basic network with 8 vertices and 10 edges

Studies of networks in the social sciences typically address issues of centrality (who is best connected to others or has the most influence) and connectivity (how individuals are linked to one another through the network).

A set of vertices joined by edges is the simplest type of network – there are many ways in which networks may be more complex. For instance, there may be more than one type of vertex in a network, or more than one type of edge.

In a social network of people, the vertices may represent men or women. Graphs composed of two types of vertices, with edges running only between unlike types, are called bipartite graphs (sometimes bigraphs). So-called affiliation networks, in which people are joined together by common membership of groups, take this form, the two types of vertices representing people and groups.

Edges may carry weights, representing (say) how well two people like each other; they may also be directed, pointing from one node to another. Directed edges are sometimes called arcs. Graphs composed of directed edges are sometimes called digraphs. Digraphs can be either cyclic – meaning they contain closed loops of edges – or acyclic – meaning they do not.
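To make the cyclic/acyclic distinction concrete, here is a small sketch that stores a weighted digraph as a dictionary of dictionaries and checks it for closed loops with a depth-first search. The representation and the function name `has_cycle` are choices made for this example only. The recursive search is suitable for small graphs.

```python
def has_cycle(arcs):
    """Return True if the digraph contains a closed loop of directed edges.

    `arcs` maps each vertex to a dict {successor: weight}. Weights are
    carried along but play no role in detecting cycles.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    vertices = set(arcs) | {w for succ in arcs.values() for w in succ}
    colour = {v: WHITE for v in vertices}

    def visit(v):
        colour[v] = GREY
        for w in arcs.get(v, {}):
            if colour[w] == GREY:  # arc back into the current path
                return True
            if colour[w] == WHITE and visit(w):
                return True
        colour[v] = BLACK
        return False

    return any(colour[v] == WHITE and visit(v) for v in vertices)


# Hypothetical weighted digraphs: a -> b -> c -> a closes a loop.
cyclic = {"a": {"b": 1.0}, "b": {"c": 0.5}, "c": {"a": 2.0}}
acyclic = {"a": {"b": 1.0, "c": 0.5}, "b": {"c": 2.0}}
```

Calling `has_cycle(cyclic)` returns True and `has_cycle(acyclic)` returns False. Note that the acyclic example still has two routes from a to c, which is not a directed cycle.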

Four basic types of networks

(a) undirected network with a single type of vertex and a single type of edge; (b) network with different types of vertices and edges; (c) network with weighted vertices and edges; (d) directed network. From M. E. J. Newman (2003), The structure and function of complex networks.

One can also have hyperedges, i.e., edges that join more than two vertices together. Graphs containing such edges are called hypergraphs. Hyperedges may be used, say, to represent family ties in a social network: n individuals connected to each other by virtue of belonging to the same immediate family could be represented by an n-edge joining them.

Graphs may also evolve over time, with vertices or edges appearing or disappearing, or values defined on those vertices or edges changing.

A few other basic terms:

Degree: The number of edges connected to a vertex. Note that the degree is not necessarily equal to the number of vertices adjacent to a vertex, since there may be more than one edge between any two vertices. In a digraph, each vertex has both an in-degree and an out-degree, which are the numbers of incoming and outgoing edges, respectively.
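The difference between degree and number of neighbours is easy to see in code. The sketch below assumes edges are given as a plain list of vertex pairs, with repeated pairs standing for multiple edges. The function names are illustrative.

```python
from collections import Counter, defaultdict


def degrees(edges):
    """Degree of each vertex in an undirected multigraph."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1  # a self-loop (u == v) therefore counts twice
    return deg


def neighbour_counts(edges):
    """Number of distinct adjacent vertices for each vertex."""
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    return {vertex: len(adjacent) for vertex, adjacent in neighbours.items()}


def in_out_degrees(arcs):
    """Return (in_degree, out_degree) counters; each arc points u -> v."""
    in_deg, out_deg = Counter(), Counter()
    for u, v in arcs:
        out_deg[u] += 1
        in_deg[v] += 1
    return in_deg, out_deg


# Two parallel edges between a and b, plus one edge a-c.
edges = [("a", "b"), ("a", "b"), ("a", "c")]
deg = degrees(edges)
nbrs = neighbour_counts(edges)

arcs = [("a", "b"), ("c", "b"), ("b", "a")]
in_deg, out_deg = in_out_degrees(arcs)
```

Here vertex a has degree 3 but only 2 neighbours, because two of its edges lead to the same vertex b.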

Component: The component to which a vertex belongs is the set of vertices that can be reached from it by paths running along edges of the graph. In a digraph, each vertex has both an in-component and an out-component, which are the sets of vertices from which the vertex can be reached and which can be reached from it, respectively.
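Components can be found with a breadth-first search. In the sketch below, the in-component is obtained by searching along reversed arcs. The edge-list input and the function names are assumptions for illustration, and by convention each result here includes the starting vertex itself.

```python
from collections import defaultdict, deque


def reachable(adjacency, start):
    """Set of vertices reachable from `start`, including `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in adjacency.get(v, ()):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen


def component(edges, vertex):
    """Component of `vertex` in an undirected graph."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return reachable(adjacency, vertex)


def out_component(arcs, vertex):
    """Vertices that can be reached from `vertex` along directed arcs."""
    adjacency = defaultdict(set)
    for u, v in arcs:
        adjacency[u].add(v)
    return reachable(adjacency, vertex)


def in_component(arcs, vertex):
    """Vertices from which `vertex` can be reached (search reversed arcs)."""
    reverse = defaultdict(set)
    for u, v in arcs:
        reverse[v].add(u)
    return reachable(reverse, vertex)


edges = [("a", "b"), ("b", "c"), ("d", "e")]
arcs = [("a", "b"), ("b", "c"), ("d", "b")]
```

With these hypothetical inputs, `component(edges, "a")` is {a, b, c}, while for vertex b in the digraph the out-component is {b, c} and the in-component is {a, b, d}.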

Geodesic path: The shortest path through the network from one vertex to another. Note that there may be, and often is, more than one geodesic path between two vertices.

Diameter: The diameter of a network is the length (in number of edges) of the longest geodesic path between any two vertices.
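Geodesic path lengths in an unweighted, undirected network can be computed by breadth-first search, and the diameter is then the largest such length. The sketch below makes two assumptions worth stating: edges are given as a list of vertex pairs, and for a disconnected network it reports the longest geodesic among pairs that are connected at all. The function names are illustrative.

```python
from collections import deque


def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def geodesic_lengths(adjacency, source):
    """Length in edges of the shortest path from `source` to each vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adjacency[v]:
            if w not in dist:  # first visit is along a shortest path
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def diameter(edges):
    """Longest geodesic between any two connected vertices.

    Raises ValueError if `edges` is empty.
    """
    adjacency = build_adjacency(edges)
    return max(
        max(geodesic_lengths(adjacency, v).values()) for v in adjacency
    )


# A path a-b-c-d-e with one shortcut a-c.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("a", "c")]
from_a = geodesic_lengths(build_adjacency(edges), "a")
```

In this example the shortcut shortens the route from a to e to three edges (a, c, d, e), and the diameter of the network is 3.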

To be continued …