7/4/17

JULY CECAN WORKSHOP: COMPLEX-IT and the SACS TOOLKIT: A Case-Based Computational Modeling Platform for Data Mining Complex Issues in Policy and Evaluation


 CECAN Training Workshop

 

 

 

COMPLEX-IT and the SACS TOOLKIT: A Case-Based Computational Modeling Platform for Data Mining Complex Issues in Policy and Evaluation


When: Friday 7th July 2017 (1 day)

Location: University of Surrey, Guildford, UK

Purpose: The complex socio-technical arenas (nexus issues) that government seeks to improve (e.g., health, food, water, safety, infrastructure) are not driven by a single factor or consequence.  Instead, they are driven by multiple factors at multiple levels, which lead to different trends or outcomes for different areas/groups of people.
The challenge is how to model such diversity and complexity.  The complexity sciences, data mining and big data offer some useful solutions.  The difficulty, however, lies in stitching these methodological solutions together into a user-friendly platform and APP that policy makers, social scientists, evaluation commissioners and civil servants can use, which is why we created COMPLEX-IT and the SACS TOOLKIT.

Intended audience:  This workshop is for anyone who is involved in evaluating the impact of policy (and its improvement) on complex nexus issues and who would like to explore new software and mixed-methods options for doing so.

Level of prior knowledge of subject required:  For policy makers and evaluation commissioners, it is helpful to have a basic sense of statistics and an interest in data mining and the complexity sciences.  For researchers and methodologists, it is helpful to have an understanding of the latest developments in interdisciplinary mixed-methods, computational modeling and the data mining of big data.
Participants are strongly encouraged to bring to the workshop a policy issue or research concern (e.g., modeling multiple trajectories across time, dealing with large numbers of variables, etc.) that they would like to use COMPLEX-IT and the SACS TOOLKIT to explore.

At the end of this course, participants will:

GOAL 1: Understand the theory behind case-based computational modeling, including
  • Having a basic sense of the principles guiding case-based complexity.
  • Understanding the philosophy behind data mining and computational modeling.
  • Developing a working knowledge of COMPLEX-IT APP and SACS TOOLKIT.
GOAL 2: Learn how to apply case-based computational modeling to their nexus topic, including how to:
  • Build a complex systems model of their nexus issue.
  • Explore how policy impacts different groups or areas across time/space.
  • Use this information to create their study’s case-based profile.
  • Identify major and minor case-based clusters and key causal factors.
  • Identify major and minor cluster trends (for longitudinal data).
  • Identify key global-temporal dynamics, such as spiraling sources and saddle points.
  • Use network analysis (where appropriate) to explore cluster links and structure.
  • Examine how different clusters and trends lead to different outcomes.
  • Run simulations to explore how policy can change outcomes.
  • Compare the resulting model to the original theoretical formulation.
GOAL 3: Learn how to use the COMPLEX-IT APP, including how to:
  • Download and install the software.
  • Run the software, including the RStudio environment in which it works.
  • Upload the case study database.
  • Identify key variables for case-based profiles.
  • Explore how to deal with missing data and errors in variable choice.
  • Use k-means cluster analysis to identify initial clusters, including how to identify optimal solutions and run k-means for trend data.
  • Use the SOM neural net to corroborate clusters and identify possible sub-clusters.
  • Use SOM and k-means to identify the underlying causal model.
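
For readers who want a feel for the k-means step before the workshop, here is a minimal sketch in Python. It is not the COMPLEX-IT code (the APP itself runs in RStudio); the function names, the toy cases, and the use of the within-cluster sum of squares as a rough guide to an "optimal" number of clusters are all assumptions made for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two case profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(cases, k, iterations=100, seed=0):
    """Plain Lloyd's k-means; returns (centroids, clusters)."""
    rng = random.Random(seed)          # fixed seed for reproducible results
    centroids = rng.sample(cases, k)   # start from k randomly chosen cases
    clusters = []
    for _ in range(iterations):
        # Assign every case to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for case in cases:
            nearest = min(range(k),
                          key=lambda i: squared_distance(case, centroids[i]))
            clusters[nearest].append(case)
        # Move each centroid to the mean of its members
        # (an empty cluster keeps its previous centroid).
        updated = [
            tuple(sum(col) / len(members) for col in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:       # converged
            break
        centroids = updated
    return centroids, clusters


def within_cluster_ss(centroids, clusters):
    """Total squared distance of cases to their own centroid."""
    return sum(squared_distance(case, centroid)
               for centroid, members in zip(centroids, clusters)
               for case in members)


# Hypothetical toy data: six cases measured on two indicators.
cases = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9),
         (5.0, 5.2), (5.1, 4.8), (4.9, 5.0)]

for k in (1, 2, 3):
    centroids, clusters = kmeans(cases, k)
    print(k, round(within_cluster_ss(centroids, clusters), 3))
```

The within-cluster sum of squares drops sharply from one cluster to two and much less after that, which is the kind of "elbow" one would look for when comparing candidate solutions.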
----------------------------------
COMPLEX-IT
GitHub - Cschimpf/Complex-It: Complex-It Development

LINK TO RSTUDIO
Click here to download RStudio

COMPLEX-IT AND AGENT-BASED MODELING
RNetLogo - ...and two worlds are yours
R Marries NetLogo: Introduction to the RNetLogo Package | Thiele | Journal of Statistical Software
GitHub - NetLogo/Mathematica-Link: allows Mathematica to control NetLogo (and not vice versa)
Georg-August-Universität Göttingen - Agent-based/individual-based simulation tools
CRAN - Package RNetLogo
CRAN - Package gafit
CRAN - Package GA
How to load the {rJava} package after the error "JAVA_HOME cannot be determined from the Registry" | R-statistics blog
Agent Based Models and RNetLogo | R-bloggers
Facilitating Parameter Estimation and Sensitivity Analysis of Agent-Based Models
NetLogo User Community Models: SegregationExtended
SELF-ORGANIZING MAP SUPPORTING MATERIALS:
Blog overview of SOMbrero R Package
Stochastic Gradient Descent
Wikipedia Article on Stochastic Gradient Descent
The art of running the SOM and choosing the map size
Discussion about setting the random seed for reproducible results
  
WALES DATABASES (INDEX OF MULTIPLE DEPRIVATION)
This is the main one we used for our Workshop.

Welsh Multiple Index of Deprivation

Another data website source.

Full dataset, which is a very useful case study for exploring how to use the COMPLEX-IT App
(The entire dataset is shown for all authority areas, across multiple indicators)

This provides multiple databases for multiple levels of analysis

This goes with the above, as it is their annual data release website for IMD

MAP OF WALES BY AUTHORITY AREA FOR DEPRIVATION

LSOA MAPS for Wales (Lower Super Output Areas, roughly N=1,500 people each) for a total of about 1,896 LSOAs in Wales http://gov.wales/docs/statistics/lsoamaps/lsoa.htm

NOTE: Area maps for each LSOA in Wales are available from the links below. LSOA is the geographic unit used in the Welsh Index of Multiple Deprivation (WIMD). LSOAs are built from groups of Output Areas (OAs) used for the 2001 Census. There are 1,896 LSOAs in Wales each with a population of about 1,500 people. Because the size and boundaries of LSOAs have not changed since they were created in 2004, the same areas are analysed in the three recent WIMD updates (WIMD 2005, WIMD 2008 and WIMD 2008: Child Index). The maps can be used alongside each of the three updates to identify the area covered by each LSOA.


BIOGRAPHICAL BACKGROUND:
Brian Castellani, Ph.D. is Professor of Sociology and Lead of the Complexity in Health and Infrastructure Group at Kent State University, as well as Adjunct Professor of Psychiatry, Northeast Ohio Medical University and co-editor of the Complexity in Social Science series, Routledge.  Trained as a sociologist, clinical psychologist and methodologist, Brian has spent the past ten years developing a new case-based data mining approach to modeling complex social systems, which he and his colleagues have used to help practitioners and policy makers address and improve complex public health issues such as community wellbeing, stress and coping (allostatic load), comorbid depression in primary care, addiction, medical education and grid reliability. Recently, Brian received a systems science scholarship from the Robert Wood Johnson Foundation to present at the 2016 AcademyHealth Conference – the leading organization in the States for health services researchers, policymakers, and health care practitioners and stakeholders. For more information, including publications on case-based complexity, see Brian’s website at www.personal.kent.edu/~bcastel3/

Corey Schimpf, Ph.D. is a Learning Analytics Scientist at the Concord Consortium, a not-for-profit company just outside of Boston that develops curriculum and software for K-12 science, technology, engineering and math learning.  He received a Ph.D. in Engineering Education and an M.A. in Sociology from Purdue University and has several years of programming and software development experience. One avenue of Corey’s work focuses on the development and analysis of learning analytics that model students’ cognitive states or strategies from fine-grained computer-logged data from students participating in open-ended technology-centered science and engineering projects. In another avenue of Corey’s work, he has been the lead or team member developing software to assist researchers dealing with complex, high dimensional problems and data-sets, such as an interface and infrastructure to integrate several methodological tools or a multi-purpose data processing tool for high volume data with limited structure.




6/16/17

GROWING INEQUALITY: Bridging Complex Systems, Population Health, and Health Disparities (A BOOK REVIEW)


A new book has been published by George Kaplan and colleagues, titled, appropriately enough, GROWING INEQUALITY: Bridging Complex Systems, Population Health, and Health Disparities.

The edited book is the result of a handful of years that Kaplan and a working group of interdisciplinary colleagues spent struggling to figure out how to more effectively think about, model, and address growing health inequalities in the States, Canada and, by extension, the world.

The conclusion was that such issues are best viewed as complex systems problems and therefore best modeled in complex systems terms.  So far, so good.

The challenge, however, was how to proceed from there, which allowed the working group to happily descend into the chaos of real interdisciplinary work -- which is not easy by any stretch of the imagination.  Such work requires creating a shared vocabulary, embracing very different ways of thinking about research problems and their solution, and realizing that people often use the same scientific terms (such as non-linearity, for example) in rather different ways, and so forth.  Then there was the second issue of how to define, think about, model and manage this thing called 'complexity' or, more specifically, a 'complex system.'

On the first of these two issues, the hard work by Kaplan and colleagues is, overall, to be commended.  In terms of the group's outcome, the book is organized into fourteen chapters -- with the acknowledgements and the first and last chapters functioning as meta-reflections on the work the group did over its roughly five years of existence.  The second chapter constitutes a reflection on method, addressing the issue of complexity and multi-agent simulation.  The other twelve chapters focus on various public health issues, from improving health behaviors to the built environment to crime to health and socioeconomic well-being.

As concerns the second issue, however -- that is, how to define and understand and model complexity -- the book seems to fall short, overall, in several important ways:

1. To begin, the definition of complexity the book embraces remains narrow.  While the authors seem to differ among themselves (as in Stange and colleagues' chapter, which embraces a wider view), the group mainly employs what the French systems scholar Edgar Morin calls restricted complexity -- which is basically conventional science, albeit now focused on complex systems.  As Morin states, "The problem with restricted complexity is that it still remains within the epistemology of classical science. When one searches for the 'laws of complexity,' one still attaches complexity as a kind of wagon behind the truth locomotive, that which produces laws. A hybrid was formed between the principles of traditional science and the advances towards its hereafter. Actually, one avoids the fundamental problem of complexity which is epistemological, cognitive, paradigmatic. To some extent, one recognizes complexity, but by decomplexifying it. In this way, the breach is opened, then one tries to clog it: the paradigm of classical science remains, only fissured."  (For more on this issue, see also Byrne and Callaghan's Complexity Theory and the Social Sciences or Jenks and Smith's Qualitative Complexity.)

Here is an easy way to think about the difference vis-a-vis the complexities of health: rather than think about the health vulnerabilities of poor people, we should think about the complex ways the systems in which they live make their poverty a vulnerability.  For example, in our recent book, Place and Health as Complex Systems, we showed how the poverty of the inner-city communities we studied had more to do with the suburban sprawl of affluent individuals moving into the suburbs (and all they take with them in terms of healthcare and its funding and access) than with the inner-city communities themselves.  In other words, these inner-city communities are stuck in what complexity scientists call poverty (welfare) traps.

2. The second problem follows from the first: given their restricted definition of complexity, the working group primarily employs a reductive multi-agent simulation approach -- which struggles to deal with social structural issues and the macroscopic systems that serve poor and lower income individuals.  Nonetheless, some of the models (as in the chapter on crime and health) are very well done and do offer some useful insights.  Still, the focus, overall, is reductive microscopic modeling.

3. Relatedly, as shown on the map of the complexity sciences, the book makes no wider use of complex network analysis, real-data-driven geospatial modeling, dynamical systems theory modeling, or various other computational modeling techniques.  Nor is there any critical discussion of which methods are useful and when -- for example, the chapter on simulation and big data does not address the significant criticism leveled at big data. In contrast, as Byrne and Callaghan make clear in Complexity Theory and the Social Sciences, not all computational and complexity science methods are equal in their utility for social scientific inquiry.

For more on this issue, see Burrows and Savage's excellent article, After the crisis? Big Data and the methodological challenges of empirical sociology; as well as the SAGE journal, BIG DATA & SOCIETY.  On a positive note, however, Kaplan and colleagues do provide a very good overview in their concluding chapter of the need to ground simulation in real data -- which is a major argument of such journals as the Journal of Artificial Societies and Social Simulation.

4. Still, the final arguments Kaplan and colleagues make in their concluding chapter are not really new.  In fact, they have been argued extensively by others, none of whom is really cited (see, for example, the Handbook of Systems and Complexity in Health) -- which leads to the final problem with the book: the insularity of its research and references.  Other than a few citations to the global field of the complexity sciences -- which has been rather highly involved in modeling health and healthcare in complex systems terms -- the references in the chapters are mostly limited to a small set of publications.  For example, in Chapter 2 on method, despite the discussion of the epistemological issues surrounding simulation, the scholars cited are mostly those from the early days of systems modeling and cybernetics. The result is a somewhat biased and narrow view of the import of complexity for the fields of health and healthcare.


Still, despite these limitations, the main point of the book remains cutting-edge and clear: if we are to advance our ability to more effectively address the complex health inequalities that now exist on a global level -- and the myriad intersections they have with such global complexities as economy, politics, geography, ecology and culture -- it is imperative that public health scholars and the larger healthcare field (and those they serve) embrace a complex systems perspective.  Oh, and let us also not forget the importance of such an embrace of systems thinking by those civil servants, the world over, who write the policies....












    

5/25/17

Advancing Shannon Entropy for Studying Diversity Or, the Benefits of Case-Based Complexity!

We just published our latest article advancing Shannon Entropy for the study of diversity in complex systems, using our case-based entropy (also read as case-based complexity) approach.





Here is the abstract.  FOR THE ENTIRE ARTICLE, CLICK HERE
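
For readers new to the underlying measure, here is a minimal Python sketch of standard Shannon entropy, H = -sum(p_i * ln p_i), together with its exponential, which is commonly read as the "effective number" of equally common types. This is the textbook measure only, not the case-based entropy developed in the article; the function names and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy (natural log) of the type frequencies in `labels`."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def true_diversity(labels):
    """exp(H): the number of equally common types with the same entropy."""
    return math.exp(shannon_entropy(labels))


# Hypothetical toy data: four types, evenly versus unevenly distributed.
even = ["a", "b", "c", "d"]
skewed = ["a"] * 97 + ["b", "c", "d"]

print(true_diversity(even))    # four equally common types
print(true_diversity(skewed))  # far fewer "effective" types
```

Both datasets contain four types, but the skewed one behaves as though it had far fewer, because almost all of its cases fall into a single type.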



2/24/17

Addressing Complexity in Nexus Issues: A Case-Based Approach to Evaluation



I had the great opportunity to present the methodology on case-based complexity that my colleagues and I have been developing over the past several years to policy makers and researchers in London at the excellent new research and policy initiative, CECAN. 

The video and PDF below present a general audience introduction to the method as a whole.

1. Click here for the PDF of my POWERPOINT

2. Click here for the YOUTUBE VIDEO of my PRESENTATION

3. Click here for copies of the research papers we have published related to my presentation.