Hey there, thanks for dropping by my blog.
My name is Noah. I’m currently a final year undergraduate student, studying BSc Geography with Quantitative Research Methods at the University of Bristol.
I’m very excited to be a participant in this year’s Google Summer of Code (GSoC), working on a project for the good folk at the Python Spatial Analysis Library (PySAL), under the umbrella of NumFOCUS.
This is the first instalment of the blog that will accompany my journey.
[whispers in disclaimer] I’m going to try to keep my blog posts on this site as concise as possible. I have minimal experience of blogging, mainly consisting of a selection of long-winded ramblings about a 3-month Water and Sanitation Hygiene (WaSH) project that I was a part of in northern Nicaragua back in 2018 (for anyone interested, check out this link: https://noahbouchier.wordpress.com/). Creating those posts really helped me through that adventure, but they definitely went on a bit. This blog will document my journey, and is as much for myself as it is for NumFOCUS, PySAL and (maybe) Google to track my progress. So I may still ramble or explain things in rather long-winded detail. Please know that I am doing this more for myself than anyone else.
What is Google Summer of Code (GSoC)?
Well, I’m glad you asked! This summer is the 17th year of GSoC - its longevity is a testament to its fantastic premise. As such a central tech giant, Google facilitates a whole range of open-source coding projects with an increasingly diverse set of organisations.
This year I am one of 1,292 students to be selected by 199 different organisations. As part of the GSoC structure, every student is assigned at least one mentor from their parent organisation and will receive a uniform stipend for their work (subject to two evaluations from their parent organisation).
Want to know more? Let me point you in the direction of the GSoC website along with the GSoC student guide, which offers a much more flashy explanation of GSoC, its history and how to find out more about it.
Who am I working for?
As stated above, the organisation I’m completing a project for is the Python Spatial Analysis Library (PySAL). Though formally this is all made possible under the umbrella organisation of NumFOCUS.
You might have come across Python. When it’s not covered in scales, it is a high-level, general-purpose programming language. Crudely put - this means that it is a way of telling a computer to do lots and lots of different things.
Most relevant for me is Python’s ability to read in many different formats of data, and then analyse and manipulate this data to learn useful things. As you may have deduced from their name, PySAL focuses on the analysis and manipulation of spatial data - that is, data that relates to some form of space. The team create and maintain many tools to make it easier for people to use Python to analyse and manipulate spatial data.
As a quantitative geographer, analysis of spatial data is one of my primary passions. It’s my bread and butter. And as a quantitative geographer in Bristol, I have been fortunate to receive extensive, practical training in programming-orientated analysis of spatial data over the past 3 years. While my training has been in the R programming language, this summer I’m looking forward to stretching my skillset and applying those transferable skills within Python.
My mentors
As I said, every GSoC student is assigned at least one mentor from their parent organisation. I’m fortunate enough to have 2 - as well as being integrated into the wider PySAL community!! My mentors are:
- Levi Wolf - website - Levi is a Senior Lecturer in Quantitative Human Geography at the University of Bristol. He’s been my personal tutor for the past 3 years. He has been such a fantastic support and teacher, expanding my horizons to many fascinating opportunities - including GSoC!!
- Jeff Sauer - website - Jeff is currently a PhD student at the University of Maryland, focusing his research on developing a geospatial understanding of the U.S. Opioid Overdose Epidemic (OOE). During the summer of 2020, Jeff was a GSoC student within PySAL and has since continued his involvement in the community. I’m looking forward to getting to know him more, and benefitting from his firsthand experience.
Fellow students
I’m very excited to be also connecting with the wider PySAL community, including many fantastic, accomplished academics who I respect and admire greatly. There are two other GSoC students also part of the ‘21 PySAL cohort - Lenka and Germano. You can find more about what they’re up to here:
What am I doing?
The project I am undertaking for PySAL is centred around creating a new tool to explore segregation within a city.
In 2019, Madalina Olteanu, Julien Randon-Furling, and William A. V. Clark wrote an article entitled “Segregation through the multiscalar lens”, introducing a new method of measuring segregation. As the title suggests, this developed the field by considering segregation at multiple scales. The method creates an understanding of how various characteristics of a population change across the spatial scale of a city. It considers how representative local populations are of the overall population, starting at the smallest scale and zooming out.
If you live in a building of flats, this method will first consider how representative the population of your flat is of the population of the whole city, then gradually increase the spatial scale of analysis until it reaches the whole city.
Example - City 123
I like to consider it like this:
Ethnic group | Flat | Building | City 123
---|---|---|---
Ethnicity A | 50% | 56% | 60%
Ethnicity B | 25% | 17% | 15%
Ethnicity C | 25% | 27% | 20%
Other | 0% | 0% | 5%
If you live in a building of flats in a hypothetical city called “City 123”, you could use this method to understand the segregation between different ethnic groups across the city, radiating out from your flat. Let’s say the ethnic composition of the city is 60% ethnicity A, 15% ethnicity B, 20% ethnicity C and 5% other. The method would start with the ethnic composition of your flat, which is 50% ethnicity A, 25% ethnicity B and 25% ethnicity C. It would compare this composition to that of City 123, seeing that there are higher proportions of ethnicities B and C, a lower proportion of ethnicity A and no other. The method then zooms out slightly to consider the ethnic composition of your whole building of flats, which could be 56% ethnicity A, 17% ethnicity B and 27% ethnicity C. It then compares that composition to that of City 123 and notes the disparities. It may then zoom out to consider the ethnic composition of all the buildings within a 100m radius of your building, again comparing the composition of that whole area to that of City 123. Then it might move on to a 200m radius, then a 300m radius, et cetera, until it is surveying the whole of City 123 - which of course has the same composition as City 123 (i.e. there will be no disparities).
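To make the City 123 example a bit more concrete, here is a minimal Python sketch that compares each scale with the city as a whole. It assumes the disparity is measured with the Kullback-Leibler divergence, which is one reasonable way to compare two compositions; the function name `kl_divergence` is my own for illustration and is not part of PySAL.

```python
import math

# Compositions from the City 123 table, as proportions rather than percentages.
CITY = {"A": 0.60, "B": 0.15, "C": 0.20, "Other": 0.05}
SCALES = {
    "flat": {"A": 0.50, "B": 0.25, "C": 0.25, "Other": 0.00},
    "building": {"A": 0.56, "B": 0.17, "C": 0.27, "Other": 0.00},
    "city": CITY,
}


def kl_divergence(local, overall):
    """Disparity of a local composition from the overall one (0 = identical).

    Assumption: Kullback-Leibler divergence is used as the disparity measure.
    Groups that are absent locally (proportion 0) contribute nothing.
    """
    total = 0.0
    for group, share in local.items():
        if share > 0:
            total += share * math.log(share / overall[group])
    return total


# One disparity value per scale, from the smallest scale outwards.
disparities = {name: kl_divergence(comp, CITY) for name, comp in SCALES.items()}

for name, value in disparities.items():
    print(f"{name}: {value:.4f}")
```

With these numbers the flat is the least representative of the city, the building is a little closer, and the city compared with itself gives a disparity of zero.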
This collection of disparities, measured at each scale against the overall population, forms the base of the methodology put forward by Olteanu et al. (2019). From this, trajectories of convergence can be used to capture and unpick the rates of change throughout the population. By understanding the composition of urban centres in a more in-depth way, this helps to develop research that considers how individuals mix throughout the city, which is one way that residential segregation (in relation to where you live) can influence individuals’ everyday experiences.
For example, if I live in a building of flats in the west of City 123, but my friend lives in a very comparable building of flats in the east of City 123, then carrying out the same method as above (starting at the flat level, then the building level, a 100m radius etc.) would show different trajectories and rates of change in the composition of populations. For research on ethnicity, this could conceivably impact the way that someone experiences the city: the communities they are exposed to and how that shapes the fabric of their urban environment. Needless to say, I think it’s a very interesting and pertinent topic!
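The west versus east comparison can be sketched in Python too. Everything here is a toy assumption: the five areas and their head counts are made up purely for illustration, the "distance" between areas is just how far apart they sit in the list, and the Kullback-Leibler divergence is again assumed as the disparity measure. The helper names are hypothetical and not taken from PySAL.

```python
import math

# Hypothetical toy city: five areas laid out west to east, each with
# made-up head counts for two groups (illustrative numbers only).
AREAS = [
    {"A": 90, "B": 10},
    {"A": 70, "B": 30},
    {"A": 50, "B": 50},
    {"A": 30, "B": 70},
    {"A": 20, "B": 80},
]


def proportions(counts):
    """Turn head counts into shares that sum to 1."""
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def kl_divergence(local, overall):
    """Assumed disparity measure: Kullback-Leibler divergence (0 = identical)."""
    return sum(p * math.log(p / overall[g]) for g, p in local.items() if p > 0)


def trajectory(start, areas):
    """Disparity from the city after adding areas nearest-first from `start`."""
    city_counts = {g: sum(area[g] for area in areas) for g in areas[0]}
    city = proportions(city_counts)
    # Toy distance: how many positions an area is from the starting area.
    order = sorted(range(len(areas)), key=lambda i: abs(i - start))
    running = dict.fromkeys(city_counts, 0)
    steps = []
    for i in order:
        for group, n in areas[i].items():
            running[group] += n
        steps.append(kl_divergence(proportions(running), city))
    return steps


west = trajectory(0, AREAS)  # starting from the westernmost area
east = trajectory(4, AREAS)  # starting from the easternmost area

print("west:", [round(v, 4) for v in west])
print("east:", [round(v, 4) for v in east])
```

Both trajectories end at zero once the whole city is included, but they get there along different paths, which is the kind of difference described for the two flats above.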
How is it going?
I’m actually writing this blog post in the 3rd week of my Google Summer of Code experience.
So far, it has been a steady start of getting to grips with Python - I’ve primarily used R throughout my degree training, with minimal previous exposure to Python.
I had the chance to join one of the PySAL developer meetings too, and meet more of the community there. It was a great bit of exposure to the way that they interact and help to maintain and grow their library of packages.
I’ve also been spending time getting this website and blog built and running, as well as developing my understanding of GitHub, where I’ll be storing all of the work that I’ve been cracking on with in the project. You can find the link to my GitHub on the website home page.
This week, I’m spending time working on migrating some pre-existing code, working towards the implementation of the Olteanu et al. (2019) segregation method. The conversation surrounding this paper has been going on for a long while, and can be seen on this discussion page of the PySAL segregation discussion forum. My GitHub should show the fruits of this labour.
In doing this, I’m already gaining a much greater understanding of the inner workings of this methodology, and how it translates into practical instructions for a computer to process.
Tracks of the blog
One of my favourite aspects of my first experience of blogging was including the songs I was listening to most as I was writing each post. I love music, and including specific tracks connected me to where I was at during that short space of time. I’ll do my best for this not to turn into a music blog, though my taste tends to pull from quite an eclectic range. If you’d like to see what I’m collating in more detail, then my Spotify is linked on the website home page!
Thanks so much for reading!
Noah