To investigate the impacts of air pollutants, specifically NO2 and PM2.5, an initial hypothesis was established. Extensive scientific and medical research has strongly associated higher pollution concentrations with negative impacts on cognitive functioning and wellbeing, leading to the initial hypothesis: wellbeing is hindered by air pollution. A detailed literature review was conducted to identify the range of variables that could indicate a student's wellbeing, alongside essential contextual information.
Variables were selected from numerous large datasets to form a specific, smaller database covering the broad scope of the initial hypothesis. This included representation of disadvantaged and non-disadvantaged students (as defined by gov.uk), providing insights into the averages and spread of the data, both essential for statistical analysis. The data was reviewed and cleansed to ensure validity and consistency. A comprehensive analysis was conducted in Excel and QGIS, providing insights into the complex relationships between the selected wellbeing indicators and air pollutants, relative to WHO health guidelines, both across London and at borough level.
Alongside database creation and analysis, the software was developed in several stages. To prepare the data for processing in Python, it was further cleansed and geocoded, enabling visualisation through an interactive scatter map in which each point corresponds to a school. On selection of an institution, a dashboard appears containing the variables selected from the database, providing insights into contextual and wellbeing indicators for that school. The dashboard also displays data visualisations coded using the Plotly library and compares a school’s variables to London and borough averages, as well as to WHO health guidelines, enabling individual school analysis. Throughout the project, rigorous testing, debugging and cleansing were carried out to ensure the validity of the data and the correct functioning of the scripts.
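As an illustration of the comparison step described above, the following minimal sketch shows one way a school's pollutant values could be set against borough averages, London averages and WHO guidelines. It is not the project's actual script. The school names, pollutant values, field names and the function compare_school are all hypothetical, and the thresholds assume the WHO 2021 annual mean guideline values (10 µg/m³ for NO2 and 5 µg/m³ for PM2.5), since the guideline version used in the project is not stated here.

```python
from statistics import mean

# Assumed WHO 2021 annual mean guideline values in µg/m³ (an assumption,
# the project may have used a different guideline version).
WHO_GUIDELINES = {"no2": 10.0, "pm25": 5.0}

# Hypothetical, illustrative records: one dictionary per school.
SCHOOLS = [
    {"school": "School A", "borough": "Camden", "no2": 30.0, "pm25": 12.0},
    {"school": "School B", "borough": "Camden", "no2": 20.0, "pm25": 10.0},
    {"school": "School C", "borough": "Bromley", "no2": 10.0, "pm25": 8.0},
]


def compare_school(name, schools=SCHOOLS, guidelines=WHO_GUIDELINES):
    """Compare one school's pollutant values to borough, London and WHO figures."""
    target = next(s for s in schools if s["school"] == name)
    same_borough = [s for s in schools if s["borough"] == target["borough"]]

    summary = {}
    for pollutant, limit in guidelines.items():
        summary[pollutant] = {
            "school": target[pollutant],
            # Mean over all schools in the same borough (including the target).
            "borough_mean": mean(s[pollutant] for s in same_borough),
            # Mean over every school in the data set, standing in for London.
            "london_mean": mean(s[pollutant] for s in schools),
            # True only when the school's value is strictly above the guideline.
            "exceeds_who": target[pollutant] > limit,
        }
    return summary


if __name__ == "__main__":
    for pollutant, figures in compare_school("School A").items():
        print(pollutant, figures)
```

A summary of this shape could then be passed to the dashboard layer, for example as the input to a Plotly bar chart placing the school, borough and London values side by side.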
The flowchart below visually outlines this process: