D1. A Jaccard Index Tool to Measure Similarity Between Boolean and Categorical Values in Polygons
Timothy Mulrooney
Description
Polygonal enumeration units within a Geographic Information System (GIS), such as counties, ZIP codes and census tracts, encapsulate a wide array of data, ranging from population and crime rates stored as numbers to Boolean and categorical values. Boolean representations of the food environment (low income/low access vs. not low income/low access) developed by the USDA Food Access Atlas and categorical values (open water, developed, forest, etc.) denoted in the National Land Cover Dataset (NLCD) represent data extracted empirically but converted to qualitative categories that cannot be compared directly. In a survey of current geostatistical tools, many utilize quantitative attribute values and/or distance tied to location to derive descriptive, inferential, correlative and predictive statistics. There is a void in the space of all-in-one tools to compare two or more categorical or Boolean maps and measure their level of agreement. Comparisons can be performed by adding new columns, calculating them and accumulating statistics, but this may be difficult for the lay GIS user. In this work in progress, a Python-based tool with a simple Graphical User Interface (GUI) has been developed to allow users to calculate the Jaccard Index for multiple attributes of a polygon layer within desktop GIS software. This tool has been highlighted in a paper that performs pairwise Jaccard Index calculations related to the assessment and evaluation of various food availability metrics as well as comparisons between NLCD and Cropland Data Layer (CDL) data. These results are promising, and further work on different use cases (type of Jaccard calculation, sample sizes, calculation of statistical significance, application to point data, etc.) as well as on the education and dissemination of this tool is required to ensure it is not misapplied, as this work can have distinct decision-making ramifications.
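To make the measure concrete, the following is a minimal sketch of a pairwise Jaccard Index over two Boolean attribute columns of the same polygon layer (the attribute names and values are hypothetical illustrations, not the tool's actual interface):

```python
def jaccard_index(attr_a, attr_b):
    """Jaccard Index |A ∩ B| / |A ∪ B| for two Boolean attribute columns
    over the same polygon layer, where A and B are the sets of polygons
    flagged True in each column."""
    set_a = {i for i, v in enumerate(attr_a) if v}
    set_b = {i for i, v in enumerate(attr_b) if v}
    union = set_a | set_b
    if not union:
        return 1.0  # two all-False columns agree perfectly by convention
    return len(set_a & set_b) / len(union)

# Hypothetical low-income/low-access flags for six census tracts
low_access_2015 = [True, True, False, False, True, False]
low_access_2019 = [True, False, False, True, True, False]
print(jaccard_index(low_access_2015, low_access_2019))  # 0.5
```

For categorical data such as NLCD classes, the same computation could be applied per category (e.g. the set of polygons classed "forest" in each layer), one Jaccard value per class.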
D2. Efficient Algorithms and Plugins for Spatial Analysis in Geographic Information Systems
Tsz Nam Chan, Tianhong Zhao, Wei Tu, Dingming Wu, Leong Hou U
Description
Geographic information systems (GIS) have been extensively used in various domains, including transportation science, criminology, urban planning, and epidemiology. Domain experts adopt spatial analysis tools in GIS, including kernel density visualization (KDV), inverse distance weighting (IDW), the K-function, and Moran's I, to discover hidden patterns in location datasets. However, with ever-growing dataset sizes, all these tools are computationally expensive and cannot efficiently (or even feasibly) support large-scale location datasets in contemporary GIS software (e.g., QGIS and ArcGIS). There are three main reasons. First, existing methods for supporting these tools normally take at least quadratic time. Second, there is a lack of research on reducing the time complexities of these tools. Third, the fast algorithms that do exist have not been incorporated into GIS tools. To address these issues, we have proposed fast algorithms for several spatial analysis tools, including (1) KDV, (2) network KDV (NKDV), (3) spatiotemporal KDV (STKDV), (4) line density visualization (LDV), and (5) the network K-function. These algorithms achieve lower time complexities, enabling up to 1,000x speedups for million-scale datasets on a single CPU compared with existing methods. Moreover, we have already developed two QGIS plugins, Fast Density Analysis and Fast Line Density Analysis (based on the algorithms for (1) to (3) and for (4), respectively), which are available online. For example, Fast Density Analysis can generate a 1280×960-resolution KDV for the New York traffic accident dataset (~1.5 million data points) in at most 40 seconds, whereas QGIS takes more than one day to generate the same KDV. In the future, we plan to develop complexity-reduced algorithms and corresponding plugins for other tools, including IDW, the spatial/spatiotemporal K-function, and Moran's I.
We anticipate that these plugins can replace the computationally expensive implementations in QGIS/ArcGIS so that domain experts can efficiently analyze large-scale location datasets.
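The quadratic cost the abstract refers to is easy to see in a naive KDV: every output pixel sums a kernel over every data point. The sketch below (a generic Gaussian-kernel baseline, not the authors' accelerated algorithms) makes the O(pixels × points) structure explicit:

```python
import math

def naive_kdv(grid, points, bandwidth):
    """Naive kernel density visualization with a Gaussian kernel.
    For each pixel (gx, gy), sum the kernel over every data point:
    O(|grid| * |points|) time, which is why this baseline cannot scale
    to million-point datasets and motivates complexity-reduced methods."""
    norm = 2 * math.pi * bandwidth ** 2 * len(points)
    density = []
    for gx, gy in grid:
        total = 0.0
        for px, py in points:  # inner loop over ALL points -> quadratic blow-up
            d2 = (gx - px) ** 2 + (gy - py) ** 2
            total += math.exp(-d2 / (2 * bandwidth ** 2))
        density.append(total / norm)
    return density
```

A 1280×960 grid over 1.5 million points would require roughly 1.8 trillion kernel evaluations with this baseline, which is consistent with the day-long QGIS runtime cited above.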
D3. Bridging Healthcare Accessibility Gaps Through Generative AI and Two-Step Floating-Catchment (2SFCA) Methods: Leveraging Python Query Automation, Disparity Mapping, and Interactive ArcGIS Dashboards to Address Equity Disparities
Zach Sherman, Mengxi Zhang, Junghwan Kim
Description
This study investigates the integration of generative AI and geospatial analysis to address healthcare accessibility disparities, with a focus on underserved populations. Geographic barriers, such as limited transportation options and uneven distribution of healthcare resources, disproportionately affect rural communities, low-income populations, and Medicaid recipients. To address these challenges, a fine-tuned ChatGPT-4o-mini model was developed to convert natural language queries into executable Python code for geospatial tasks, including spatial joins, buffering, and driving time estimation. The fine-tuned model demonstrated significant improvements over a baseline model, achieving an 89.7% accuracy rate and reducing token usage by a factor of 3.7, efficiency gains critical for scalable applications. Additionally, the study utilized the two-step floating catchment area (2SFCA) method to compute accessibility scores for driving and public transit. Analysis of six regions in Virginia revealed that transit accessibility is consistently lower and less equitable, with sociodemographic variables such as poverty and non-white populations showing varied impacts across regions. Inequalities were more pronounced for Medicaid-accepting clinics, underscoring the need for targeted policy interventions and enhanced transportation networks. To bridge technical barriers, the research culminated in the development of an interactive ArcGIS dashboard. This dashboard allows non-technical users to interact with spatial data conversationally, visualizing accessibility scores and disparities in real time. Despite advancements, challenges such as syntax errors and spatial reasoning limitations remain. Future work will expand geographic applicability, refine error-handling mechanisms, and incorporate additional transportation modes to enhance model robustness.
This study demonstrates the transformative potential of generative AI in geospatial analysis by bridging critical gaps in accessibility and advancing data-driven decision-making in healthcare equity.
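For readers unfamiliar with 2SFCA, a minimal sketch of its two steps follows (the clinic/tract names, capacities and travel times are invented toy values; the study's actual implementation, distance decay, and data are not reproduced here):

```python
def two_step_fca(supply, demand, travel_time, threshold):
    """Basic two-step floating catchment area (2SFCA) method.
    supply: {clinic: capacity}; demand: {tract: population};
    travel_time[(tract, clinic)]: travel minutes; threshold: catchment size.
    Step 1: each clinic's supply-to-demand ratio within its catchment.
    Step 2: each tract's accessibility score = sum of reachable ratios."""
    ratios = {}
    for j, capacity in supply.items():
        pop = sum(p for i, p in demand.items()
                  if travel_time[(i, j)] <= threshold)
        ratios[j] = capacity / pop if pop else 0.0
    scores = {}
    for i in demand:
        scores[i] = sum(r for j, r in ratios.items()
                        if travel_time[(i, j)] <= threshold)
    return scores

# Toy example: one clinic reachable by tract t1 but not t2 within 30 min
scores = two_step_fca(
    supply={"c1": 10},
    demand={"t1": 100, "t2": 100},
    travel_time={("t1", "c1"): 10, ("t2", "c1"): 40},
    threshold=30,
)
```

Because the catchment is defined by travel time, recomputing the scores with transit rather than driving times directly exposes the modal accessibility gap the study reports.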
D4. iGEE: Mapping and Deriving Land Surface Temperature (LST) and Land Cover (NDVI & NDBI) in Ho Chi Minh City, Vietnam
Qian (Chayn) Sun
Description
iGEEHCM is an intuitive web-based tool designed to empower non-coding, non-GIS users in Ho Chi Minh City, Vietnam, by providing seamless access to urban environmental data. Built on the Google Earth Engine platform, iGEEHCM enables users to retrieve critical datasets such as Land Surface Temperature (LST), Normalized Difference Built-up Index (NDBI), and Normalized Difference Vegetation Index (NDVI) from high-resolution satellite imagery. The data is available at fine-grained administrative levels, including districts and communes, and can be downloaded as raster files for further analysis.
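The vegetation and built-up indices the tool derives are standard normalized-difference band arithmetic. The sketch below illustrates the formulas with hypothetical reflectance values; iGEEHCM itself computes these server-side on Google Earth Engine rather than with local Python lists:

```python
def normalized_difference(band_a, band_b):
    """Pixel-wise normalized difference (a - b) / (a + b).
    NDVI = (NIR - Red) / (NIR + Red): high over green vegetation.
    NDBI = (SWIR - NIR) / (SWIR + NIR): high over built-up surfaces."""
    return [(a - b) / (a + b) if (a + b) else 0.0
            for a, b in zip(band_a, band_b)]

# Hypothetical surface-reflectance samples for three pixels
nir  = [0.50, 0.40, 0.10]
red  = [0.10, 0.10, 0.08]
swir = [0.20, 0.30, 0.30]

ndvi = normalized_difference(nir, red)    # vegetated pixels score near +1
ndbi = normalized_difference(swir, nir)   # built-up pixels score positive
```

Both indices range from -1 to +1, which makes them straightforward to aggregate to district or commune level and to combine with socio-economic indicators in a vulnerability index.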
In developing countries like Vietnam, where rapid urbanization and climate change intensify environmental challenges, tools like iGEEHCM play a crucial role in climate resilience planning. Many cities face increasing heat stress, particularly in densely populated and socio-economically vulnerable areas. iGEEHCM integrates these environmental parameters with socio-economic and health-related indicators to construct a comprehensive Heat Vulnerability Index (HVI). This framework enables policymakers, urban planners, and researchers to identify high-risk areas, prioritize interventions, and design sustainable cooling strategies such as urban greening and heat-adaptive infrastructure.
Traditional climate monitoring and analysis often require technical expertise and significant financial resources, limiting accessibility in lower-income regions. iGEEHCM removes these barriers by providing an open-source, cloud-based platform that democratizes access to critical climate data. By simplifying geospatial analysis, the tool empowers local governments and communities to make evidence-based decisions without the need for advanced technical skills or expensive software.
As Vietnam and other developing nations work toward climate adaptation, iGEEHCM supports Sustainable Development Goal 11 (Sustainable Cities and Communities) by facilitating targeted actions that reduce heat vulnerability and improve urban resilience. The platform exemplifies how innovative, data-driven solutions can drive equitable climate action, ensuring that vulnerable populations are not left behind in the fight against climate change.
D5. The Irchel Geoparser: A modular open-source Python library for toponym recognition and resolution
Diego Gomes and Ross Purves
Description
Geographic analysis of unstructured text has seen rapid growth in recent years. This growth is motivated by the realisation that text often contains information related to location which can be extracted and analysed in application areas ranging from literary studies through analysis of spatial relationships to disaster response. Text sources vary from the short messages typical of social media to novels, but all may contain explicit references to location. A key task in geographically analysing texts is the recognition of these explicit references to place (toponyms) and their resolution to unique identifiers.
This task is commonly referred to as geoparsing and, in the broader scope of natural language processing, is an example of Named Entity Recognition and Linking. Although numerous studies have built systems addressing parts of this task, they have typically been designed as research studies whose code is difficult to reuse, often lacking documentation and modularity.
In our demo we introduce the Irchel Geoparser. The Irchel Geoparser is published as a Python library under an MIT licence, and has been designed with modularity in mind. To support this we have created a set of accompanying tools and specifications, making it straightforward to apply the library to specific use cases. The current implementation of the library uses spaCy for toponym recognition and fine-tuned Sentence Transformers for toponym resolution. An annotation tool allows the creation of gold standard data linked to a gazetteer for use in model fine-tuning and evaluation. The library has been used in a range of studies, for example to analyse twenty years of French- and German-language media articles related to wolves in Switzerland and to extract and analyse media bias in reporting of climate-related disasters globally. Current performance is comparable to the state of the art in terms of both quality and efficiency.
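To illustrate what "resolution to unique identifiers" means, here is a deliberately simplified sketch: a toponym string maps to several gazetteer candidates, and a disambiguation rule picks one. The gazetteer entries and identifiers are invented, and the population-count fallback shown here is NOT the Irchel Geoparser's method, which ranks candidates with fine-tuned Sentence Transformers over the surrounding context:

```python
# Toy gazetteer: one surface form, two candidate places (hypothetical ids).
GAZETTEER = {
    "Zurich": [
        {"id": "ch-zurich", "country": "CH", "population": 434_000},
        {"id": "us-zurich", "country": "US", "population": 126},
    ],
}

def resolve_toponym(name):
    """Resolve a recognized toponym to a single gazetteer identifier,
    using largest population as a naive disambiguation heuristic."""
    candidates = GAZETTEER.get(name, [])
    if not candidates:
        return None  # toponym recognized but not in the gazetteer
    return max(candidates, key=lambda c: c["population"])["id"]

print(resolve_toponym("Zurich"))  # ch-zurich
```

The point of the library's modular design is precisely that each stage — recognition, candidate generation, ranking — can be swapped out, so a naive heuristic like this could be replaced by a fine-tuned ranking model without touching the rest of the pipeline.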
D6. A high-level visual modelling environment for parallel geospatial (raster) processing workflows
Alexander Herzig
Description
The demand for parallel processing capabilities has increased rapidly over the past 15 years, driven by the ever higher spatial and temporal resolution of raster data, e.g. derived from LiDAR or satellite sensors. Meanwhile, GIS packages support multithreaded processing of selected algorithms, and libraries for high-level programming languages, such as 'Dask' for Python or 'parallel' for R, support the development of parallel processing workflows. However, while these libraries greatly facilitate the development of parallel geospatial processing workflows, they still require decent programming skills. Graphical GIS model builders, e.g. in ArcGIS or QGIS, currently do not support the development of scalable (task-)parallel workflows.
In this demo, we present the extension of the LUMASS visual programming environment for the development of parallel geospatial (raster) processing workflows that requires only an understanding of high-level programming and parallel workflow design principles. The workflows are scalable and run on shared-memory laptops and workstations as well as on distributed-memory compute clusters with thousands of cores. The framework supports pipeline, data, and task parallelism, and users can easily convert sequential to parallel workflows as long as input-data independence is ensured.
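The data-parallelism pattern described above can be sketched in a few lines: split a raster into independent tiles, apply the same per-tile operation concurrently, and reassemble. This toy sketch uses Python's standard library and threads purely for illustration; it is not LUMASS code, and a real workflow would use processes or MPI ranks across cluster nodes rather than threads:

```python
from concurrent.futures import ThreadPoolExecutor

def classify_tile(tile):
    """Per-tile operation: threshold each cell. Because each tile's result
    depends only on that tile's input (input-data independence), tiles can
    be processed in parallel without any coordination."""
    return [[1 if cell > 0.5 else 0 for cell in row] for row in tile]

def parallel_classify(raster, tile_rows=2):
    """Data parallelism over row-blocks of a raster: split, map the same
    operation over the tiles concurrently, then reassemble the result."""
    tiles = [raster[i:i + tile_rows] for i in range(0, len(raster), tile_rows)]
    with ThreadPoolExecutor() as pool:
        results = pool.map(classify_tile, tiles)
    return [row for tile in results for row in tile]
```

The same split/map/reassemble structure is what a visual modelling environment can expose graphically, so that the user designs the tiling and the per-tile operation without writing the scheduling code.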
While the framework's core processing components are mainly focused on processing (multi-dimensional) raster data, arbitrary command-line tools, scripts, and programs can be seamlessly integrated into the workflow to augment its processing capabilities. This enables GIS analysts, scientists, professionals, and students with little to no programming skills to easily develop geospatial parallel (raster) processing workflows.