Growing up in Mexico City, Andrea Vallebueno experienced the stark contrasts within neighborhoods like the western suburb of Santa Fe, where a concrete wall divides wealthy residents from low-income ones. As an economics student and data science researcher, Vallebueno became concerned that traditional economic data aggregates away these extreme differences, masking the signs of inequality and urban decay. She recognized that, as more people move into cities around the world, this gap in granular data would become increasingly problematic.
Last year, to develop a new model for studying urban quality trends using machine learning and computer vision, Vallebueno collaborated with Yong Suk Lee, a Notre Dame assistant professor who at the time was an SK Center Fellow at Stanford’s Freeman Spogli Institute for International Studies. With funding through the HAI-Google Cloud Credit Grant Program, the two researchers set out to create an index of urban quality and change using Google Street View (GSV) images of cities over time. Their goal: to give city planners and policymakers more accurate, in-the-moment information about the living conditions within and across city neighborhoods.
Where Existing Data Falls Short
Accurate measurements of the quality of urban space are essential for crafting policies that address infrastructure and transportation improvements, poverty, and the health and safety of urban dwellers, and for understanding inequality within cities. However, current approaches to collecting socioeconomic data on crime rates, income levels, and housing conditions, supplemented by occasional citizen surveys, are infrequent, expensive, and subject to the biases of human perception. They do not provide an up-to-date picture at the neighborhood level.
In the past, researchers have experimented with using satellite imagery to quantify urban sprawl. More recently, machine learning projects have attempted to generate large-scale mappings of poverty, wealth, and income in developing countries. But so far, the question of how the physical landscape within urban environments changes over time has been largely overlooked by the machine learning community.
“There is no adequate measure that documents the quality of the urban space, its change over time, and the spatial inequality it presents,” Vallebueno says. “We believe we are the first to collect and make inferences on high-frequency GSV images and to construct panel data at the street segment level.”
Identifying Unsightly Urban Spaces
When she started thinking about this problem, Vallebueno went to San Francisco and spent a day on a hop-on, hop-off bus tour. Perched on the vehicle’s roof deck, she was able to take photos of the physical environment all over the city. Afterward, she spent several weeks labeling all the images by hand. “We really got to understand our dataset,” she recalls. “There’s no better way than to get your hands dirty.”
Next, she and Lee identified eight visual indicators of urban decay that could be detected in these street-level images:
- Barred or broken windows
- Discolored or dilapidated facades
- Utility markings
They trained the popular object detection algorithm YOLOv5 (You Only Look Once, version 5) to recognize these indicators using a dataset curated from those thousand pictures taken on the streets of San Francisco, combined with public detection datasets and existing GSV images.
The real breakthrough, Vallebueno explains, was in setting up an entire flow, from how to query the GSV imagery to training the object detection algorithm to processing its findings. “We had to create the pipeline and map out each of these steps to come up with a measure of urban decay,” she says.
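At a high level, the pipeline moves from street-level imagery, through object detection, to per-street-segment counts of decay indicators. The sketch below illustrates only the aggregation stage; the detector is stubbed out (a trained YOLOv5 model would take its place), and all function names, field names, and indicator labels here are hypothetical illustrations, not taken from the authors' code.

```python
# Illustrative sketch of the detection-to-counts stage of a GSV pipeline.
# All names below (detect_decay_indicators, build_segment_counts, the
# indicator labels) are hypothetical; a trained YOLOv5 model would
# replace the stub detector in a real implementation.
from collections import Counter

# A few stand-ins for the decay indicators the model is trained to detect.
INDICATORS = {"barred_window", "dilapidated_facade", "utility_marking"}

def detect_decay_indicators(image):
    """Stub for the object-detection step: return detected class labels.

    In practice this would run a trained detector on the image pixels.
    Here, each toy image simply carries its simulated detections.
    """
    return image.get("detections", [])

def build_segment_counts(images_by_segment):
    """Aggregate per-image detections into per-street-segment counts."""
    counts = {}
    for segment_id, images in images_by_segment.items():
        tally = Counter()
        for image in images:
            tally.update(d for d in detect_decay_indicators(image)
                         if d in INDICATORS)
        counts[segment_id] = dict(tally)
    return counts

# Toy input: two street segments with simulated detections per image.
images = {
    "seg-001": [{"detections": ["barred_window", "utility_marking"]},
                {"detections": ["barred_window"]}],
    "seg-002": [{"detections": ["dilapidated_facade"]}],
}
print(build_segment_counts(images))
```

Structuring the output at the street-segment level is what allows the kind of panel data the researchers describe: the same segments can be re-scored as new imagery arrives.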
The resulting index presents a picture of urban quality and change at a much more granular level than was possible before. A key feature of this approach is that the indicators can be considered holistically or individually. When policymakers need to understand the big picture, the model captures a single measurement of the physical urban space. But the model can also focus on a specific attribute, such as homeless encampments, to inform decisions about a specific policy matter.
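The holistic-versus-individual distinction can be sketched as follows. This is a minimal illustration under an assumed scoring scheme (detections per image for the composite view); the authors' actual index construction is not specified here, and the segment IDs and indicator labels are invented for the example.

```python
# Hypothetical sketch: one composite score per segment, or a view of a
# single indicator. The scoring scheme (detections per image) is an
# assumption for illustration, not the authors' published formula.

def composite_index(counts, n_images):
    """Holistic view: total indicator detections per image, per segment."""
    return {seg: sum(c.values()) / n_images[seg] for seg, c in counts.items()}

def single_indicator(counts, indicator):
    """Focused view: raw counts of one attribute, per segment."""
    return {seg: c.get(indicator, 0) for seg, c in counts.items()}

# Toy per-segment indicator counts and image totals.
counts = {"seg-001": {"tent": 3, "graffiti": 1}, "seg-002": {"graffiti": 2}}
n_images = {"seg-001": 2, "seg-002": 4}
print(composite_index(counts, n_images))  # {'seg-001': 2.0, 'seg-002': 0.5}
print(single_indicator(counts, "tent"))   # {'seg-001': 3, 'seg-002': 0}
```

The composite score supports big-picture comparisons across neighborhoods, while the single-indicator view supports targeted questions such as where encampments are concentrated.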
Testing the Model in Three Use Cases
To assess the accuracy of this new measurement, the researchers applied their model in three cities. First, they looked at homelessness in the Tenderloin neighborhood of San Francisco from 2009 to 2021. Then, they set out to quantify the urban change associated with small-scale housing projects carried out in the 2017-19 period in a subset of Mexico City neighborhoods. Finally, they examined the performance of their model in South Bend, Indiana, a mid-sized city with more suburban and rural features, where they focused on the distribution of the population and the dynamics of change in its western neighborhoods.
In each context, the model’s findings closely matched the known historical trends. Vallebueno and Lee were encouraged to see that their trained YOLOv5 model could successfully detect the eight signs of urban decay that they had selected. And they believe this preliminary work highlights the potential of their approach to be scaled to track and compare patterns across entire cities, not just single neighborhoods.
“With street view image being increasingly available for many cities at higher frequencies, we believe this may be a powerful tool to provide insights into how city environments are changing,” they write in their working paper, which was discussed at the 15th North American Meeting of the Urban Economics Association in October 2021. “The index can be used to compare urban decay across geographies, to quantify the change in a space’s urban quality over time, and to make inferences about economic, demographic, and environmental variables.”
While the tool can potentially help city planners and policymakers design urban infrastructure and develop policies to address poverty and community health, Vallebueno acknowledges that it poses ethical considerations. One concern is the possibility of selection bias in the street view imagery. If a neighborhood is so dangerous that it's not captured by GSV cameras, then a key segment of information will be missing from that city's dataset. Relatedly, GSV cameras capture images less often, and less comprehensively, in less developed cities. For example, GSV cameras capture street images of San Francisco several times a year, whereas images of South Bend are captured sporadically across time and space. This unequal availability of images means that cities will have unequal opportunities for policy innovation.
The researchers also were careful to annotate the dataset based on material objects present in the photo. “We did not want the model to inadvertently predict urban decay or homelessness based on the race or ethnicity of the people in the street view images,” Vallebueno says. “For example, we realized that faces and bodies need to be blurred and signs in other languages cannot be included in the model predictions.”
Even more worrisome, people could use the index to justify harmful policy agendas. Vallebueno and Lee agree the technology requires further ethical research, and they emphasize that the urban quality index should supplement existing data and current processes rather than replace them. They do not envision it as a decision-making algorithm that allocates resources by itself.
Going forward, Vallebueno wants to expand on the definition of urban quality explored in the initial paper to include positive physical attributes of a space, such as greenery and building aesthetics, rather than only considering the signs of decay.
“AI has the potential to guide urban planning decisions, but we learned it’s a lot of work to set up a pipeline with full integrity,” she says. “It’s not just a matter of training a model and spitting out predictions. You have to be mindful of the input data and output data, every step of the way.”
Stanford HAI's mission is to advance AI research, education, policy, and practice to improve the human condition.