Jul 01 2022

Model for Predictive Policing

The premise of the show Person of Interest is that the government has a powerful computer algorithm connected to all data from the surveillance state – social media, phone calls, e-mail, etc. – and it uses this information to predict crimes. This, of course, is a massive invasion of privacy, so to get around that the algorithm gives limited information, just the identity of a “person of interest” with no information about the crime or their role in it (not even whether they are victim or perpetrator). Traditional investigation then has to backfill the rest.

Is this sort of AI-based surveillance in our future? The technology is being developed for national security. As long as the targets are outside the US, then there are no legal issues with privacy. Government use of this technology to monitor US citizens is, of course, already a controversy, but that’s not what I am writing about today. AI is also used in what is called predictive policing – predicting “hotspots” of crime and then deploying police resources to those hotspots. This superficially makes sense, but (as a new study points out) is more complex than it may appear.

The problem stems from a common mistake, looking at one component of a complex system instead of the entire system. In this case the simplistic predictive policing model used crime data to predict future crime hotspots, but it failed to consider the broader social environment and also the impact of any policing response. If there is one thing we have learned from social science over the last century, it is that we may produce unintended consequences, even the opposite effect of what we desire, if we react simplistically to situations and don’t consider human psychology.

The new study uses a “stochastic inference algorithm” in two ways. The first is the traditional way, to predict crime hotspots over the next week. The “stochastic” term means that the data is random in detail but statistically predictable in the aggregate. So it cannot predict that a specific store will be robbed at a specific time, but it can predict that a 1,000-foot-square area of Chicago will see a spike in crime over the next week. The researchers, however, then turned this technology around to look at how police departments respond to this data. They discovered racial bias in policing response.
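
To make the “random in detail, predictable in the aggregate” idea concrete, here is a toy simulation. It is not the study's algorithm, just a sketch under assumptions I made up for illustration: a single city tile where each hour has the same small, independent chance of an incident (the rate of one incident every two days is hypothetical).

```python
import random

HOURS_PER_WEEK = 7 * 24


def simulate_week(rng, hourly_prob):
    """Return the hours (0..167) in which an incident occurred in one tile."""
    return [h for h in range(HOURS_PER_WEEK) if rng.random() < hourly_prob]


rng = random.Random(42)   # fixed seed so the run is repeatable
hourly_prob = 0.5 / 24    # assumed rate: about one incident every two days

# Simulate many independent weeks for the same tile.
weeks = [simulate_week(rng, hourly_prob) for _ in range(2000)]
weekly_counts = [len(w) for w in weeks]
mean_count = sum(weekly_counts) / len(weekly_counts)

# The exact hours differ from week to week and cannot be predicted...
print("incident hours in the first three weeks:", weeks[:3])
# ...but the average weekly count settles close to the underlying rate.
print("average incidents per week:", round(mean_count, 2))
print("expected from the assumed rate:", HOURS_PER_WEEK * hourly_prob)
```

The individual incident times look like noise, while the weekly average lands near what the rate implies. That is the sense in which a model can flag a tile for the coming week without knowing which store or which hour.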

This is a “painting the mountain green” problem (a reference to an incident in which China responded to a mountain turned brown by drought by painting it green). Obviously there are complex socioeconomic reasons why a particular patch of city might have higher crime. Also, taking a limited-area approach does not consider networks of communication and transport. But the big problem here is that just changing the allocation of policing resources does not address the underlying problem, and may make it worse.

What predictive policing allows departments to do is to increase the arrests per dollar – the number of arrests they make per unit of policing resources. In higher socioeconomic neighborhoods the cost per arrest is relatively high. Therefore, if police just follow this algorithm, the result will be disproportionate arrests in lower socioeconomic neighborhoods, which also tend to have higher minority populations. This reinforces a hostile relationship with the police, unfairly targets minorities just for living in poor neighborhoods, and does not address any of the underlying social issues. Predictive policing in this way just becomes another method of racial profiling. But worse, it generates data which biases the model, creating a self-reinforcing system of bias.
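
The self-reinforcing part can be shown with a deliberately crude sketch. Everything here is a hypothetical I chose for illustration, not anything from the study: two neighborhoods with identical true crime rates, a small head start in recorded incidents for one of them, and a rule that sends 70% of patrols to whichever area the records currently call the hotspot. Recorded incidents are assumed to scale with patrol presence.

```python
def simulate_feedback(true_rate=(10.0, 10.0), recorded=(11.0, 10.0),
                      hotspot_share=0.7, weeks=20):
    """Toy hotspot-patrol loop; returns cumulative recorded incidents per area.

    All parameter values are illustrative assumptions.
    """
    recorded = list(recorded)
    for _ in range(weeks):
        # The "model" flags the area with more recorded incidents so far.
        hot = 0 if recorded[0] >= recorded[1] else 1
        patrol = [1.0 - hotspot_share, 1.0 - hotspot_share]
        patrol[hot] = hotspot_share
        # Assumption: what gets recorded depends on where police are looking,
        # even though the underlying crime rate is the same in both areas.
        for i in range(2):
            recorded[i] += true_rate[i] * patrol[i]
    return recorded


start = (11.0, 10.0)
final = simulate_feedback(recorded=start)

start_share = start[0] / sum(start)
final_share = final[0] / sum(final)
print("area A share of recorded incidents at the start:", round(start_share, 3))
print("area A share of recorded incidents after 20 weeks:", round(final_share, 3))
```

Both areas have the same underlying crime in this sketch, yet the records drift further apart every week, because the data the model learns from is partly a record of where the police were sent.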

Like many things, predictive policing is just a tool. What matters is how we use it. Some of the controversy surrounding predictive policing concerns the accuracy of the models and the biases they perpetuate and amplify. This is improving as the algorithms and data themselves improve. What remains is some careful thinking about how to best use the data we get from these algorithms.

The authors of the new study point out that the predicted hotspots should not be looked at in isolation, with a simplistic response of just patrolling hotspots more. Rather, they are one piece of data that has to be combined with other more traditional policing data, in addition to social data. Further, the output of these algorithms does not need to be used only for allocating policing resources, but can also be used to allocate social support resources. Also, the algorithms can be used to monitor not just crimes but policing activity itself, to ferret out biases and misbehavior and monitor for effectiveness and efficiency.

When used well, these predictive policing systems may be able to reduce bias in policing, by identifying racial biases in police behavior and allocating resources more fairly. They can also reduce crime and improve security. They can be used to break bad patterns of policing, rather than becoming just another way to reinforce those patterns.

Even if implemented optimally, the final issue that will remain is privacy. This is a huge discussion that we need to have in our society – how much privacy are we willing to sacrifice in the name of security? We are rapidly entering a surveillance state situation. This may seem great if you are not on the receiving end, and if you generally trust your government. But this can change quickly. The recent SCOTUS decision overturning Roe v. Wade, for example, has suddenly created concern over how states that outlaw abortion will use social media and personal apps to track women who may be pregnant. Will they begin to monitor “pregnant women of interest”?

