Episode 80: Thomas Dee
Thomas Dee
Thomas Dee is the Barnett Family Professor at Stanford University’s Graduate School of Education.
Date: October 11, 2022
A transcript of this episode is available here.
Episode Details:
In this episode, we discuss Prof. Dee's work on dispatching health workers instead of police to some 911 calls:
“A Community Response Approach to Mental Health and Substance Abuse Crises Reduced Crime” by Thomas S. Dee and Jaymes Pyne.
OTHER RESEARCH WE DISCUSS IN THIS EPISODE:
“Variation Across Police Response Models for Handling Encounters with People with Mental Illnesses: A Systematic Review and Meta-analysis” by Chunghyeon Seo, Bitna Kim, and Nathan E. Kruis.
“Crisis Averted? The Effect of Crisis Intervention Units on Arrests and Use of Force” by Maya Mikdash and Chelsea Strickland. [Draft available from authors upon request]
TRANSCRIPT OF THIS EPISODE:
Jen [00:00:08] Hello and welcome to Probable Causation, a show about law, economics, and crime. I'm your host, Jennifer Doleac of Texas A&M University, where I'm an economics professor and the director of the Justice Tech Lab. My guest this week is Thomas Dee. Tom is an economist and the Barnett Family Professor at Stanford University. Tom, welcome to the show.
Tom [00:00:28] It's great to be here. Thanks for having me.
Jen [00:00:30] Today, we're going to talk about your research on dispatching health care responders instead of police to some emergency calls, but before we get into the paper, could you tell us about your research expertise and how you became interested in this topic?
Tom [00:00:44] Sure. Well, as you mentioned, I trained as an economist, but I've always had really broad, interdisciplinary, and very policy-focused interests. In particular, like you and many of our peers, I have a particular interest in using the modern tools of causal inference to study important and innovative programs and policies. And the broader context for that, which might interest your listeners, is what we economists call the credibility revolution, a movement over the past three decades, recently recognized with the Nobel Prize, to develop and use quantitative methods that build really credible evidence about the causal effects of behaviors and programs and policies. Now, in terms of my interest in this particular topic, well, you know, I'm pretty broad, as I mentioned, in terms of the types of policies that might interest me. What I really look for is the intersection of an innovative and important policy with the right kind of data and the right kind of design possibilities to say something really credible about its impact. That often leads me to rely on some serendipity, and that was very much the case here.
Tom [00:02:04] One of the hats I wear here at Stanford is faculty director of the John W. Gardner Center for Youth and Their Communities, and the Gardner Center is this really fascinating group of researchers who work in partnership with policymakers and practitioners to guide programmatic improvement. Through my engagement with the Gardner Center, I've gotten connected with a number of Bay Area police agencies that were interested in innovative ways to deal with mental health emergency calls, or, more broadly, behavioral health calls involving substance abuse and things of that sort. And so we're in the process of working with police agencies on innovative practices that involve having a health specialist ride along with police officers on targeted calls, what's called a co-response model. Through my engagement with that kind of slow-moving partnership research, I've become aware of the broader set of policies that many communities are adopting. And again, serendipity played a role: I saw some news coverage of this really interesting pilot program that Denver had implemented, and with a little sleuthing with my coauthor, Dr. Jaymes Pyne, who also works at the Gardner Center, we uncovered a rich set of open data resources that put us in a fabulous position to study that program.
Jen [00:03:31] Your paper is titled "A Community Response Approach to Mental Health and Substance Abuse Crises Reduced Crime." As you said, it's coauthored with Jaymes Pyne and was recently published in Science Advances. Congrats on that. And in it, you consider an innovative program in Denver, Colorado, to change who responds to local emergency events. So let's start with some context. What was the problem that Denver was trying to solve?
Tom [00:03:56] Yeah, there really is an interesting history to this.
Tom [00:03:59] You know, the program we studied actually is part of a larger initiative called Caring for Denver that appeared on a ballot initiative in 2018 and roughly 70% of voters approved a sales tax increase of about a quarter of a percent to fund a variety of relevant services for people with mental health or substance abuse needs and the Denver STAR program, which we studied, was one of those. Also, I think another important piece of context is that around this time, the police were reporting that they were seeing an increasing number of service calls related to mental health issues. So I think that context was really important not just for the development of this program, but there was a way in which it seemed to develop that was very collaborative across different constituencies police, mental health communities, paramedics, etc.. That, I think is something I'd like to revisit as an important part of its success, but it began with that broader mental health focus and interestingly, with financing. Again, a strong community affirmation through the ballot.
Jen [00:05:10] And so Denver created the Support Team Assistance Response or STAR program. So what does this program involve? What are the moving parts here?
Tom [00:05:19] Well, the business as usual before STAR was: if there were some type of behavioral health crisis, perhaps someone trespassing in a store or someone having issues with substance abuse, there would be a 911 call that would be directed to one of two places, either to the police or to a hospital kind of response that would, the vast majority of the time, end up taking someone to the emergency room. What STAR did was provide a third path for targeted calls. In particular, STAR involves a two-person response team, a paramedic and a mental health crisis interventionist, who could respond to targeted calls and coordinate services for people in need. And I think part of the goal was to reduce inappropriately directing people in those circumstances either to jail or to the emergency room. But at the core of it is this two-person team going out on targeted calls, again, a paramedic and a mental health crisis interventionist, in a van with some basic materials: blankets, water, things of that sort.
Jen [00:06:32] And as you said, this is something police departments across the country have been thinking a lot about. And so there are a variety of programs folks are trying now, which is cool to see. So how is STAR similar to or different from the other approaches that are out there and becoming increasingly common, I guess?
Tom [00:06:49] Yeah, I think it's distinctive in really interesting and important ways. I mean, I think the modal approach right now is what's referred to as crisis intervention team (CIT) training. And, you know, CIT really focuses on training police officers and dispatchers to identify and manage service calls involving behavioral health crises. Now, its proponents point out that they want it to have a big community response element, but I think it's fair to say it's really focused on police training. So that's one approach that's very widely used. We're also starting to see municipalities adopt what we're studying here in the Bay Area at the Gardner Center, so-called co-response models, where a responding police officer is paired with some type of behavioral health specialist on targeted calls. STAR is a third option that in many ways is much more dramatic. I don't want to say radical, because I think that's loaded with the wrong kind of meaning, but it's a more aggressive approach that involves forgoing police involvement altogether and instead directing something like that two-person team I described toward those targeted calls.
Tom [00:08:05] And there's an interesting bit of history here. I mean, we're starting to see these community responder programs pop up in various municipalities, but the seminal program was introduced roughly 30 years ago in Eugene, Oregon. It's called CAHOOTS. So it's interesting how an innovation like that might exist in the fairly distant past and not necessarily be taken up broadly, but then, in the right context, it seems to be animating a lot of communities right now who are concerned about, you know, mental health issues.
Jen [00:08:37] Yeah. Do you know anything about how the police department in Denver responded to this? Was this something they were generally supportive of or was there any tension?
Tom [00:08:46] Yeah, my sense is they were really supportive, and I think that's a vitally important implementation detail that I was mentioning briefly earlier. I think sometimes we think of these different options, community responder programs like STAR, co-response, crisis intervention team training, as plug-and-play reforms you can pull off the shelf, but they might interact in ways that are really important. So, for example, one of the things I've always felt about the success we found with Denver STAR is it might have benefited from the fact that the Denver police had really taken up crisis intervention team training. And if you look into some of the operating details of Denver STAR, what we saw was that a sizable minority of the calls weren't through dispatch. It wasn't the dispatcher saying, oh, this is something for the STAR team. It was the police showing up on a field call and saying, you know what, I really need the STAR team here to deal with this effectively. And so that's something that might have benefited from the crisis intervention team training.
Tom [00:09:50] And more broadly, when I talk about the positive impacts we found for the Denver STAR pilot, I like to stress that, as compelling as that is, the successful scale-up of it and the successful replication of it elsewhere are not something to be expected. Right? Because, again, you need the police to buy in so that they can coordinate effectively. You need the right kind of dispatcher training. You need the coordination with the mental health practitioners in the community. And descriptively, it appears as if Denver got that right. But whether other communities can replicate that success, or whether Denver has been able to replicate it at scale, are open empirical questions that Dr. Pyne and I very much want to study.
Jen [00:10:38] Very interesting. Okay. So we have all these different models that are out there in addition to, perhaps in parallel with, the STAR program. What have we previously known about the effects of these types of programs, here and elsewhere?
Tom [00:10:53] Yeah, this is fascinating to me because, you know, there has been policy innovation, particularly around crisis intervention teams and co-response models, and there have been some systematic reviews and even meta-analyses, you know, synthesizing multiple studies to understand what the evidence says. And generally, those reviews, which are referenced in our research article (available on my website without a paywall if anyone wants to see it), suggest the programs generally seem effective. However, if you look at the research designs employed in those studies, they tend to be in the spirit of case notes or qualitative descriptions or descriptive studies, or very simple pre-post comparisons or cross-sectional comparisons. And that type of evidence, which is more descriptive and correlational, is useful, but it doesn't really meet credible standards for causal inference.
Tom [00:11:53] And in fact, one of the reviews really called this out. I'll quote from one of the reviews here: "We caution against drawing conclusions related to causality based on these findings." So it's surprising to me that no other studies have yet met that standard, and we think our study is the first to do that. But there's a lot of opportunity to build the kind of evidence base we want going forward.
Jen [00:12:18] Indeed, a huge research frontier here. I will note, just because it's a pet peeve of mine: with these meta-analyses, I appreciate that those authors noted that we need more good research, but they still did a meta-analysis of all of those descriptive studies, which maybe is not the best use of them.
Tom [00:12:36] Yeah, there could be, I mean, not to speak too dismissively or brusquely, a garbage-in, garbage-out phenomenon there.
Jen [00:12:45] That's right.
Tom [00:12:46] Yes. So I'm sympathetic to that concern, and just more generally to the need for better evidence here. But given that this is what I do, I'm excited by the opportunity there.
Jen [00:12:58] That's right. More work for us. Okay. So let's get to why this is so challenging. What makes this such a difficult topic to study? Why haven't all these studies been written already?
Tom [00:13:09] Well, to my mind, to provide high-quality evidence on any policy, really, we need two critical ingredients. One is the ready availability of detailed, high-quality data. In this context, that would be granular data on relevant outcomes (calls to 911, service responses to those calls, arrests, maybe even other downstream judicial or health outcomes) as well as detailed information on the policies that communities have adopted and the character of their implementation.
Tom [00:13:44] Even in this information saturated age, that information is hard to come by. And again, we were fortunate Denver was in a really felicitous in this regard because they maintained an open data infrastructure that allowed us to get into this. The second and key ingredient, though, even if you had all that data that might be on your wish lists, is you need a policy rollout which either by a designed intention or by, you know, a blind luck, makes it possible to identify the causal effects in a credible manner.
Jen [00:14:20] Yes. You need something that gives you a control group.
Tom [00:14:23] Exactly. We need that comparison group.
Jen [00:14:25] Yeah. And so luckily you have it in Denver. So you're going to use the staggered rollout of the STAR program, along with the program's eligibility criteria, as a natural experiment. So let's step through those details. What types of calls are eligible for STAR services?
Tom [00:14:43] Yeah, this is important. So there were specific 911 codes where, if dispatchers flagged them in a certain way, they were approved for STAR services. Those involved things like public intoxication, a suicidal person, a welfare check or a call for assistance, indecent exposure, or trespassing. So those kinds of calls would trigger STAR eligibility and the dispatch of the STAR team. Now, those codes don't always correspond to the arrest data, so what we had to do in our analysis was have raters identify the offense codes in the data that were plausibly closely related to STAR services, things like disturbing the peace and drug possession. Those sorts of coded offenses were our focal outcomes.
Jen [00:15:33] Okay. And then how was STAR implemented? Where and when was STAR available for these calls?
Tom [00:15:38] Yeah, this is another critical detail. There was a six-month pilot period that began on June 1st, 2020.
Tom [00:15:45] Now, Denver has 36 police precincts. They intentionally chose eight precincts for STAR services, all downtown precincts that were geographically contiguous. So that's what gave us one type of comparison: all the other Denver precincts where STAR was unavailable served as a type of comparison condition. It's also interesting to note, and I'm not quite sure why they did this, that the services were only available Monday through Friday, 10 a.m. to 6 p.m., which wouldn't accord with my priors about when you might want this service available.
Jen [00:16:25] Yeah.
Tom [00:16:25] But that's what they did. And that created another window of insight for us in this work.
Jen [00:16:31] Yeah, I have some students who are studying a related program in Texas, and they are using sort of similar variation there. Right? The program is available for these mental health calls from 8 a.m. until 5 p.m. or something, like business hours, and, well, that's really good for research, but it's sort of a strange time to have these services available. It's not when you'd expect the highest demand for them.
Tom [00:16:56] That's right. Behavioral health crises probably don't keep banker's hours.
Jen [00:17:00] Yeah, that's right. That's right. Do you know, since Denver was piloting this in just some districts, was the goal here to be able to evaluate it, or were they just piloting it in the sense of, we're going to try it on a small scale first before we make it big?
Tom [00:17:14] Yeah. I don't know that they really planned for the type of evaluation you and I would recognize as meeting high causal standards. They did do a kind of internal evaluation report that was really process-based and descriptive. I think they chose these eight districts because these were downtown districts where substance abuse and homelessness were really prevalent.
Tom [00:17:37] And, you know, and it's kind of an interesting place because it's gentrifying. Also, it had what we tend to call quality of life crimes.
Jen [00:17:46] Yeah. Yeah.
Tom [00:17:47] So anyway, that's where they they focused.
Jen [00:17:50] Yeah. Well, good on Denver for having this staggered rollout. Okay. And so, putting those two pieces together, the eligibility criteria, so they're targeting certain kinds of calls, and then they're doing this in some districts but not others, how do you use these details to measure the causal effects of STAR?
Tom [00:18:06] Well, our core approach is easiest to conceive of as a two-step process. First, we look at these eight precincts where STAR was active, both in the six months before it became active and in the six months of the pilot period when it was active. We observe the levels of the focal criminal offenses, those STAR-related, lower-level offenses, and we calculate the change: how did the level of STAR-focused crimes change as the program came online? Now, that difference, that change over time, is going to include both the true effect of providing STAR services and the effect of everything else that might be changing over time and influencing the levels of crime.
Tom [00:18:54] So then we also construct a second change over the same time period in the other police precincts where STAR wasn't available. And looking at the comparative change, what we commonly call the difference in the differences is really the workhorse way to evaluate programs that roll out in this kind of staggered fashion. So that's fundamentally what we do. Now we're also we're able to leverage several other types of differences, more serious crimes, off hours as potential placebo conditions, but they're a little more qualified to give different types of insights and really the core design is that pre-post comparison across star precincts and precincts where STAR was unavailable.
Jen [00:19:39] Got it. Okay. And then what mechanisms should we have in mind here for how STAR might affect the various outcomes we might care about?
Tom [00:19:48] So we're focused on criminal offenses, and I think there are really two broad mechanisms here and it's an and it's important to package them because I think this has led to some confusion in understanding our results. So one broad mechanism is that when you send the STAR team rather than police to respond to the targeted calls, as we might expect, arrests to go down just because the police aren't there. Right. So there won't be a criminal offense recorded yet at some level, a crime still occurred maybe that person, someone in a psychotic break, was trespassing in a store or being a public nuisance, etc.. So there's a level one components of the results we find can be thought of to some extent as a reporting artifact, but I think it'd be wrong to deprecate the mechanism entirely in that way, because you have people in health crisis who are receiving crises, who are receiving health care instead of being shunted on a more of a criminal justice path.
Tom [00:20:53] And so there's a sense in which that's a really good social outcome. And again, we might say with the Denver police who have the CIT training, they might be expected to recognize mental health needs and direct people to care rather than arrest as well, but anyway, that's one broad mechanism, simply that the police aren't there to do the arrest and health care is being provided instead of a more criminal justice forward approach. Now, the second mechanism, though, and we see evidence for this existing in the data, is reducing recidivism, reducing repeat offending among people in behavioral health crises.
Tom [00:21:35] And the evidence for this in particular is that we see that in the STAR precincts when STAR is active, those STAR focused crimes go down during hours when STAR is not in operation. Now, we worry about that to some extent might that suggests there's a problem with our research design. I could talk about some checks for that, but what we believe it represents is that when someone is, say, in the middle of a psychotic break without STAR, they might get engaged by the police, maybe held in jail for a short period release and then immediately or soon thereafter, re-offend, but in a world where they're getting health care that kind of re-offending behavior won't necessarily occur to the same degree. So the fact that we're seeing reductions in those targeted crimes during the off hours suggests this kind of spillover effect that implies genuine reductions in crime, not just the kind of reporting mechanism and the redirecting from jail to health care that occurs through the first mechanism I described. So I think it's important to bear in mind that both might be both appear to be in play here with Denver's program.
Jen [00:22:53] Yeah. And something you mentioned about the officers who are responding made me think we should also just emphasize that this is a context where, in the places without STAR, the officers all have CIT training. So in some ways, this is like the best-case scenario as a comparison or as a counterfactual, right? I mean, CIT is the program that lots of police departments are putting in place to try to address these problems directly. So the fact that you're going to find effects above and beyond that is especially impressive.
Tom [00:23:25] Yeah, I agree that that's really important, that the comparison condition involves the CIT training. But there's also that nuance I mentioned earlier: the treatment condition is the STAR program in the presence of that CIT training, and that may improve the implementation of the community responders. So again, I think it would be wrong to think of these programs as entirely siloed off from each other. And as we look to the future, we should embrace this kind of collaboration between police and other types of first responders.
Jen [00:24:01] Yeah. Okay. You mentioned the data earlier and how important data are. So what data are you using to measure the effects of this program?
Tom [00:24:09] Yeah. So through its kind of open data framework, the City of Denver puts out very detailed incident-level data on criminal offenses, and this is something that feeds into their federal reporting through the National Incident-Based Reporting System. So we basically have early data that feeds into that important federal data collection. It's at the incident level, but for our analysis, we aggregated it up to the precinct-by-month level. So in each of the 36 precincts over a 12-month period, six months before the pilot and six months while the pilot was active, we observed the level of STAR-relevant criminal offenses as well as the number of more serious offenses that weren't the, you know, focal point of the STAR program.
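[Editor's note: a sketch of what that aggregation step might look like, using a toy incident-level table. The records, column names, and offense codes here are placeholders rather than Denver's actual open-data schema.]

```python
import pandas as pd

# A few toy incident records standing in for Denver's open offense data.
incidents = pd.DataFrame({
    "precinct": [1, 1, 14],
    "occurred_at": pd.to_datetime(["2020-06-02", "2020-06-20", "2020-07-03"]),
    "offense_code": ["trespassing", "aggravated-assault", "drug-possession"],
})

# Hypothetical subset of the rater-identified STAR-related offense codes.
STAR_RELATED = {"trespassing", "drug-possession", "disturbing-the-peace"}
incidents["star_offense"] = incidents["offense_code"].isin(STAR_RELATED)

# Roll incidents up to a precinct-by-month panel like the one described above.
panel = (
    incidents.assign(month=incidents["occurred_at"].dt.to_period("M"))
    .groupby(["precinct", "month"], as_index=False)
    .agg(star_offenses=("star_offense", "sum"),
         all_offenses=("star_offense", "size"))
)
print(panel)
```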
Jen [00:25:04] And so which outcomes are you most interested in here?
Tom [00:25:07] Well, you know, our immediate outcome is the STAR-focused offenses, because we wanted to examine whether the presence of the program led to fewer recorded criminal offenses, again, through those two broad mechanisms of directing individuals in crisis to health care rather than to jail, and through broader reductions in crime.
Jen [00:25:32] And just to clarify, this could have gone the other way. Right? I'm not sure we've said that out loud yet. It's not obvious this would be beneficial. There are people out there who think, you know, if you reduce consequences for these low-level offenses, it's just going to embolden people and they'll do more of whatever they were doing. So you might have seen crime go up or down.
Tom [00:25:53] That's right. And you might think the CIT training that the police have might make STAR irrelevant, right, if police are already being really careful about identifying behavioral health crises and directing people to appropriate care. And second, as you mentioned, you have the concern that there might be escalation, which is one of the reasons we included those more serious crimes as a secondary outcome in our research.
Jen [00:26:21] Great. Okay. Well, let's talk about these results. What do you find is the effect of STAR on STAR related offenses?
Tom [00:26:28] So our main finding is that we you know, we observed that as STAR comes online, those lower level offenses that STAR focused on fell by 34% during the pilot period. And so to give you some sense of magnitudes here in these downtown precincts where STAR was implemented prior to STAR coming online, they averaged around 84 or so of those offenses per month. So we estimate based on a 34% reduction in that over eight precincts over each of six months, that star prevented roughly 1400 criminal offenses. And part of the reason we like to calculate that is just give people a sense of the magnitude, but also to check its face validity. And we did see that this kind of program induced reduction at a lower level criminal offenses is broadly consistent with the scale of STAR operations.
Tom [00:27:29] The STAR team over the six month period responded to 748 calls and at baseline in these precincts you would see about 1.4 offenses recorded for a STAR related type of incident. So that suggests we should expect about a thousand or so fewer offenses with 748 calls and 1.4 offenses per call and we see something a little more than that, which is consistent with those spillover benefits I was describing. And secondarily, we see those star related offenses also go down during hours when the program is actually not in operation. So those two findings suggest to us these kind of spillover benefits and the reduction in re-offending that might otherwise occur for people in behavioral health crises.
Jen [00:28:23] Yeah, because you can imagine that the thousand or so offenses that you would expect to avoid by sending these health care workers instead of police is just a mechanical effect: the police aren't there, so they can't report the incident as a crime. And so it's those spillovers that, it sounds like, are the best evidence that this is a real improvement in public safety. Is that how you're interpreting it?
Tom [00:28:46] I think that's exactly right. And unfortunately, in Denver's publicly available data architecture, we couldn't identify some other measure that might get us around that, say, direct calls to 911. But we're undertaking replication studies in other communities where those data are available. So I think there's room for this evidence base to grow to better understand those mechanisms.
Jen [00:29:12] Yeah. Excellent. Okay. And then what did you find was the effect of STAR on offenses that were not targeted by STAR?
Tom [00:29:22] Well, basically, across all types of analyzes, we find no statistically significant effect of STAR on more serious criminal offenses.
Tom [00:29:31] And I think there are two important and different ways to view this. And one is a concern you might have in looking at a program like STAR, we've heard people articulate this. You can't send social workers out to deal with crime in the absence of an armed police officer is going to lead to potentially violent escalations without someone there that can control a situation emphatically. We may have these sorts of unintended consequences. So the fact that we didn't see evidence of that is is really encouraging. It suggests the STAR team was really effective at de-escalating the calls they were sent out on. So so that's encouraging.
Tom [00:30:13] A second way to view this finding is as a type of affirmation of our results. So if we had seen, for example, a really dramatic decline in more serious offenses when STAR comes online as critical reviewers of this type of social science evidence, we might pause and say, is this design really reliable? If we're seeing big effects on outcomes that aren't real, you know, positive effects in terms of reducing that, maybe there's something else going on that just happens to correlate with the program. For example, maybe there's some seasonality in downtown precincts where in the second half of the year, all crimes decline. And we might be falsely attributing the impact of STAR to the impact of some other unobserved things that are changing uniquely in those downtown precincts when the program comes online. Well, the fact that we didn't see evidence of that is encouraging. It's a type of placebo results that gives us more confidence in the validity of our main finding.
Jen [00:31:18] Yeah, I agree. I really like that result as a placebo check of sorts. Okay. And so those are your main results.
Jen [00:31:26] But you also ran a whole bunch of additional tests and checks to kick the tires on these main results and to learn more about how this program is working. So maybe pick one or two of your favorite checks and talk about what they tell us.
Tom [00:31:38] Yeah, sure. With your forbearance, I'll name three. So one standard one is what we call an event study approach. And that's simply involves looking at the trends within the treated and the comparison districts prior to the program becoming active. And if our research design is reliable, we should see their trends in crime are quite similar before the program comes online. If instead the downtown precincts had been trending higher or lower relative to other precincts on these crimes, it might raise concerns with us. There's some some other time trends unique to these different communities that you might confound with the program. We don't see that the trends look parallel, so that's encouraging, but there are two other checks that I really want to stress that I think are really quite important.
Tom [00:32:32] One involves preregistration, so there's a broader concern in social science research right now, but many of the findings we report might be specious ones that reflect what we call p-hacking probability hacking. If you look at enough outcomes with enough specifications, something might appear to be statistically significant when in fact it's just a false finding, a false positive. And so one of the most powerful ways experimental researchers have adopted to address that concern is basically lashing themselves to the mast before they do their analysis by pre-registering in a very public way, here's what we're studying, here are the outcomes we're going to examine, here's how we're going to examine them. And so I think that's an incredibly important development in the field of social science research.
Tom [00:33:26] And I think the concern about p-hacking is maybe even more of a concern in the kind of quasi experimental research we're doing here, where we don't have a designed experiment and we're trying to set up and circumstances to mimic that experimental assignment. Well, that means you have a lot more researcher discretion, and p-hacking might even be more of a concern. The problem is it's hard to pre-register a quasi experimental design because going in you don't necessarily know exactly what data will be available or what will be the right research design that matches the circumstances, but we were comfortable, Dr. Pine and I, in adopting preregistration here because it was pretty clear exante before going in that this would be a kind of difference in difference differences design I described change in the treatment precincts relative to change in the control precincts. And we thought it would be really important in terms of the credibility of this finding to preregister that design before we did the analysis, before we looked at the outcome data.
Tom [00:34:34] So I think it's important for people to know that as a wonky aside for researchers like us, I would love to see the field in the near future take up the question of how can we adapt preregistration types of practices for quasi experimental designs? We can't do what experimentalists do. I think it's too constrictive for what the kind of activities that go on with quasi experimental research., but I think it would be really important to build the scientific validity of what we're doing in studies like this. So I was glad I could do it here. I'd love for the profession to develop ways to do that more broadly.
Tom [00:35:15] Okay. And one other robustness check. I mentioned the concern that, look, we're seeing STAR focused crimes fall in treatment precincts when STAR came online, but again, maybe we're just picking up something else related to seasonality of crime in these downtown precincts, particularly lower level quality of life crimes. Maybe there's that kind of false confounding going on. One of the other placebo checks we did that was really important, I think, was to go back to data from 2019, before STAR existed 2018 and 2017. And in each of those three calendar years, we took the corresponding data, implemented the same design and found null effects on these lower level star level crimes. So that with all the other evidence, of course, really increased our confidence that we're picking up the true causal link back to the program.
Jen [00:36:12] I really like that check too.
Tom [00:36:13] Yeah.
Jen [00:36:14] One quick thought on the preregistration piece: I thought it was super interesting that you guys did this. I think I am much more skeptical about the likelihood that we're going to figure out a way to do this consistently across quasi-experimental studies, for all the reasons you mentioned: the challenges of, what exactly are the data you have? And there's so much learning along the way in doing studies like this.
Tom [00:36:36] Yeah. Yeah, it may be fool's gold on my part, but I would at least like for us to have a discussion about whether we could lay out the garden of forking paths, lay out a type of decision tree that we intend to follow in a quasi-experimental design, so that at least the hands are above the table and people could go back and get a little bit of a window into the kind of researcher discretion that led to the findings. Again, it may not happen, but I would love for us to have that conversation, because if we could figure out a way that meets the kind of very sensible objections that you and I both have, it would really elevate the scientific validity of what we do.
Jen [00:37:26] Yeah, I completely agree. It's worth thinking about. I think the best argument I've heard for doing this, especially in the policy space, is that it's especially helpful to tie policymakers' hands about which outcomes they care about. Because you hear a lot about, you know, some evaluation that's sort of internal or with the research team, and there are no stars on the outcome that everyone says they care about at first, but oh, there are some stars over here, and it turns out that's actually the outcome we were trying to change. And so just getting everyone to agree upfront about the goal of the program they're evaluating can help everyone be a little bit more intellectually honest about whether we're meeting the goals we set out for ourselves.
Tom [00:38:08] I think that's hugely important, and in the world of researcher-practitioner partnerships, I think that falls under the heading of capacity building: building that kind of sensibility among practitioners. What are you really focused on? Let's take seriously the idea of a kind of inquiry cycle: assessing whether it had those impacts, asking were we surprised in some ways, and just having intentionality around using data and evidence to guide policy.
Jen [00:38:37] Yeah. Okay, so this was all your paper. Have any other papers related to this topic come out since you all first started working on this study?
Tom [00:38:47] We haven't seen any. I mean, this paper is still quite young, though, so there might be some imminent. You mentioned some of your students' work.
Jen [00:38:57] I think there are a lot of students working on this these days.
Tom [00:39:00] Oh, good, and we are, too. Fortunately, there's a lot of policy innovation. And one of the things I want to do is build a kind of way of synthesizing that evidence, both on who's adopting these policies as well as on what impact they have. So, to answer your question directly: I'm not aware of other evidence, but I know it's likely to be coming.
Jen [00:39:24] Yeah.
Tom [00:39:24] I really want to create a kind of infrastructure that organizes that evidence and connects it to policymakers and practitioners, because this is such a common sense and compelling kind of programmatic innovation.
Jen [00:39:38] Mm hmm. And so at this point, what are the policy implications of your results, combined with what we knew beforehand? What should policymakers and practitioners take away from this?
Tom [00:39:49] Yeah, I guess I want to strike a tone of optimism, but cautious optimism. I think the main implication is that first-responder innovations like community response and co-response have considerable promise. And one of the things to note about that promise is it can benefit from a kind of political support whose breadth is uncommon in these divisive times. To put it bluntly, I sometimes tell people: whether your politics are back-the-blue or defund-the-police, there's something for you to like about these types of programs. Right? Back-the-blue folks can point to the fact that police often like these kinds of programs and express dissatisfaction with having to be social workers and health care specialists in the field. And for those who are concerned about the operating footprint of the police, these kinds of programs provide a way to shrink that in sensible ways, and maybe create a case for shrinking police budgets over the longer term.
Tom [00:40:58] So anyway, there's there's promise in these innovations. Pilots like this suggests they're impactful. There could be political support, but again, I would also stress and caution. We often see well-implemented policies in a pilot context, having, you know, encouraging effects, but then years later, feeling as if that was lightning in a bottle that we couldn't replicate or scale that success. And so as other communities begin to take this up, I would really stress the importance of high quality implementation, fidelity, having coordination between police, mental health providers, dispatchers, paramedics, so that everyone's singing from the same page in the same hymnal as they go about this work.
Tom [00:41:47] And secondarily, having what we sometimes call an improvement science mindset where we're collecting data and evaluating impact and using it not just summative but formidably to guide refinements and improvement in those programs because again, I'm extraordinarily encouraged by what we saw in the Denver pilot, but its success elsewhere, it's by no means a foregone conclusion.
Jen [00:42:12] Yeah, I'll add that in addition to having all of those different practitioners working together in sync, it's helpful to have an academic come in and talk with everybody to make sure there's going to be an actual way to figure out if this worked. But short of that, the model you all have in this paper, I think, will be a good model for other districts. You know, at the very least, being able to pilot things in some areas of the city and not others gives people like us something to work with when we're thinking about whether a program had an effect down the road. Okay. So in addition to continuing to test whether this type of program works elsewhere, what else is on the research frontier in this space? What are the other big questions in this area that you and others are going to be thinking about?
Tom [00:42:54] Yeah, I think an important question is, is beginning to end package as we study these programs, what design features work best for specific contexts? And that's across co-response, community response types of models.
Tom [00:43:10] There are a number of features of communities, for example, that might mediate what they can provide and what's likely to be most impactful. It's partly going to be a function of, well, what kinds of behavioral health issues are extant in their communities, but also just the physical structure. So for example, I was talking with a reporter in South Dakota recently who was saying, how could this work in a rural community? And we had a really interesting discussion about that. And one of because you could imagine in a place where it's sparsely populated, covering a lot of physical area, having a full time team of community responders might not be feasible, might not be cost effective, or even having a co-response person who could get anywhere across a far flung space in a timely manner might not be reasonable.
Tom [00:44:04] And so I had heard that some rural communities were experimenting with having a kind of FaceTime availability of a co-responder where the responding police officer would have access to that person, could speak with them, maybe have that person speak with the person in crisis to make a decision about do we need to put a hold on this person or could we direct them to some other service that might meet their needs effectively? What's the best response? So I thought that was an interesting use of technology in a rural setting, and I'm sure there are countless other design features relevant to different community contexts, and I think we should foreground that the beginning of building this evidence base, the importance of that heterogeneity, rather than think of these initiatives as something you just take off a shelf and plug it always and everywhere with a one size fits all mentality.
Jen [00:44:58] Yeah, I'm also thinking of the folks who will be skeptical of these findings, and it's going to be the folks on the tough-on-crime side who are very worried that, you know, if there are no consequences for low-level offending, that offending is just going to escalate. I think the next piece that I'll be really curious to see is what happens to recidivism: being able to follow a particular person when they have a STAR responder versus a police officer respond to their incident. What happens to that person going forward? Do they wind up being rearrested more or less often in the future, or something like that? Being able to track people over time.
Tom [00:45:39] Yeah, I agree. That'll be important. And I also think as these movements go forward, really politically savvy people will seek to build them as coalitions too, because there is a shared purpose around the relevant outcomes. And keeping the focus point on that I think can, you know, create a big tent that will allow us to explore and realize the full possibilities of these types of programs.
Jen [00:46:06] Yeah. Lots of common ground to seize here.
Jen [00:46:09] My guest today has been Tom Dee from Stanford University. Tom, thank you so much for talking with me.
Tom [00:46:14] Thank you for having me. I really enjoyed it.
Jen [00:46:21] You can find links to all the research we discussed today on our website probablecausation.com. You can also subscribe to the show there or wherever you get your podcasts to make sure you don't miss a single episode. Big thanks to Emergent Ventures for supporting the show and thanks also to our Patreon subscribers and other contributors. Probable Causation is produced by Doleac Initiatives, a 501(c)3 nonprofit, so all contributions are tax deductible.
Jen [00:46:45] If you enjoy the podcast, please consider supporting us via Patreon or with a one-time donation on our website. Please also consider leaving us a rating and review on Apple Podcasts. This helps others find the show, which we very much appreciate. Our sound engineer is Jon Keur with production assistance from Nefertari Elshiekh. Our music is by Werner, and our logo was designed by Carrie Throckmorton. Thanks for listening, and I'll talk to you in two weeks.