Theories of rational behavior assume that actors choose actions whose expected benefits exceed their expected costs or losses. If those expected costs and benefits change over time, behavior will change accordingly as actors learn and internalize the parameters of success and failure. In the context of proactive policing, police stops that achieve any of several goals (constitutional compliance; “good” arrests or summonses; seizures of weapons, drugs, or other contraband; or good will and citizen cooperation) should signal to officers the features of a stop that increase its rewards or benefits. Having formed a subjective estimate of success (i.e., prior beliefs), officers should observe the outcomes of subsequent encounters and update their probability estimates, placing positive weight on the specific features of a stop that are associated with success. Officers should also learn the features of unproductive stops and adjust accordingly. A rational actor would pursue “good” or “productive” stops and avoid “unproductive” stops by updating their knowledge of these features through experience. We analyze data on 4.9 million Terry stops in New York City from 2004 to 2016 to estimate the extent of updating by officers in the New York Police Department. We compare a frequentist analysis of officer behavior with a Bayesian analysis in which subsequent events are weighted by the signals from prior events. By comparing productive and unproductive stops, the analysis estimates the weights or values (an experience effect) that officers assign to the signals of each type of stop outcome. We find evidence of updating using both analytic methods, although “hit rates” (our measure of stop productivity, including the recovery of firearms or arrests for criminal behavior) remain low. Updating is independent of an officer's total stop activity each month, suggesting that learning may be selective and specific to certain stop features.
However, hit rates decline as officer stop activity increases. Both updating and hit rates improved as stop rates declined following a series of internal memoranda and trial orders beginning in May 2012. There is also evidence of differential updating by officers conditional on a variety of features of prior and current stops, including suspect race and stop legality. Although our analysis is limited to NYPD stops, regimes of intensive stop-and-frisk encounters are ubiquitous across the United States, so the relevance of these findings reaches beyond New York City. These regimes reveal tensions between the Terry jurisprudence of reasonable suspicion and evidence on contemporary police practices across the country.
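
As a purely illustrative aid, the updating logic described above (a prior belief about stop success that is revised after each observed outcome) can be sketched with a simple Beta-Bernoulli model. This is not the estimation procedure used in the study: the class `HitRateBelief`, the function `update_sequence`, the uniform Beta(1, 1) prior, and the example outcomes are all hypothetical choices made for exposition.

```python
from dataclasses import dataclass


@dataclass
class HitRateBelief:
    """Beta(alpha, beta) belief about the probability that a stop is productive.

    The default Beta(1, 1) prior is an assumption for illustration only.
    """
    alpha: float = 1.0  # prior pseudo-count of productive stops
    beta: float = 1.0   # prior pseudo-count of unproductive stops

    def updated(self, productive: bool) -> "HitRateBelief":
        """Return the posterior belief after observing one stop outcome."""
        if productive:
            return HitRateBelief(self.alpha + 1.0, self.beta)
        return HitRateBelief(self.alpha, self.beta + 1.0)

    @property
    def mean(self) -> float:
        """Expected hit rate under the current belief."""
        return self.alpha / (self.alpha + self.beta)


def update_sequence(outcomes, prior=None):
    """Apply one Bayesian update per stop and record the expected hit rate."""
    belief = HitRateBelief() if prior is None else prior
    history = []
    for productive in outcomes:
        belief = belief.updated(productive)
        history.append(belief.mean)
    return history


if __name__ == "__main__":
    # Hypothetical sequence: one productive stop followed by two unproductive ones.
    for step, rate in enumerate(update_sequence([True, False, False]), start=1):
        print(f"after stop {step}: expected hit rate = {rate:.3f}")
```

In this sketch every outcome carries equal weight. A model closer to the one described above would let the weight on each prior outcome depend on the features of the stop, which is what the estimated experience effect is meant to capture.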