I’m Josh - an Applied Data Science Lead at Civis Analytics, a native Virginian, and a Texas transplant. I use quantitative analysis techniques to help progressive candidates, campaigns, and causes make the most of the information available to them and make the best resource allocation decisions possible on tight budgets and timelines. My work at BlueLabs falls into a few buckets:
Electing Great Candidates
In 2020, while at BlueLabs, I worked with Mike Espy’s Senate campaign to develop a plan targeting the right voters through the right media to try to flip a tough, conservative Deep South state for a historic Black Democratic candidate. While we came up short, Espy ran 3.5 points ahead of the top of the Democratic ticket, a bigger overperformance than in all but one other Senate race in 2020.
After the election, I conducted several detailed post-election analyses, including analyses of Latino voters’ support and enthusiasm nationally, local outcomes in the largest metro area in Texas, and election outcomes in key Congressional districts in Texas and Florida, all with an eye towards shaping winning electoral strategies for 2022 and beyond.
Recent research projects at Civis include forecasting state legislative outcomes in Virginia and New Jersey, polling attitudes about SB8 in Texas, and post-election analyses of Latino voters’ Democratic support and enthusiasm nationally, local election outcomes in the largest metro area in Texas, and election outcomes in key Congressional districts in Texas and Florida, all with an eye towards shaping winning electoral strategies for 2022 and beyond.
Committees and Independent Expenditures
My first project at BlueLabs was to work as an embedded consultant in the DSCC Analytics team. In that capacity I helped coordinate the delivery and implementation of analytics assets including individual models and polls, administered access to key tools like the Civis Platform and Votebuilder, and liaised with state data directors to help them make the most of the committee’s resources.
Between federal cycles, I worked with one of the largest independent expenditures investing in Virginia state house races in 2019. My team identified the most competitive races, where outside resources could have the most impact. This project relied heavily on quantitative research, but it also required a strong contextual understanding of the political environment, because local races often featured sparse data.
Voter Registration Targeting
Unregistered voters are, by definition, very difficult to find using the registered voter lists our industry typically employs to target political outreach. I led the development of a product at BlueLabs called Tallia that helps organizations target registration programs based on the geographic distribution of likely unregistered voters. To produce these estimates, my team compared US Census Bureau data with data from the voter file to identify which Census blocks, ZIP codes, counties, and districts likely had large numbers of eligible voters who were not on the voter file. As part of this project I also wrote a detailed memo on existing research into the tactics and effects of voter registration activities, which was distributed to a number of organizations.
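To make the Census-versus-voter-file comparison concrete, here is a minimal Python sketch. It is not Tallia's actual method: it simply assumes that, for each geography, you have a Census-based count of eligible adults and a count of voter-file records, and it treats the gap between them as a rough estimate of unregistered voters. The class name, field names, and sample figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GeoCounts:
    geo_id: str           # hypothetical identifier (block, ZIP code, county, ...)
    eligible_adults: int  # assumed Census-based estimate of eligible voters
    registered: int       # assumed count of voter-file records in the geography


def estimate_unregistered(rows):
    """Rank geographies by the estimated number of unregistered eligible voters.

    Returns a list of (geo_id, estimated_unregistered, unregistered_rate),
    sorted with the largest estimated gap first.
    """
    results = []
    for row in rows:
        # Voter files can exceed Census estimates (stale records, sampling
        # error), so the gap is floored at zero rather than going negative.
        gap = max(row.eligible_adults - row.registered, 0)
        rate = gap / row.eligible_adults if row.eligible_adults else 0.0
        results.append((row.geo_id, gap, rate))
    return sorted(results, key=lambda r: r[1], reverse=True)


# Made-up numbers, for illustration only.
sample = [
    GeoCounts("county_a", eligible_adults=1000, registered=800),
    GeoCounts("county_b", eligible_adults=500, registered=520),
    GeoCounts("county_c", eligible_adults=2000, registered=1200),
]

ranked = estimate_unregistered(sample)
for geo_id, gap, rate in ranked:
    print(f"{geo_id}: ~{gap} likely unregistered ({rate:.0%} of eligible adults)")
```

A real estimate would also need to account for Census margins of error, non-citizen populations, and outdated voter-file records, which this sketch ignores.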
As a twist on this work I also helped the National Low Income Housing Coalition (NLIHC) develop and deploy individual-level data assets to their network of property managers to register voters at participating housing complexes. This project sought to directly empower residents in subsidized housing by ensuring they were able and ready to participate in the 2020 elections. My contribution was to help design data intake templates, build a data pipeline to support over 25 participating organizations, and deliver custom lists of currently registered voters to the client in a format that they could cross-reference with residency lists for targeted outreach.
Message Testing and Targeting
In the run-up to 2020 many organizations were trying to find the most effective ways to talk to disaffected Trump supporters and other segments of potential swing voters. My team conducted a messaging experiment for one of these organizations focused on winning back Trump Defectors in WI, MI, and PA. These were voters who were likely to have voted for Trump but also likely to have voted for a Democrat in 2018 - a hotly contested voting bloc in 2020. Our experiment tested four video messages in an online survey context and identified segments of voters who a) responded more to one message than another and b) responded to different types of messengers. We identified the right messaging for our target voters and the right messengers.
Once we find the right message and messenger for a candidate, I’m often tasked with targeting who we talk to. I have targeted outreach at the individual level - based on cookies or mailing addresses - and through mass media like zip-targeted digital ads, broadcast TV ads, and even billboards! The goal is to understand the size of your target audience, how much of a given region is made up of target voters, and which voters might backlash - or support your candidate less after hearing their message - and blend those three factors into one final decision.
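One way to picture blending those three factors is a simple regional score. The Python sketch below is purely illustrative and is not the scoring used on any campaign: it assumes each region comes with a count of target voters, a count of all voters reached, and a count of likely backlash voters, and it multiplies the net audience (targets minus penalized backlash) by the share of the region made up of targets. The function name, the `backlash_penalty` parameter, and the sample figures are hypothetical.

```python
def region_score(target_voters, total_voters, backlash_voters, backlash_penalty=1.0):
    """Blend audience size, target concentration, and backlash risk into one number.

    net audience  = target voters minus (penalty x backlash voters)
    concentration = share of the region's voters who are targets
    score         = net audience x concentration
    """
    if total_voters <= 0:
        return 0.0
    net_audience = target_voters - backlash_penalty * backlash_voters
    concentration = target_voters / total_voters
    return net_audience * concentration


# Made-up numbers: (target voters, total voters, likely backlash voters).
regions = {
    "region_a": (40_000, 100_000, 5_000),
    "region_b": (30_000, 50_000, 2_000),
    "region_c": (10_000, 80_000, 9_000),
}

scores = {name: region_score(*counts) for name, counts in regions.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

for name in ranking:
    print(f"{name}: score {scores[name]:,.0f}")
```

In this toy example the smaller but denser region_b outranks the larger region_a, which is the kind of trade-off a mass-media buy has to weigh. How heavily to penalize backlash is a judgment call, which is why it is left as a parameter.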
Throughout 2020 I also had the opportunity to apply my targeting skillset to help a global foundation shift food aid resources to the regions in the US where the COVID-19-driven economic collapse had created the greatest need. Our team overcame several challenges to provide the best estimates of unemployment rates, food insecurity, childhood food insecurity, and healthy food availability at the county level nationwide. We did this by combining a variety of data sources from several federal government agencies and non-profits like Feeding America and analyzing what makes food-insecure parts of the country different from those with better access.
That summer, as the COVID recession diminished consumers’ purchasing power, much of the media’s focus on food insecurity was on previously well-off suburbs that suddenly had long lines at the local food pantry. Our research, however, suggested that food insecurity in the pandemic era was still worst in the places it had always been bad - the Deep South’s Black Belt, the Rio Grande Valley and other heavily Hispanic regions of the West, Indian Country in the West and Northern Plains, and Appalachia. The pandemic didn’t change who was food insecure so much as it left everyone worse off - and those who could least afford the hit bore the brunt of it. Our work helped this organization maintain focus on the hardest-hit parts of the country at a moment when those people needed help the most.
Why I Wage Hope: A Son’s Story
Our paths are all unique, but we are united in a common understanding that my mother professed throughout her battle: As long as you are alive, there is hope. (more)
Slate’s The Works
Prudie Fans Don’t Read Politics, and Other Things We Learned by Analyzing the Habits of Slate Readers
Slate publishes a wide variety of articles ranging from culture and sports to politics and the merits of goat spamming every day. We’ve always been able to sort the more widely read articles from the less popular fare, but we’ve never been able to tell whether the readers who come to us for judicial coverage are also interested in breaking news via Slatest or getting advice columns like Dear Prudence. So we wanted to find out who Slate’s largest and most distinct groups of readers are. (more)
General Assembly Blog
A Beginner’s Guide to Ridgeline Plots
Decision-makers need to understand [sampling] error to make the most of survey results, so it’s important for data scientists and analysts to communicate confidence intervals when visualizing estimated results. Confidence intervals are the range of values you could reasonably expect to see in your target population based on the results measured in your sample. (more)
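As a small worked example of the confidence-interval idea in that excerpt, the Python sketch below computes an approximate 95% interval for a survey proportion using the standard normal approximation. It assumes a simple random sample with no weighting or design effect, which real polls rarely satisfy exactly. The function name and the sample numbers are hypothetical.

```python
import math


def proportion_confidence_interval(successes, n, z=1.96):
    """Normal-approximation confidence interval for a sample proportion.

    z=1.96 corresponds to a roughly 95% confidence level.
    Assumes a simple random sample (no weighting or design effect).
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    p = successes / n
    margin = z * math.sqrt(p * (1 - p) / n)
    # Clamp to the valid range for a proportion.
    return max(p - margin, 0.0), min(p + margin, 1.0)


# Made-up survey: 520 of 1,000 respondents support a candidate.
low, high = proportion_confidence_interval(520, 1000)
print(f"Estimated support: 52.0% (95% CI roughly {low:.1%} to {high:.1%})")
```

With these numbers the interval spans roughly 49% to 55%, so a 52% topline does not by itself show majority support.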