Bayes Hack 2016 |
Department of Commerce Brief.

How can data protect consumers from financial foul play?

More than two years after the Consumer Financial Protection Bureau (CFPB) began collecting complaint data, its consumer complaint database is now a public repository of over 100,000 consumer complaints. It's a rich resource for CFPB analysts and financial institutions searching for emerging trends in complaints about financial services products, including complaint justifications and responses or resolutions.[1]

The CFPB forwards complaints to the appropriate company for a response, analyzes complaint data to provide regulatory oversight, enforces federal consumer financial laws, and writes better rules and regulations.[2] Now the CFPB is exploring how this data can be transformed into smart tools that optimize internal processes, improve the handling of consumer complaints, or focus scrutiny on shady practices.

How can data from satellites mold the geographies of economic activity?

NOAA collects detailed images of the Earth around the clock. The Suomi NPP Satellite Mission in particular collects imagery of the Earth along a polar orbit. The data collected by this mission has been of particular significance for earth scientists, but it can be used for far more than atmospheric measures. The Visible Infrared Imaging Radiometer Suite (VIIRS) collects data both during the day and at night, enabling an unprecedented set of capabilities to detect adverse events, energy production, and economic activity. VIIRS data also holds the key to monitoring economic and demographic activity at higher spatial and temporal resolution.

The Department of Commerce wants to use VIIRS data to create higher resolution estimates of lagging indicators, whether geographically or temporally. What potential new applications and mashups can be generated from the VIIRS data? As it turns out, the federal government publishes a wealth of data at the county level, ranging from labor data to energy data to population, personal incomes, and trade metrics. Satellite imagery can be processed into a form usable at the county level so that it is mashable with commonly available data, and then turned into insight about geographic trends at a far more granular level than conventional census-taking or other collection methods allow.
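
As a rough illustration of the county-level mashup described above, the sketch below averages per-pixel radiance within each county and joins the result to a county indicator table on the FIPS code. It assumes each pixel has already been assigned to a county (the spatial join is not shown). The function names, the field names, and the sample values are all made up for illustration and do not come from any VIIRS product.

```python
from collections import defaultdict
from statistics import mean


def mean_radiance_by_county(pixels):
    """Average radiance per county.

    `pixels` is an iterable of (county_fips, radiance) pairs, i.e. pixels
    that were already assigned to a county in an earlier step (assumption).
    """
    by_county = defaultdict(list)
    for fips, radiance in pixels:
        by_county[fips].append(radiance)
    return {fips: mean(values) for fips, values in by_county.items()}


def join_on_fips(radiance, indicators):
    """Inner-join the radiance summary with a county-level indicator table."""
    return {
        fips: {"mean_radiance": radiance[fips], **indicators[fips]}
        for fips in radiance
        if fips in indicators
    }


# Made-up sample values, for illustration only.
pixels = [("42003", 10.0), ("42003", 14.0), ("04013", 30.0)]
indicators = {
    "42003": {"population": 1200000},
    "04013": {"population": 4000000},
}
merged = join_on_fips(mean_radiance_by_county(pixels), indicators)
```

Keying everything on the county FIPS code is one plausible design choice here, since that code is the usual join key across federal county-level datasets.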

How can data measure marine biodiversity to update conservation efforts in real time?

Part of the mission of the Marine Mammal Laboratory (MML), a department of AFSC/NOAA Fisheries, is to document the distribution and relative abundance of bowhead and other whales, including in areas under consideration for oil and natural gas exploration, development, and production. Through this biodiversity monitoring, the MML can help balance conservation against industry and government hydrocarbon exploration.

With this goal in mind, the MML recently completed aerial surveys of Arctic whales using high-resolution digital camera systems. Unfortunately, the scale of the dataset presents tremendous difficulty for an agency that needs to provide policy guidance on time-sensitive regulatory decisions.

That scale also makes the whales dataset an ambitious hackathon project, but a filtering workflow that eliminates unusable images (characterized by cloud cover, glare, or choppy ocean conditions) would dramatically decrease the problem size for mammal-counting via computer vision.
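
One very simple form such a filter could take is sketched below. It is not the MML's workflow. It treats each image as a flat list of grayscale values (0 to 255) and rejects it if too many pixels are near-white (a crude proxy for cloud cover or glare) or if overall contrast is very high (a crude proxy for choppy water). Both thresholds and all names are hypothetical and would need tuning against labeled survey images.

```python
from statistics import pstdev

# Hypothetical thresholds; real values would need tuning on labeled images.
MAX_BRIGHT_FRACTION = 0.30  # too many near-white pixels suggests cloud or glare
MAX_TEXTURE_STDDEV = 60.0   # very high contrast suggests choppy water


def is_usable(gray_pixels, bright_level=230):
    """Return True if a grayscale image (flat list of 0-255 values) passes both checks."""
    bright = sum(1 for p in gray_pixels if p >= bright_level)
    if bright / len(gray_pixels) > MAX_BRIGHT_FRACTION:
        return False
    return pstdev(gray_pixels) <= MAX_TEXTURE_STDDEV


def filter_images(images):
    """Keep only the names of images that pass the usability checks."""
    return [name for name, pixels in images.items() if is_usable(pixels)]


# Tiny synthetic "images", for illustration only.
calm = [60, 62, 61, 59, 63, 60, 58, 61]
glare = [250, 255, 240, 60, 245, 250, 61, 255]
choppy = [0, 200, 0, 200, 0, 200, 0, 200]
usable = filter_images({"calm.jpg": calm, "glare.jpg": glare, "choppy.jpg": choppy})
```

A cheap pass like this would only shrink the candidate set. Images that survive it would still go on to the actual whale-detection step.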


Consumer Financial Protection Bureau:

  • The Bayes Impact starter kit, an exploration of this prompt's key datasets.
  • CFPB Consumer Complaint Database
  • Yelp business reviews offer a unique opportunity to incorporate less formal consumer complaints when prioritizing and evaluating complaints filed with the CFPB. The Yelp API is broadly useful here, though rate-limited; the Yelp Dataset Challenge offers cleaned data in bulk for select cities (including Pittsburgh, Charlotte, Urbana-Champaign, Phoenix, Las Vegas, and Madison).
  • The Commerce Data Service offers a tutorial for merging the American Community Survey's public datasets with the Consumer Complaint Database to highlight areas of specific need within particular communities: "Extending the ACS: Data-driven Outreach."
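
To give a feel for the kind of merge the last resource describes, here is a minimal sketch (not the tutorial's own code) that counts complaints per ZIP code and normalizes by population to get complaints per 1,000 residents. The field name `zip_code`, the function name, and every number are assumptions made up for this example.

```python
from collections import Counter


def complaints_per_1000(complaints, population_by_zip):
    """Complaint rate per 1,000 residents for each ZIP code present in both inputs."""
    counts = Counter(row["zip_code"] for row in complaints)
    return {
        zip_code: 1000 * counts[zip_code] / population
        for zip_code, population in population_by_zip.items()
        if population > 0 and zip_code in counts
    }


# Made-up records and populations, for illustration only.
complaints = [
    {"zip_code": "15213", "product": "Mortgage"},
    {"zip_code": "15213", "product": "Credit card"},
    {"zip_code": "89109", "product": "Debt collection"},
]
population_by_zip = {"15213": 4000, "89109": 500, "53703": 2500}
rates = complaints_per_1000(complaints, population_by_zip)
```

Normalizing by population keeps densely populated areas from dominating the ranking simply because more people live there.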

VIIRS satellite data:

MML whales image data:

  • Did we mention this dataset is very big? @mention an organizer on Slack for access.

  1. Kiefer et al., "CFPB's consumer complaint database: analysis reveals valuable insights," Deloitte LLP.

  2. Consumer Financial Protection Bureau, "How we use complaint data."