Data & Methodology

Data & Limitations
We obtained the raw crime statistics for Maryland in CSV format from dataMontgomery, and we transcribed the climate data from BestPlaces into a CSV file.

We were unable to obtain real-time weather data through an OpenWeatherMap API request, because historical data beyond a 5-day period requires a subscription. Instead, we used the climate data to get the average temperature for each month. As a result, we limited our analysis to the crime data for a single city. We selected Silver Spring, the city with the highest number of crimes.

Data Clean-Up
To get a table of clean data for our analysis, we completed the following steps:

  • imported the csv data;
  • extracted the data columns of interest;
  • removed entries not considered crime data;
  • converted the date and time columns to a date/time format;
  • extracted only data from 'Silver Spring';
  • located the first and last date of the data;
  • removed data from years that are not complete.
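
The steps above can be sketched with the Python standard library as follows. This is a minimal illustration, not the code used in the project: the column names (Start_Date_Time, City, Crime Name1), the date format, the upper-case city spelling, and the rule that a year is complete only when the data run from 1 January to 31 December are all assumptions, and the sample rows are invented for the demonstration.

```python
import csv
import io
from datetime import datetime

# Assumed column names and date format; adjust them to the real export.
DATE_COL, CITY_COL, CATEGORY_COL = "Start_Date_Time", "City", "Crime Name1"
DATE_FMT = "%m/%d/%Y %I:%M:%S %p"
CRIME_CATEGORIES = {
    "Crime Against Property",
    "Crime Against Person",
    "Crime Against Society",
}


def clean_crime_rows(csv_file, city="SILVER SPRING"):
    """Return (datetime, category) pairs for one city, limited to full years."""
    rows = []
    for record in csv.DictReader(csv_file):
        # Keep only the columns of interest.
        category = record[CATEGORY_COL].strip()
        if category not in CRIME_CATEGORIES:
            continue  # entry is not considered crime data
        if record[CITY_COL].strip().upper() != city.upper():
            continue  # entry belongs to another city
        when = datetime.strptime(record[DATE_COL].strip(), DATE_FMT)
        rows.append((when, category))
    if not rows:
        return []
    # Locate the first and last date of the remaining data.
    first = min(when for when, _ in rows)
    last = max(when for when, _ in rows)
    # Assumed rule: a year is complete only if the data reach its first/last day.
    first_full = first.year if (first.month, first.day) == (1, 1) else first.year + 1
    last_full = last.year if (last.month, last.day) == (12, 31) else last.year - 1
    return [(w, c) for w, c in rows if first_full <= w.year <= last_full]


# Invented sample rows, for demonstration only.
SAMPLE = """Start_Date_Time,City,Crime Name1
07/15/2016 10:00:00 PM,SILVER SPRING,Crime Against Property
01/01/2017 12:30:00 AM,SILVER SPRING,Crime Against Person
06/10/2017 03:15:00 PM,ROCKVILLE,Crime Against Property
08/20/2017 09:45:00 AM,SILVER SPRING,Not a Crime
12/31/2017 11:59:00 PM,SILVER SPRING,Crime Against Society
03/02/2018 08:00:00 AM,SILVER SPRING,Crime Against Property
"""

cleaned = clean_crime_rows(io.StringIO(SAMPLE))
for when, category in cleaned:
    print(when.date(), category)
```

With the sample rows, only the two Silver Spring entries from 2017 remain, because 2016 and 2018 are only partially covered.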

After reviewing our cleaned data, we decided to focus our analysis on 2017-2019 and on the following crime categories:

  • Crime Against Property
  • Crime Against Person
  • Crime Against Society

Data Analysis
For each of the crime categories, we visualized the number of crimes for each month of the year. We averaged the number of crimes in each month before bringing in the climate data to perform linear regression. Then we binned our data into 4 temperature categories and performed ANOVA hypothesis tests to determine whether the number of crimes differs significantly across temperature categories.
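
As a rough sketch of this analysis for one crime category, the snippet below averages monthly counts across years, fits a least-squares line against average monthly temperature, and computes a one-way ANOVA F statistic across 4 temperature bins. Everything concrete in it is an assumption: the temperatures and counts are placeholder numbers rather than the study's data, the bin edges (45, 60, and 75 °F) are arbitrary, and the helper names fit_line, temperature_bin, and one_way_anova_f are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Placeholder values for illustration only, NOT the study's data.
AVG_TEMP_F = {1: 35, 2: 38, 3: 47, 4: 57, 5: 66, 6: 75,
              7: 80, 8: 78, 9: 71, 10: 59, 11: 49, 12: 39}
MONTHLY_COUNTS = {  # month -> counts for 2017, 2018, 2019
    1: [410, 395, 402], 2: [380, 372, 391], 3: [405, 420, 398],
    4: [415, 409, 430], 5: [440, 428, 435], 6: [450, 442, 461],
    7: [470, 455, 448], 8: [465, 459, 452], 9: [438, 446, 429],
    10: [425, 431, 418], 11: [400, 412, 407], 12: [398, 388, 405],
}


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def temperature_bin(temp_f):
    """Assign one of 4 bins; the edges are assumed, not taken from the study."""
    if temp_f < 45:
        return "cold"
    if temp_f < 60:
        return "cool"
    if temp_f < 75:
        return "warm"
    return "hot"


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within), df_between, df_within


months = sorted(AVG_TEMP_F)
temps = [AVG_TEMP_F[m] for m in months]
avg_counts = [mean(MONTHLY_COUNTS[m]) for m in months]  # average across years

slope, intercept = fit_line(temps, avg_counts)

binned = defaultdict(list)
for temp, count in zip(temps, avg_counts):
    binned[temperature_bin(temp)].append(count)
f_stat, df_between, df_within = one_way_anova_f(list(binned.values()))

print(f"slope={slope:.3f}, intercept={intercept:.1f}")
print(f"F({df_between}, {df_within})={f_stat:.2f}")
```

The F statistic still has to be compared against the F distribution with the printed degrees of freedom to obtain a p-value. If SciPy is available, scipy.stats.f_oneway returns both the statistic and the p-value directly.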