11 Nov 2017

Subscribe to Newsletter

NYC, like any city, has rats, and a lot of them - a study done in 2014 estimated that there are around 2 million rats who call NYC home. Most of those rats are the exotic Norwegian Rat (spoiler, they most likely did not actually come from Norway). But thanks to NYC’s Open Data portal, we can learn a lot more about our furry little cohabitants.

We can start by filtering the 311 data to see the distribution of rat sightings around the city. In total, since 2010 there have been 78k rat sightings (and 25k mouse sightings) reported by New Yorkers. Which borough is the most rodent infested? That distinction goes to Brooklyn, with a combined 34k rat and mouse sightings reported. Then comes Manhattan with 27k, the Bronx with 21k, Queens with 15k, and last is Staten Island, with only 4.5k sightings. I guess rats don’t like commuting on the Staten Island ferry.
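Here is a rough sketch of that borough tally. It assumes each 311 record has already been reduced to a dict with `descriptor` and `borough` keys; those names are my own placeholders rather than the portal's exact schema, and the sample rows are made up for illustration.

```python
from collections import Counter

# Illustrative rows only; the field names are assumptions, not the real schema.
records = [
    {"descriptor": "Rat Sighting", "borough": "BROOKLYN"},
    {"descriptor": "Mouse Sighting", "borough": "BROOKLYN"},
    {"descriptor": "Rat Sighting", "borough": "MANHATTAN"},
    {"descriptor": "Loud Music", "borough": "QUEENS"},
    {"descriptor": "Rat Sighting", "borough": "STATEN ISLAND"},
]

RODENT_DESCRIPTORS = {"Rat Sighting", "Mouse Sighting"}


def sightings_by_borough(rows):
    """Count rat and mouse sightings per borough, skipping other complaints."""
    counts = Counter()
    for row in rows:
        if row["descriptor"] in RODENT_DESCRIPTORS:
            counts[row["borough"]] += 1
    return counts


borough_counts = sightings_by_borough(records)
for borough, n in borough_counts.most_common():
    print(borough, n)
```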

Rat Sightings

Rats versus mice

There are more than 3x as many rat reports as there are mouse reports in NYC. But are they geographically distributed in the same regions? Are there “rat territories” and “mice territories” in the city? Turns out, yes - here is a plot of the difference in rat versus mouse reports by census tract. Areas in red have more rats being reported, areas in blue have more mice.

Rat Vs Mice

Rats appear to have a firm grip on most of Manhattan and Downtown Brooklyn, but as you get further out in Brooklyn and Queens there are spots where mice run the show.
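The red/blue split boils down to a per-tract difference: rat reports minus mouse reports. A minimal sketch, assuming the sightings have been tagged with a census tract id (the tract names and rows here are invented):

```python
from collections import defaultdict

# (tract_id, kind) pairs; made-up values for illustration.
sightings = [
    ("tract_A", "rat"),
    ("tract_A", "rat"),
    ("tract_A", "mouse"),
    ("tract_B", "mouse"),
    ("tract_B", "mouse"),
    ("tract_C", "rat"),
]


def rat_minus_mouse(pairs):
    """Return rat reports minus mouse reports for each tract."""
    diff = defaultdict(int)
    for tract, kind in pairs:
        if kind == "rat":
            diff[tract] += 1
        elif kind == "mouse":
            diff[tract] -= 1
    return dict(diff)


territory = rat_minus_mouse(sightings)
# Positive values would be shaded red (rats), negative values blue (mice).
print(territory)
```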

When do rats show up?

Looks like they start scampering about around 10am, and then sightings taper off throughout the rest of the day. Note, this is based on human reported sightings, so perhaps New Yorkers are just most startled/disgusted by them in the morning, but when the evening rolls around they are used to them.

Rat Time of Day

And rats tend to mingle when the weather is nicer outside. Below is the chart of the number of sightings in different months of the year - rat sightings peak in July, stay steady into October, and then drop precipitously for the winter.

Rat Month of Year
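Both charts are just counts bucketed by a piece of the report timestamp. A small sketch, assuming the created-date field comes as a `YYYY-MM-DD HH:MM:SS` string (the format and the sample timestamps are assumptions for illustration):

```python
from collections import Counter
from datetime import datetime

# Made-up report timestamps; the string format is an assumption.
timestamps = [
    "2017-07-04 10:15:00",
    "2017-07-19 10:40:00",
    "2017-07-21 18:05:00",
    "2017-01-09 09:30:00",
]


def count_by(stamps, part):
    """Count reports by a datetime attribute such as 'hour' or 'month'."""
    counts = Counter()
    for stamp in stamps:
        dt = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        counts[getattr(dt, part)] += 1
    return counts


by_hour = count_by(timestamps, "hour")
by_month = count_by(timestamps, "month")
print(sorted(by_hour.items()))
print(sorted(by_month.items()))
```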

What are we doing to stop them?

Recently De Blasio kicked off a $32 million rat reduction plan, so the rat population is definitely under attack. NYC offers up records of where they are doing their inspections and baitings, which allows us to isolate rat “safe zones” - areas where there are high concentrations of rat sightings, but low concentrations of inspections. Feel free to share this with your local community rat inspector (or with your rat population, depending on whose side you are on):

Rat Safe Zones

Rats are the most aggressively hunted in Manhattan; it has the most inspections or baitings (470k). There are spots within Brooklyn and Staten Island where there has historically been lower coverage despite high rat populations. And overall, Brooklyn has had only the third most inspections, 225k, less than half that of Manhattan. This runs counter to the fact that Brooklyn has 25% more rodent sightings reported than Manhattan. Maybe an area for some of that $32 million to go?
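To put a number on that mismatch, here is a borough-level version of the "safe zone" idea: sightings per 1,000 inspections, using only the rounded totals quoted in this post. The map itself works at a finer grain, so treat this as a back-of-the-envelope check rather than the method behind the plot.

```python
# Rounded totals quoted in the post, in thousands.
sightings_k = {"Manhattan": 27, "Brooklyn": 34}
inspections_k = {"Manhattan": 470, "Brooklyn": 225}


def sightings_per_1000_inspections(borough):
    """Higher values mean more reported rodents per unit of inspection effort."""
    return 1000 * sightings_k[borough] / inspections_k[borough]


ratios = {b: sightings_per_1000_inspections(b) for b in sightings_k}
for borough, ratio in ratios.items():
    print(f"{borough}: {ratio:.1f} sightings per 1,000 inspections")
```

By this rough measure Brooklyn gets well over twice as many sightings per inspection as Manhattan.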

What cuisines do rats fancy?

Thanks to the restaurant inspections dataset we can see which NYC restaurants have had a problem with rodents at some point in their operation. Turns out, most restaurants have had an issue - 63% of all NYC restaurants have been cited at some point for either mice or rats being present in their facilities. But which cuisines do rats prefer the most?

Rodent Cuisine

The winner is Latin food, with 80% being cited for rodent problems. Caribbean (79% cited), Delicatessen (75% cited), Indian (75% cited), and Thai (74.5% cited) round out the top five cuisine types where you are most likely to run into a rat. Looks like rats generally stay away from juice and salad places (39% cited), as well as coffee (45% cited), and ice cream (46% cited).
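One wrinkle in computing these percentages is that a restaurant shows up once per inspection, so it has to be collapsed to "ever cited" before dividing. A sketch of that step, with hypothetical field names (`restaurant_id`, `cuisine`, `rodent_violation`) and invented rows:

```python
from collections import defaultdict

# Invented inspection rows; the field names are placeholders of my own.
inspections = [
    {"restaurant_id": 1, "cuisine": "Thai", "rodent_violation": False},
    {"restaurant_id": 1, "cuisine": "Thai", "rodent_violation": True},
    {"restaurant_id": 2, "cuisine": "Thai", "rodent_violation": False},
    {"restaurant_id": 5, "cuisine": "Thai", "rodent_violation": True},
    {"restaurant_id": 3, "cuisine": "Ice Cream", "rodent_violation": False},
    {"restaurant_id": 4, "cuisine": "Ice Cream", "rodent_violation": True},
    {"restaurant_id": 4, "cuisine": "Ice Cream", "rodent_violation": True},
]


def share_cited_by_cuisine(rows):
    """Fraction of restaurants per cuisine with at least one rodent citation."""
    ever_cited = {}
    cuisine_of = {}
    for row in rows:
        rid = row["restaurant_id"]
        cuisine_of[rid] = row["cuisine"]
        ever_cited[rid] = ever_cited.get(rid, False) or row["rodent_violation"]

    totals = defaultdict(int)
    cited = defaultdict(int)
    for rid, cuisine in cuisine_of.items():
        totals[cuisine] += 1
        cited[cuisine] += int(ever_cited[rid])
    return {c: cited[c] / totals[c] for c in totals}


shares = share_cited_by_cuisine(inspections)
print(shares)
```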

I have to admit, I was really hoping that French food was going to top the list, but that is firmly in the middle, with 64% being cited.

What is a rat’s worst enemy?

Apparently terriers - go figure.

nyc rodents resturants 311 opendata

22 Oct 2017

The NY state legislature’s voting record exposes an odd fact: they all almost always agree. The Open States Project organizes all of the available legislative voting data, so you can pull it down and see for yourself. For the past six years, in all but a handful of cases, every bill introduced for a floor vote is passed and usually with a margin of about 90%. How could this be?

Here is the agreement rate for the lower chamber of the legislature:

Lower Chamber Agreement

And here is the agreement rate for the upper chamber of the legislature:

Upper Chamber Agreement

This is a mystery, and I do not have a good answer. I would expect some healthy discord and debate, resulting in voting records indicating closer outcomes and more bills that failed to pass. Instead, voting records indicate that in the past 7 years only 4 bills failed their floor votes in the lower chamber, and only 1 in the upper chamber. That is out of a total of 7269 votes for the lower chamber (so a 0.06% failure rate) and 11155 votes for the upper chamber (less than a 0.009% failure rate). And as demonstrated above, when they pass (which is overwhelmingly most of the time), they pass with a supermajority.
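For anyone who wants to check the failure rates, the arithmetic is just failed votes over total votes, using the counts quoted above:

```python
# Counts quoted in the post.
failed = {"lower": 4, "upper": 1}
total = {"lower": 7269, "upper": 11155}

# Failure rate as a percentage of all floor votes in each chamber.
failure_pct = {chamber: 100 * failed[chamber] / total[chamber] for chamber in failed}

for chamber, pct in failure_pct.items():
    print(f"{chamber} chamber: {pct:.4f}% of floor votes failed")
```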

What could be going on here? Here are a couple of theories:

  1. Voting is done in secret - votes could be conducted behind closed doors, without the record keeping that makes it transparent to the public. Seems unlikely, but if it were the case then they might only bring things to the floor which are certain to succeed.
  2. Top party members dictate the vote - if all of the power brokering is done at the top, and votes are decided by the party leadership ahead of time, you might expect there to be little recorded dissent in the actual votes. But this implies that most state legislators are sheep, and their only impact is the weight they give to their party leader.
  3. Voting no is politically non-viable - it could be the case that voting no on a bill is bad optics, and therefore every bill that makes it to a vote always gets a yes vote from everyone.

Or there is some other reason. But whatever the case may be, the consent and passage rate of the NY state legislature is a surprise, and challenges my assumptions of what a healthy democracy looks like.

ny politics opendata

01 Aug 2017

Sometimes when it is hot out there is nothing more refreshing than escaping the heat in the shade of a tree. Luckily, in NYC there are thousands of trees planted along our streets, and NYC has done a great job in distributing them around the city. And better yet - thanks to the efforts of TreeCount NYC, we know the exact location and health of all of them!

The tree count is done once every 10 years, and the first count we have data for is from 1995. Since then, New Yorkers have planted close to 200,000 trees along the streets - thanks in part to initiatives like the million trees project. Nice job everyone! As of the 2015 count, there are 683,774 trees planted along our city streets. In an absolute sense, Queens ranks as #1 with 250,490 trees, Brooklyn is in second with 177,276 trees, and Staten Island takes the bronze with 105,317 trees (the Bronx has 84,880 and Manhattan rounds out the bottom, with 65,722 trees). Note, we are talking about trees planted along streets, so places like Central Park do not count.

Here is what the counts look like overall, broken down by census tract:

Tree Counts NYC 2015

Since census tracts are of varying sizes, a slightly cleaner way to look at it is to consider it by density - if we normalize by the size of the census tract, we get the following:

Tree Density NYC 2015

The map makes some intuitive sense - there are high tree densities in mostly residential neighborhoods, whereas industrial areas (and airports) are much lower. And it is no surprise that the financial district and midtown have some of the lowest tree density rankings in the city. There is a bit of a conspicuous blight in southern Queens (in the Blissville/Laurel Hill neighborhoods), but those neighborhoods are very industrial.

Let’s zoom in on Brooklyn:

Tree Density Brooklyn 2015

The BK neighborhood with the most street trees is… East New York, with 9603 trees! This is largely due to its size. When we normalize by area it drops to third place; the top 5 neighborhoods are Brooklyn Heights (4690 trees/sq mile), Prospect Heights (4258 trees/sq mile), East New York (4222 trees/sq mile), Windsor Terrace (4170 trees/sq mile), and lastly, Williamsburg (4107 trees/sq mile). Coney Island bottoms out the list, with 1123 trees/sq mile, less than a quarter of the density of Brooklyn Heights. Look up where your neighborhood ranks here.
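The normalization is simply trees divided by area. Working backwards from the figures quoted above gives a quick sanity check; the implied area for East New York is derived from those two numbers, not taken from a separate source.

```python
def density(trees, area_sq_miles):
    """Street trees per square mile."""
    return trees / area_sq_miles


# East New York: 9603 trees at 4222 trees/sq mile implies this area.
east_ny_area = 9603 / 4222
print(f"Implied East New York area: {east_ny_area:.2f} sq miles")

# Feeding the implied area back in recovers the quoted density.
east_ny_density = density(9603, east_ny_area)

# Coney Island versus Brooklyn Heights, both in trees/sq mile.
coney_vs_heights = 1123 / 4690
print(f"Coney Island is at {coney_vs_heights:.1%} of Brooklyn Heights' density")
```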

I ran regressions/correlations comparing the tree counts in each area against other factors, like the age of the neighborhoods, the income of residents, demographic makeup, etc. Somewhat surprisingly, only very weak relations emerged - suggesting that those factors play little part in where we decide to plant trees. The strongest relationship was with the age of buildings, but even that had a correlation of only -0.24 with the number of trees planted in an area.
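For reference, the kind of correlation in question is the Pearson coefficient, which runs from -1 to 1. A plain-Python sketch follows; the building ages and tree counts are invented to show the calculation and are not the data behind the -0.24 figure.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented per-area values, purely illustrative.
median_building_age = [10, 40, 60, 90, 120]
street_trees = [300, 340, 250, 310, 260]

r = pearson(median_building_age, street_trees)
print(f"correlation: {r:.2f}")
```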

Further Reading

There is a lot of great analysis and reading you can do about the trees in NYC. The tree count website has some interesting summary statistics/charts. I Quant NY has a good writeup, in particular where he factors in the total length of streets in each neighborhood. And there was even a data jam devoted to interesting ways to visualize/interpret the tree count data.

How to Get Involved

We can all help make our city a greener place - and after the next tree count in 2025 it would be nice to see the light spots in the maps above get a little shade! Here are some ways you can get involved:

nyc treecount opendata

10 Jun 2017

It’s summertime, and that means it’s time to leave the city and head to the mountains, trails, cliffs, and rivers. First stop - Catskills! For today’s analysis let’s look at the rental market in the lower Catskills region. AirBnB has transformed the vacation rental market there in the past couple of years, but what drives a good rental property on AirBnB? What factors in that area lead to a property being able to fetch a higher price?

I took a look at AirBnB properties in three towns in the southern Catskills - Pine Bush, Gardiner, and New Paltz. These towns are all around one of my favorite areas - the Mohonk Preserve. There are 136 properties, with a median rental price of $155 per night. New Paltz has the most, at 57 properties (median price $149), then Pine Bush (median price $165), and finally Gardiner (median price $145). There are a couple of factors in a home rental that seem to drive the price.

Room Type

AirBnB offers up rentals for private rooms, shared rooms, and whole home rentals. In the Catskills, 63% of the homes are whole home rentals.

Room Type by Town

As expected, there is a big difference in price for whole home rentals ($195 median price) versus just a room ($100 median price).

Capacity

How many people a house can hold is the biggest factor in house price, and there is a pretty good linear relationship between capacity and price. The most prevalent property size is for 2 people (33% of all properties), because of the prevalence of single room rentals. Next is 4 person rentals (21%), then 6 person rentals (14%).

Capacity to Price

Interestingly, if we look at the relationship of capacity to rental price by town, we can see that Gardiner and New Paltz have a slight edge over Pine Bush.

Capacity to Price By Town

Ratings

A big part of the AirBnB experience is the rating system. As a renter, you can rate your host on a number of factors. So how do these factors get reflected in rental prices? Looking at overall satisfaction, cleanliness, and communication, it appears that these ratings are a noisy contributor to the nightly price.

Instant Book

Some properties on AirBnB have the option to instantly book them, rather than going through the negotiation with the property owner/manager. For smaller properties there is not much of a difference, but for 6 person properties instant book homes are on average $67 more per night, and for 8 person homes they are on average $150 more per night. Overall, only about 20% of homes are listed as instant book.

Instant Book Impact

Analysis

All of these factors independently contribute to the house rental price, but let’s build a combined pricing model to see how they interact together. I plugged the rental properties into a linear regression, minimizing the L2 norm with OLS. The scatter plot of the predicted price versus the actual price for the training data is plotted below:

Model Predicted versus Actual

The adjusted R-squared is 0.66, and the regression F-stat is 27.5. Most of the factors have t-stats over 2 - as mentioned above, the rating factors have the most noise.
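To make the fitting step concrete, here is a stripped-down, single-factor version of ordinary least squares (price against capacity only), written in plain Python. The real model has several factors, and the data points below are synthetic values sitting exactly on a line, so this only illustrates the mechanics.

```python
def fit_line(xs, ys):
    """Closed-form OLS for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(xs, ys, intercept, slope):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Synthetic, noise-free points (not real listings).
capacity = [2, 4, 6, 8]
price = [110, 210, 310, 410]

intercept, slope = fit_line(capacity, price)
fit_quality = r_squared(capacity, price, intercept, slope)
print(intercept, slope, fit_quality)
```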

If we translate the results of the regression into a more comprehensible pricing model, we end up with something like this:

Pricing Model

The negative baseline is obviously unrealistic on its own, but with most pricing options added in the outcome becomes feasible (i.e., positive). For instance, for a 3 bedroom house in Pine Bush with good reviews but no instant booking, the model predicts a nightly price of $235 = - $147 (baseline) + (6 x $55)(capacity) + ($7 + $15 + $9)(good ratings) + $21 (Pine Bush). Capacity is far and away the biggest driver of price, but this model has some interesting outcomes. For instance, instant book is worth quite a lot ($45 per night).
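The pricing model can be written as a small function. The coefficients are the ones quoted in the text; the text only gives a town adjustment for Pine Bush, so treating every other town as $0 is my assumption, as are the function and parameter names. Adding up the terms exactly as quoted gives $235 for the example house.

```python
BASELINE = -147
PER_PERSON = 55
RATING_BONUSES = (7, 15, 9)  # the three "good rating" terms quoted in the post
INSTANT_BOOK_BONUS = 45
# Only Pine Bush's adjustment appears in the text; other towns default to 0 here.
TOWN_BONUS = {"Pine Bush": 21}


def predicted_price(capacity, town, good_ratings=True, instant_book=False):
    """Nightly price in dollars from the simplified linear model."""
    price = BASELINE + PER_PERSON * capacity
    if good_ratings:
        price += sum(RATING_BONUSES)
    price += TOWN_BONUS.get(town, 0)
    if instant_book:
        price += INSTANT_BOOK_BONUS
    return price


example = predicted_price(6, "Pine Bush")
with_instant_book = predicted_price(6, "Pine Bush", instant_book=True)
print(example, with_instant_book)
```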

With a median price of $155 per night, there is really no reason not to go explore some of what the Catskills has to offer. Go check out the Mohonk Preserve, Minnewaska State Park, the Shawangunk Wine Trail, or the awesome tasting room at Angry Orchard.

ny airbnb catskills rental home