01 Aug 2017


Sometimes when it is hot out there is nothing more refreshing than escaping the heat in the shade of a tree. Luckily, in NYC there are thousands of trees planted along our streets, and NYC has done a great job in distributing them around the city. And better yet - thanks to the efforts of TreeCount NYC, we know the exact location and health of all of them!

The tree count is done once every 10 years, and the earliest count we have data for is from 1995. Since then, New Yorkers have planted close to 200,000 trees along the streets - thanks in part to initiatives like the Million Trees project. Nice job everyone! As of the 2015 count, there are 683,774 trees planted along our city streets. In an absolute sense, Queens ranks as #1 with 250,490 trees, Brooklyn is in second with 177,276 trees, and Staten Island takes the bronze with 105,317 trees (the Bronx has 84,880 and Manhattan rounds out the bottom, with 65,722 trees). Note that we are talking about trees planted along streets, so places like Central Park do not count.

Here is what the counts look like overall, broken down by census tract: Tree Counts NYC 2015

Since census tracts are of varying sizes, a slightly cleaner way to look at it is to consider it by density - if we normalize by the size of the census tract, we get the following: Tree Density NYC 2015
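The normalization is just the tree count divided by the tract's area. A minimal sketch in Python, using made-up tract records (the IDs, counts, and areas below are illustrative assumptions, not values from the TreeCount data):

```python
# Hypothetical tract records: (tract id, street-tree count, land area in square miles).
tracts = [
    ("tract_a", 420, 0.125),
    ("tract_b", 420, 0.25),
    ("tract_c", 200, 0.0625),
]

def tree_density(count, area_sq_mi):
    """Return trees per square mile, or None if the tract has no usable area."""
    if area_sq_mi <= 0:
        return None
    return count / area_sq_mi

# Density per tract, skipping tracts where no density can be computed.
densities = {}
for tract_id, count, area in tracts:
    density = tree_density(count, area)
    if density is not None:
        densities[tract_id] = density

# Tracts ordered from most to least dense.
ranked = sorted(densities, key=densities.get, reverse=True)
```

Note how tract_a and tract_b have the same raw count but very different densities, which is why the density map reads differently from the count map.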

The map makes some intuitive sense - there are high tree densities in mostly residential neighborhoods, whereas industrial areas (and airports) are much lower. And it is no surprise that the Financial District and Midtown have some of the lowest tree densities in the city. There is a bit of a conspicuous blight in southern Queens (in the Blissville/Laurel Hill neighborhoods), but those neighborhoods are very industrial.

Let’s zoom in on Brooklyn: Tree Density Brooklyn 2015

The BK neighborhood with the most street trees is… East New York, with 9603 trees! This is largely due to its size. When we normalize by area it drops to third place; the top 5 neighborhoods are Brooklyn Heights (4690 trees/sq mile), Prospect Heights (4258 trees/sq mile), East New York (4222 trees/sq mile), Windsor Terrace (4170 trees/sq mile), and lastly, Williamsburg (4107 trees/sq mile). Coney Island bottoms out the list, with 1123 trees/sq mile, less than a quarter of the density of Brooklyn Heights. Look up where your neighborhood ranks here.

I ran regressions/correlations comparing the tree counts in each area against other factors, like the age of the neighborhoods, the income of residents, demographic makeup, etc. Somewhat surprisingly, only very weak relationships emerged - suggesting that those factors play little part in where we decide to plant trees. The strongest relationship was with the age of buildings, but even that had a correlation of only -0.24 with the number of trees planted in an area.
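For a sense of scale, a correlation of -0.24 squares to about 0.06, so in a one-variable linear fit building age would account for roughly 6% of the variation in tree counts. Below is a minimal sketch of how such a correlation can be computed with a plain Pearson coefficient. The sample numbers are invented for illustration, and the post does not say which correlation measure was actually used.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Hypothetical per-tract values: median building year and street-tree count.
median_year = [1915, 1925, 1931, 1940, 1955, 1962]
tree_count = [310, 280, 330, 250, 290, 240]

r = pearson(median_year, tree_count)
share_explained = r ** 2  # share of variance a one-variable linear fit would capture
```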

Further Reading

There is a lot of great analysis and reading you can do about the trees in NYC. The tree count website has some interesting summary statistics/charts. I Quant NY has a good writeup, in particular where he factors in the total length of streets in each neighborhood. And there was even a data jam devoted to interesting ways to visualize/interpret the tree count data.

How to Get Involved

We can all help make our city a greener place - and after the next tree count in 2025 it would be nice to see the light spots in the maps above get a little shade! Here are some ways you can get involved:

nyc treecount opendata

10 Jun 2017


It's summertime, and that means it's time to leave the city and head to the mountains, trails, cliffs, and rivers. First stop - Catskills! For today’s analysis let’s look at the rental market in the lower Catskills region. AirBnB has transformed the vacation rental market there in the past couple of years, but what drives a good rental property on AirBnB? What factors in that area lead to a property being able to fetch a higher price?

I took a look at AirBnB properties in three towns in the southern Catskills - Pine Bush, Gardiner, and New Paltz. These towns are all around one of my favorite areas - the Mohonk Preserve. There are 136 properties, with a median rental price of $155 per night. New Paltz has the most, at 57 properties (median price $149), then Pine Bush (median price $165), and finally Gardiner (median price $145). There are a couple of factors in a home rental that seem to drive the price.

Room Type

AirBnB offers up rentals for private rooms, shared rooms, and whole home rentals. In the Catskills, 63% of the homes are whole home rentals.

Room Type by Town

As expected, there is a big difference in price for whole home rentals ($195 median price) versus just a room ($100 median price).

Capacity

How many people a house can hold is the biggest factor in house price, and there is a pretty good linear relationship between capacity and price. The most prevalent property size is for 2 people (33% of all properties), because of the prevalence of single room rentals. Next is 4 person rentals (21%), then 6 person rentals (14%).

Capacity to Price

Interestingly, if we look at the relationship of capacity to rental price by town, we can see that Gardiner and New Paltz have a slight edge over Pine Bush.

Capacity to Price By Town

Ratings

A big part of the AirBnB experience is the rating system. As a renter, you can rate your host on a number of factors. So how do these factors get reflected in rental prices? Looking at overall satisfaction, cleanliness, and communication, it appears that these ratings are a noisy contributor to the nightly price.

Instant Book

Some properties on AirBnB have the option to instantly book them, rather than going through the negotiation with the property owner/manager. For smaller properties there is not much of a difference, but for 6 person properties instant book homes are on average $67 more per night, and for 8 person homes they are on average $150 more per night. Overall, only about 20% of homes are listed as instant book.

Instant Book Impact

Analysis

All of these factors independently contribute to the house rental price, but let’s build a combined pricing model to see how they work together. I plugged the rental properties into a linear regression fit by ordinary least squares (OLS), which minimizes the sum of squared errors. The scatter plot of the predicted price versus the actual price for the training data is plotted below:

Model Predicted versus Actual

The adjusted R-squared is 0.66, and the regression F-stat is 27.5. Most of the factors have t-stats over 2 - as mentioned above, the rating factors have the most noise.
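To make the mechanics concrete, here is a minimal pure-Python sketch of an OLS fit and the adjusted R-squared calculation. It solves the normal equations directly, which is fine for a handful of well-behaved columns; a statistics library would normally be used instead. The listings below are invented, and the two features (capacity and an instant-book flag) are only a subset of what the post's model includes.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y,
    solved by Gauss-Jordan elimination with partial pivoting."""
    n, k = len(X), len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
        + [sum(X[r][i] * y[r] for r in range(n))]
        for i in range(k)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular (collinear columns)")
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]

def adjusted_r2(X, y, beta):
    """Adjusted R-squared; k counts all columns of X, including the intercept."""
    n, k = len(X), len(X[0])
    preds = [sum(b * x for b, x in zip(beta, row)) for row in X]
    mean_y = sum(y) / n
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, preds))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n - 1) / (n - k)

# Hypothetical listings: columns are [intercept, capacity, instant_book flag].
X = [[1, 2, 0], [1, 2, 1], [1, 4, 0], [1, 4, 1], [1, 6, 0], [1, 6, 1], [1, 8, 1]]
y = [95, 120, 185, 240, 300, 345, 450]  # made-up nightly prices in dollars

beta = ols(X, y)          # [baseline, dollars per person, instant-book premium]
adj_r2 = adjusted_r2(X, y, beta)
```

The adjustment penalizes extra regressors, which matters here because the sample is small (136 properties) relative to the number of factors.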

If we translate the results of the regression into a more comprehensible pricing model, we end up with something like this:

Pricing Model

The negative baseline is obviously unrealistic on its own, but with most pricing options added in the outcome becomes feasible (i.e., positive). For instance, for a 3 bedroom house in Pine Bush with good reviews but no instant booking, the model predicts a nightly price of $311 = - $147 (baseline) + (6 x $55)(capacity) + ($7 + $15 + $9)(good ratings) + $21 (Pine Bush). Capacity is far and away the biggest driver of price, but this model has some interesting outcomes. For instance, instant book is worth quite a lot ($45 per night).
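The additive model is easy to express as a small function. The sketch below uses only the coefficients quoted in the example above; the full pricing table is not reproduced here, so any other terms (other towns, room type, and so on) are left out. Note that the quoted terms on their own add up to $235 rather than $311, so the example presumably includes a further term from the table that the sentence does not list; that reading is an assumption.

```python
# Coefficients quoted in the post, in dollars per night.
BASELINE = -147
PER_PERSON = 55
GOOD_RATINGS = (7, 15, 9)  # the three rating bonuses quoted in the example
PINE_BUSH = 21
INSTANT_BOOK = 45

def nightly_price(capacity, good_ratings=True, town_premium=PINE_BUSH,
                  instant_book=False):
    """Sum the quoted terms of the additive pricing model."""
    price = BASELINE + PER_PERSON * capacity
    if good_ratings:
        price += sum(GOOD_RATINGS)
    price += town_premium
    if instant_book:
        price += INSTANT_BOOK
    return price

six_person = nightly_price(capacity=6)                       # quoted terms only
six_person_instant = nightly_price(capacity=6, instant_book=True)
two_person_bare = nightly_price(capacity=2, good_ratings=False, town_premium=0)
```

The last line shows why the negative baseline is not a practical problem: even a bare 2 person listing nets out at -$147 + $110 = -$37 before any extras, and everything larger or better rated is positive.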

With a median price of $155 per night, there is really no reason not to go explore some of what the Catskills has to offer. Go check out the Mohonk Preserve, Minnewaska State Park, the Shawangunk Wine Trail, or the awesome tasting room at Angry Orchard.

ny airbnb catskills rental home

27 Mar 2017


A quick tour through different neighborhoods in Brooklyn will take you through decades of construction and building styles. Thanks to the NYC OpenData project, and the Pluto dataset, we can put some numbers on exactly how old the buildings are throughout our fair borough. Turns out, overall, Brooklyn clocks in with the oldest original construction date amongst all the boroughs of NYC - with a median construction year of 1931 (normalized to square footage). Manhattan is only slightly newer, at 1938.

Not surprisingly, building age is different throughout Brooklyn. Below we plot out the median building age by census tract, normalized to square footage.

Brooklyn Building Ages

The oldest neighborhoods are Ocean Hill (1911) and a four-way tie between North Side-South Side, Cypress Hills, Park Slope, and Stuyvesant Heights - all with a median building year of 1920. The newest neighborhood is Coney Island (1969), followed by West Brighton (1962), East New York (1961), Canarsie (1960), and Williamsburg (1956). These use the neighborhoods defined by NYC’s “neighborhood tabulation areas”.

Of course, the median age of buildings does not tell the whole story. When walking around a neighborhood like Park Slope you tend to notice not just that the buildings are old, but that they are all old - there is a general consistency in the building age. Below is a map that attempts to capture that - measuring the diversity of building ages within Brooklyn neighborhoods (high diversity is darker, more consistency in age is lighter):

Brooklyn Building Age Diversity

You can see that Park Slope is only in the middle of the pack. Neighborhoods like Brownsville and East Flatbush have the highest degree of homogeneous construction age, whereas neighborhoods like Canarsie, Williamsburg, Greenpoint, and Fort Greene have a lot of variance - some new buildings sprouting up amongst the old.
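The post does not say which measure of "diversity" the map uses. One reasonable choice, shown here purely as an assumption, is the interquartile range (IQR) of construction years within each neighborhood: a small IQR means most buildings went up around the same time. The neighborhood names and years below are made up.

```python
from statistics import quantiles

def age_spread(construction_years):
    """Interquartile range of construction years (an assumed diversity measure)."""
    q1, _median, q3 = quantiles(construction_years, n=4, method="inclusive")
    return q3 - q1

# Hypothetical construction years for two neighborhoods.
uniform_hood = [1920, 1920, 1921, 1922, 1922]
mixed_hood = [1900, 1910, 1920, 1930, 1940]

spread = {
    "uniform_hood": age_spread(uniform_hood),
    "mixed_hood": age_spread(mixed_hood),
}
```

The IQR is used here instead of the standard deviation because it is less sensitive to the odd mis-keyed construction year, which fits the outlier concerns with this dataset.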

New construction will most likely change how our neighborhoods look and feel, and 10 years from now these same maps will likely be entirely different. Brooklyn may even lose the title of the oldest borough overall. But I would wager there will still be enclaves of untouched, original construction for years to come - you just may have to venture further out in Brooklyn to find them!

Figure out your own home age

The statistics and figures here come from the Pluto dataset, so adventurous data enthusiasts can download the dataset themselves and search for their building. The dataset contains a wealth of information, including building type, zoning information, square footage, etc.

For an easier option, download this file: BK Addresses.xlsx and search for your address. It is set up with filters at the top which should help you narrow in on your street and house number. The file includes information on who the owner on record is, when it was built, and when it was last updated. Happy house hunting!

Methodology

So the Pluto dataset does not have the best of reputations. And I can see why - there are a lot of outliers and skews in the distributions. But I tried to control for this, and I think my results should still be valid. I did so in a couple of ways - I used medians whenever I could, winsorized the data before taking any aggregate, and aggregated to census tracts instead of looking at individual buildings. There are outliers, but in aggregate they should be filtered out. That said - if someone can explain some systematic bias in the data that would blow all this up, please share!
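As a rough illustration of two of those steps, here is a sketch of winsorizing followed by a median weighted by square footage. The 10th/90th percentile cut-offs, the function names, and the sample records are all assumptions for the example; the post does not state the actual thresholds or the exact weighting scheme.

```python
def winsorize(values, lower_pct=10, upper_pct=90):
    """Clamp values below/above the given percentiles to those percentile values."""
    ordered = sorted(values)
    n = len(ordered)
    low = ordered[(lower_pct * (n - 1)) // 100]
    high = ordered[(upper_pct * (n - 1)) // 100]
    return [min(max(v, low), high) for v in values]

def weighted_median(values, weights):
    """Smallest value at which the cumulative weight reaches half the total."""
    total = sum(weights)
    running = 0
    for value, weight in sorted(zip(values, weights)):
        running += weight
        if running >= total / 2:
            return value

# Hypothetical records with two obviously bad years (0 and 9999).
years = [0, 1900, 1910, 1920, 1930, 1940, 1950, 1960, 1970, 1980, 9999]
cleaned = winsorize(years)

# One large newer building can dominate when weighting by square footage.
tract_years = [1900, 1950, 2000]
tract_sqft = [100, 100, 1000]
tract_median = weighted_median(tract_years, tract_sqft)
```

Winsorizing keeps every record but caps the extremes, so a building keyed in with year 0 cannot drag an aggregate down, while the weighting means a single large building counts for more than several small ones.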

nyc brooklyn pluto construction opendata

28 Feb 2017


New York City has about 8.4 million people living within the 5 boroughs, and about 5.4 billion square feet of total building space. Sounds like a lot, but when you break this down to areas zoned for residential living, it averages out to about 531 square feet per person in the city as a whole. This is not country living, but still, on the whole, not bad. That said, with a place like NYC, averages often tell a misleading story, and this is certainly true when looking at how much living space people have throughout the city.

Top and Bottom Housing

Manhattan tops the charts, with 4 of the 5 neighborhoods with the highest space per person. At the highest overall is the Upper East Side/Carnegie Hill area, with more than 1,200 square feet per person. This is followed by Midtown (920 sq ft/person), Battery Park City (838 sq ft/person), SoHo/Tribeca (774 sq ft/person), and finally Todt Hill, Staten Island (756 sq ft/person). Brooklyn and Queens have the most compact living spaces, with North Corona bottoming the list at 207 sq ft per person. The other compressed spaces are Sunset Park East (246 sq ft/person), Cypress Hills (252 sq ft/person), Sunset Park West (262 sq ft/person), and Corona (266 sq ft/person).

Take a look at the city at large though:

Living Space in NYC

It does not take much to notice that lower Manhattan dominates in terms of living space. This is counterintuitive at first - it certainly runs counter to the population density map of NYC, where lower Manhattan ranks with some of the highest density in all of NYC. Take a look at the following map though, which shows the average number of people per household throughout the city:

Household Size in NYC

Side by side, the maps of residential space and household size almost look like inverse images. It looks like this accounts for most of the differences across the city - apartment sizes stay pretty constant, but the number of people in the apartment varies. In areas with a lot of children, or where parents live with their grown children, the square footage per person drops precipitously. Throughout Manhattan the household size is surprisingly low (just over 1 person per household), whereas it grows to 3-4 people per household further out in the boroughs. Note, these numbers come from the Census ACS survey, and I am guessing they do not account for roommates. I do not buy that Manhattan averages such a low household size. But even so, it is representative of a larger trend of increasing household size the further out in the city you go.

As a final note, I will point out that if you consider just the land mass of the United States as a whole, there is roughly 332 thousand square feet per person currently. Makes that NYC 531 square feet average seem like small potatoes. But where else can you find bodegas open at 2am?

Methodology

I used the 2016 v2 Pluto and the Census ACS datasets for this analysis. I generally cleaned the data prior to aggregation by filtering out obvious outliers. Interestingly, this included several cemeteries throughout the city, which creepily enough ended up with around 8 sq ft per person after running the numbers. I could dig into that anomaly more, but to be honest, I am going to leave it at that.

nyc pluto construction opendata