The Great American Migration Story You Don’t Often Hear

When people talk about migration in the context of the United States, they’re usually talking about international migration, people coming to the US from other countries.

I find the interstate migration story more interesting. Several reasons why:

  • It matters for economic growth

This article is an attempt at studying the trends of US interstate migration in recent years. For this, I look at a very interesting dataset from the US IRS that compares household tax filings between two consecutive years. A change in address between the two years is what classifies a household as having migrated. It’s not perfect, but it’s a great place to start. Here are the questions I tried to answer through this dataset.

Where in the US are most migrants coming from and where are they moving to?

To answer this question, I looked at the cumulative inflows and outflows between the years 2012–2018 for every state in the US and divided the two to get a net migration ratio. A ratio <1 means more people moved out than moved in and >1 means the opposite. I further divided the states into 5 key regions. Below is what I found.
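The ratio described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the code used for the analysis: the `records` layout, the `net_migration_ratio` helper and all of the numbers are hypothetical and are not taken from the IRS data.

```python
from collections import defaultdict

# Hypothetical records: (state, year, inflow, outflow).
# These numbers are made up for illustration only.
records = [
    ("NY", 2012, 100, 150),
    ("NY", 2013, 110, 160),
    ("FL", 2012, 200, 150),
    ("FL", 2013, 220, 160),
]


def net_migration_ratio(records):
    """Return cumulative inflow divided by cumulative outflow per state."""
    inflow = defaultdict(int)
    outflow = defaultdict(int)
    for state, _year, n_in, n_out in records:
        inflow[state] += n_in
        outflow[state] += n_out
    # Skip states with zero outflow to avoid dividing by zero.
    return {s: inflow[s] / outflow[s] for s in inflow if outflow[s] > 0}


ratios = net_migration_ratio(records)
for state, ratio in sorted(ratios.items()):
    label = "net inflow" if ratio > 1 else "net outflow"
    print(f"{state}: {ratio:.2f} ({label})")
```

The same helper works at the region level if the state in each record is replaced by the region it belongs to.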

Plot of net inflow: outflow ratio by geographic region in the US

From the image above, we observe some very interesting trends. The northeastern US has been experiencing a lot of net outflow over the last few years, along with the midwest and the pacific region. Where are they all going? Mostly to the mountain states, followed by the southern states. Most of us have probably read news reports pointing in this direction, but the magnitudes are striking!

We often read in the papers or hear on the news that old people are moving to states like Florida. We also hear that wealthy people are moving out of states like New York or California because taxes are too high. The question is, is that really true and if so, can we observe these trends in the data? That leads us to the second question.

How do the migratory patterns depend on income and age? Which age groups or income groups are more likely to move?

Below is a chart that looks exactly at that.

Inflow: Outflow ratio by age, income and region

We divide age into two groups: 26 to 54-year-olds, who mostly work full-time or want to work full-time, and those aged 55 and over, who are at or close to retirement. The above figure confirms the finding that the northeast and midwest have been experiencing net outflows. For both these regions, outflows increase with income and are elevated for both working-age people and retirees. High-income earners are largely flocking to the mountain states, followed by the southern states. The pacific states are seeing a sharp outflow of high-income people aged 55 and over. Now that we know which regions are experiencing sharp outflows and inflows, what do we know about which states contribute the most?

Which states contribute the most to migratory flows and how have they evolved over time?

The charts below show the breakdown for each region by state and year. As before, a number >1 means more people are coming in than leaving and a number <1 means more people are leaving than coming in.

Inflow: Outflow ratio by time, region and state

Right away, we can see that New York, Illinois and Alaska have been experiencing very high net outflows every year. In the South, Mississippi used to be the state with the worst net outflows, but in recent years it has been Louisiana. In the mountain region, it was New Mexico before, but in recent years it has been Wyoming.

In the northeast, Maine has been experiencing healthy net inflows in recent years. In the south, Texas, Florida and South Carolina have been experiencing strong net inflows of migrants over the years. In the Midwest, the Dakotas have experienced good net inflows, with Minnesota and Indiana drawing the inflows in the most recent year. Oregon has been a growing target of inflows in the pacific. In the mountain states, Colorado used to attract the most net inflows, but in recent years Idaho has been doing much better.

With all the trends we are seeing, can we actually predict what next year’s inflow would look like? That leads us to the final question.

Can we predict what the net inflow-outflow ratio would be in a given year?

I built a machine learning model to help answer this question, the details of which can be found in the GitHub link at the end of this post. Some of my key findings are:

a) 85% of last year’s net inflow-outflow ratio continues into the next year.

b) Depending on what region the population belongs to, there are additional inflows with the mountain region having the highest.

c) Age does not have a significant impact on net inflows, but the year-on-year increase in incomes does: income increases among same-state migrants have a positive impact, while income increases among out-of-state migrant inflows have a negative impact.
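One way to read finding (a) is as the slope in a regression of this year’s ratio on last year’s ratio. The sketch below is not the model from the repository; it is a deliberately simplified one-variable least-squares fit, and the `fit_line` helper and the data are hypothetical. The data are constructed to follow `current = 0.85 * previous + 0.15` exactly, so the fit simply recovers those two numbers.

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical ratios for five state-year pairs (made up for illustration).
previous = [0.8, 0.9, 1.0, 1.1, 1.2]
# Constructed so that 85% of last year's ratio carries over.
current = [0.85 * p + 0.15 for p in previous]

slope, intercept = fit_line(previous, current)
print(f"persistence (slope): {slope:.2f}, intercept: {intercept:.2f}")
```

In a fuller model, region indicators and income growth terms would enter as additional predictors alongside the lagged ratio, which is how findings (b) and (c) could be expressed.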

While this data has been very helpful in answering questions on the ‘who’ of interstate migrants in the United States, it still does not address the ‘why’. A fascinating next step for this analysis would be to examine what is driving this migration: is it higher-income jobs, cost of living or some other reason? Data on house prices or rents, income and wealth tax rates, and wages could be good proxies for understanding the reasons behind some of these massive migrant flows.

More details and link to the data and analysis can be found here.