The Geography of Basketball, Part II: Watching the Game in ArcGIS

A guest post by Gregory Brunner

I wasn’t planning on writing another blog on this topic so soon, but a few days ago I was looking into how to turn the gameid field in the NBA data into the actual game date, teams involved, etc., and I stumbled upon this:

For about a day, I thought I had found something that I wasn’t supposed to find. Then I Googled NBA Movement API and found this amazing post by Savvas Tjortjoglou on How to Track NBA Player Movements in Python. That same day, I found NBA Player Tracking. All this amazing player movement data is out there for consumption. I had to explore it!

All I really wanted to do was get this data into ArcMap, see if I could replay the data using the time slider, and then make some webmaps. I wanted to go from watching this:

(That’s Russell Westbrook hitting a 3-point shot. You can access that data as JSON here.)

to exploring ways to replay the data in ArcMap:

(That’s the first 90+ seconds of the first quarter of the game in ArcMap played back in 10 seconds!)

and also maybe play around with different ways to render players and moments on the court.

That’s Russell Westbrook hitting a 3-point shot.

So how do you go from the NBA player tracking video to working with the data in ArcMap and ArcGIS Online?

I decided to stay with Russell Westbrook and look at the first game he played in the 2014-2015 season. That game was on October 29, 2014 against the Portland Trail Blazers, and all of the data we’ll look at here is from that game. Note that the game was played in Portland, not Oklahoma City, as you might infer from my court image.

Picking up on Savvas Tjortjoglou’s post, I read in event data from the stats API by passing a specific gameid and eventid.

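import requests

# The endpoint URL was stripped from this post; it needs two format placeholders before this will run.
event_url = '' % (eventid, gameid)
response = requests.get(event_url)
# Parse the JSON body once and reuse it.
data = response.json()
home = data["home"]
visitor = data["visitor"]
moments = data["moments"]
gamedate = data["gamedate"]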

The variables home and visitor are dictionaries containing information on the players involved in the game. Moments are the events on the court for each player. The gamedate is the date of the contest.

I used the home dictionary and the visitor dictionary to create a team dictionary and a player dictionary so that later I could apply those to my feature class in the form of a player domain and a team subtype.

#Create team dictionary
team_dict = {}
team_dict[home['teamid']] = home['name']
team_dict[visitor['teamid']] = visitor['name']
team_dict[-1] = 'Basketball'
#Create player dictionary
d = {}
d[-1] = 'Basketball'
for h in home['players']:
    d[h['playerid']] = h['firstname'] + ' ' + h['lastname']
for v in visitor['players']:
    d[v['playerid']] = v['firstname'] + ' ' + v['lastname']

Then, for every moment in the data, I parsed the data into a tuple that I used to populate my game event feature class.

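    coords = []
    # enumerate supplies each moment's index without rescanning the list for every player.
    for moment_index, moment in enumerate(moments):
        quarter = moment[0]
        for player in moment[5]:
            # Append the moment index, game clock and shot clock to the player record.
            player.extend((moment_index, moment[2], moment[3]))
            clock_time = create_timestamp(quarter, date, player[6])
            ct = datetime.datetime.strftime(clock_time, '%Y/%m/%d %H:%M:%S.%f')[:-5]
            player_data = (player[0], player[1], player[2], player[3], player[4], player[5], player[6], player[7], ct)
            # Scale to feet x 10 and shift so that the hoop sits at (0, 0).
            coord = ([10*(player[3]-25), 10*(player[2]-5.25)])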

Notice that I scaled the coordinates by a factor of 10 and also shifted the x and y positions slightly. I did this to put the movement data in the same coordinates as the shot chart data, which was in units of feet x 10, with the (0,0) point being the center of the hoop. Now that I think about it, I should probably be putting the shot chart data in the coordinate frame of the player movement data as units of feet will make more sense.
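The scaling and shift can be sketched as a pair of small helper functions. This is only an illustration under the assumptions stated in the snippet above: the movement data is in feet, the hoop sits at (5.25, 25) in that frame, and the two axes swap between the data sets. The function and constant names are hypothetical and are not part of the original script. The second function sketches the alternative mentioned above, moving shot chart coordinates into feet.

```python
HOOP_X_FT = 5.25   # hoop position along the court, in feet (the shift used above)
HOOP_Y_FT = 25.0   # hoop position across the court, in feet (the shift used above)


def movement_to_shotchart(x_ft, y_ft):
    """Convert movement coordinates (feet) to shot chart units (feet x 10, hoop at 0,0)."""
    return 10 * (y_ft - HOOP_Y_FT), 10 * (x_ft - HOOP_X_FT)


def shotchart_to_movement(loc_x, loc_y):
    """Inverse conversion: shot chart units back to feet in the movement frame."""
    return loc_y / 10.0 + HOOP_X_FT, loc_x / 10.0 + HOOP_Y_FT


print(movement_to_shotchart(5.25, 25.0))   # the hoop itself: (0.0, 0.0)
print(shotchart_to_movement(0, 0))         # back to feet: (5.25, 25.0)
```

Keeping both directions side by side makes it easy to switch the whole project to feet later without touching the parsing loop.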

Next, I created a feature class.

def create_feature_class(output_gdb, output_feature_class):
    feature_class = os.path.basename(output_feature_class)
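    if not arcpy.Exists(output_gdb):
        # Assumed reconstruction: the body of this check was lost from the post.
        arcpy.CreateFileGDB_management(os.path.dirname(output_gdb), os.path.basename(output_gdb))
    if not arcpy.Exists(output_feature_class):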
        arcpy.CreateFeatureclass_management(output_gdb,feature_class,"POINT","#","DISABLED","DISABLED", "PROJCS['WGS_1984_Web_Mercator_Auxiliary_Sphere',GEOGCS['GCS_WGS_1984',DATUM['D_WGS_1984',SPHEROID['WGS_1984',6378137.0,298.257223563]],PRIMEM['Greenwich',0.0],UNIT['Degree',0.0174532925199433]],PROJECTION['Mercator_Auxiliary_Sphere'],PARAMETER['False_Easting',0.0],PARAMETER['False_Northing',0.0],PARAMETER['Central_Meridian',0.0],PARAMETER['Standard_Parallel_1',0.0],PARAMETER['Auxiliary_Sphere_Type',0.0],UNIT['Meter',1.0]]","#","0","0","0")
        arcpy.AddField_management(output_feature_class,"TEAM_ID","LONG", "", "", "")
        arcpy.AddField_management(output_feature_class,"PLAYER_ID","LONG", "", "", "")
        arcpy.AddField_management(output_feature_class,"LOC_X","DOUBLE", "", "", "")
        arcpy.AddField_management(output_feature_class,"LOC_Y","DOUBLE", "", "", "")
        arcpy.AddField_management(output_feature_class,"RADIUS","DOUBLE", "", "", "")
        arcpy.AddField_management(output_feature_class,"MOMENT","LONG", "", "", "")
        arcpy.AddField_management(output_feature_class,"GAME_CLOCK","DOUBLE", "", "", "")
        arcpy.AddField_management(output_feature_class,"SHOT_CLOCK","DOUBLE", "", "", "")
        arcpy.AddField_management(output_feature_class,"TIME", "TEXT", "", "", 30)

Then, I pushed the game data into the feature class.

def populate_feature_class(rowValues, output_feature_class):
    c = arcpy.da.InsertCursor(output_feature_class,fields)
    for row in rowValues:
        # Write one feature per parsed record.
        c.insertRow(row)
    del c

I used the game date to supply the YYYY-MM-DD part of a clock time string. In ArcGIS, for the time to make sense, time needs to move forward. So instead of counting down from 720.0 seconds at the beginning of a quarter to 0.0 seconds at the end, I needed to define a scheme where time moved forward. The time scheme I defined is QQ:MM:SS, where QQ is the quarter, MM is the minutes into the quarter, and SS is the seconds. For example, a time of 01:02:30 indicates that we are 2 minutes and 30 seconds into the first quarter. Here’s how I created the timestamp.

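def create_timestamp(quarter, gamedate, seconds):
    # The game clock counts down from 720 s, so convert it to time elapsed in the quarter.
    m,s = divmod(720-seconds, 60)
    # Hundredths of a second, capped at 99 so rounding cannot overflow the microsecond field.
    ms = min(int(round((s-int(s))*100)), 99)
    # The quarter goes in the hour slot so that time always moves forward.
    t = datetime.time(int(quarter), int(m), int(s), ms*10000)
    dt = datetime.datetime.combine(gamedate, t)
    return dt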
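As a worked example of the QQ:MM:SS scheme, here is a minimal, self-contained sketch. It swaps in datetime.timedelta instead of building the time by hand, so it is a variation on the function above rather than the same code, and the name forward_timestamp and the quarter_length parameter are hypothetical.

```python
import datetime


def forward_timestamp(quarter, gamedate, seconds_remaining, quarter_length=720.0):
    """Map a countdown game clock onto a forward-moving timestamp (quarter in the hour slot)."""
    elapsed = quarter_length - seconds_remaining      # seconds played so far in this quarter
    start = datetime.datetime.combine(gamedate, datetime.time(int(quarter)))
    return start + datetime.timedelta(seconds=elapsed)


game_day = datetime.date(2014, 10, 29)
stamp = forward_timestamp(1, game_day, 570.0)         # 9:30 left in the first quarter
print(stamp.strftime('%Y/%m/%d %H:%M:%S'))            # 2014/10/29 01:02:30
```

With 570 seconds left on the clock, 150 seconds have been played, which gives the 01:02:30 used in the example above.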

The NBA data is interesting because there are a lot of coded values. Players and teams have numerical IDs. Above, I created a dictionary of players and teams. I used the players to create a “Players” domain on my feature class.

def create_player_domain(gdb, fc, player_dict):
    domName = "Players"
    inField = "PLAYER_ID"
    arcpy.CreateDomain_management(gdb, domName, "Player Names", "TEXT", "CODED")
    for code in player_dict:
        arcpy.AddCodedValueToDomain_management(gdb, domName, code, player_dict[code])

    arcpy.AssignDomainToField_management(fc, inField, domName)

And I used "TEAM_ID" as the subtype on my feature class:

def create_team_subtype(gdb, fc, subtype_dict):
    arcpy.SetSubtypeField_management(os.path.join(gdb,fc), "TEAM_ID")
    for code in subtype_dict:
        arcpy.AddSubtype_management(os.path.join(gdb,fc), code, subtype_dict[code])

It's pretty neat how we can apply these GIS data management fundamentals to the NBA data!

I also added a line to remove duplicate moments from the feature class because when I started concatenating multiple events, I noticed that there were some duplicates.

arcpy.DeleteIdentical_management("Game_0021400015_Event_1_10", "Shape;TEAM_ID;PLAYER_ID;LOC_X;LOC_Y;RADIUS;MOMENT;GAME_CLOCK;SHOT_CLOCK;TIME", "", "0")
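The same clean-up could also be done in plain Python before the rows are inserted. The sketch below is an alternative to the geoprocessing tool, not what the script above does; the function name and the sample rows are made up for illustration.

```python
def drop_duplicate_rows(rows):
    """Return rows in their original order with exact repeats removed."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)          # tuples are hashable, so they can be stored in a set
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Made-up rows: (TEAM_ID, PLAYER_ID, LOC_X, LOC_Y, GAME_CLOCK)
rows = [
    (-1, -1, 12.5, 30.1, 714.2),
    (-1, -1, 12.5, 30.1, 714.2),   # exact repeat of the row above
    (-1, -1, 12.9, 30.4, 714.1),
]
print(len(drop_duplicate_rows(rows)))
```

Filtering before the insert keeps the feature class small from the start, while DeleteIdentical is handy when the duplicates are already in the geodatabase.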

So what does this data look like in ArcGIS?

At about 1 minute and 6 seconds into the game, Wesley Matthews took a 3-point shot. (You can watch the footage here. The shot is the first video in the sequence.)

Wesley Matthews Footage

What does this look like in ArcMap?

Wesley Matthews Shooting

This is a roughly 1 second interval of data right after Wesley Matthews released the shot.

He missed and the players moved in for the rebound. I used this project to make the movie above. I have also shared this map document (MXD) as a map package (MPK) on my GitHub site. Take a look at the map package in ArcGIS if you're interested.

What does a single event look like as a webmap?

This is the webmap for Event 346 (Russell Westbrook hitting a 3). Click on a feature to see its attributes. There are over 7,000 features in this single event, and this event spanned less than 10 seconds on the game clock!

In my webmap for Event 346, I can isolate Russell Westbrook and the ball using a filter in ArcGIS Online. It looks like this.

That's where Russell Westbrook hits a 3. I've applied the heat map effect to the basketball.

I got ambitious when I wanted to make a video and time-enabled webmap, so I concatenated Events 1 - 10 from the same game. Those are the events I used for the video at the beginning. I published the data as a time-enabled webmap. The time-slider won't appear till you View larger map.

View larger map

The map looks like a mess until the slider appears. There are over 30,000 features in this feature service. I'm sharing it as a hosted feature service and, although I did not realize it until Gavin noticed, the hosted feature service stripped the sub-seconds from the TIME field. I don't know why this happened, because I can still see the sub-seconds in the Event 346 webmap. This affects playback of the game moments, as does the fact that the default time-slider does not display data at intervals shorter than 1 second. We would have to create a custom slider or playback capability to handle data at this frequency, which we're thinking about doing. We'll write about it after we do it. Still, I'd encourage you to take a look and play around with the time-slider and data yourself. You can identify, query, and render the data in many different ways!
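To get a feel for what is lost at 1-second resolution, here is a small sketch that counts how many sub-second moments fall into each whole second. The TIME strings are made up, and they only assume the format written by the script above (tenths of a second after the decimal point).

```python
from collections import Counter

# Made-up TIME strings in the format written by the script (tenths of a second).
times = [
    '2014/10/29 01:00:01.0',
    '2014/10/29 01:00:01.4',
    '2014/10/29 01:00:01.8',
    '2014/10/29 01:00:02.2',
]

# Dropping everything after the decimal point mimics a slider that steps in whole seconds.
per_second = Counter(t.split('.')[0] for t in times)
for second, count in sorted(per_second.items()):
    print(second, count)
```

Every moment that shares a whole second is drawn in the same slider step, which is why the playback looks crowded at this frequency.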

All that being said, this has all been really fun for me! I can't believe all this data is out there. We're really just scratching the surface. If anyone would like to see the code, you can find it on my GitHub page. I have also shared a map package there if you want to see the data in ArcGIS. Definitely let me know if you have any problems, questions, or insights! In addition to exploring how to improve the playback capability, I'm thinking about doing a post that explores some ArcGIS spatial analysis or space-time analysis and visualization on the data. Let me know if you would be interested in reading about that.

This Week in Esri GitHub

Esri has opened up a bunch of repositories on GitHub this past week. To summarize:

JS API now available on GitHub and Bower:


ArcREST, a set of python tools to assist working with ArcGIS REST API for AGS/AGOL/WebMap-JSON, has seen a lot of activity:

Going to DevSummit in March? Keep an eye on this repo:

There’s probably other stuff I missed too. Let me know in the comments!

Inline Footnotes WordPress Plugin

When writing in WordPress, sometimes you need a simple, no-frills footnote that allows the user to click to pop open some additional information. Inline Footnotes, a WordPress plugin I just released, lets you do just that.

It only has a few settings to help set the colors – everything else is controlled inline in your Posts or Pages – just use the [footnote]This is my footnote![/footnote] syntax.

If you’d like to try it out, download it now on

The Geography of Basketball: Mapping NBA Shotcharts in ArcGIS

A guest post by Gregory Brunner

There are a lot of great blog posts out there about techniques to get and plot basketball data using the NBA stats API. Two that I highly recommend are Web Scraping 201: Finding the API and How to Create NBA Shot Charts in R. Here, I want to show how you can use Python to push this data into a Geographic Information System, ArcGIS. From there, you can leverage concepts, tools, and applications that are generally reserved for geography and geographers to make some great visuals from NBA player shot chart data.

Accessing the NBA stats API with Python is pretty straightforward. Here I can get a whole season of shots for a player given their player ID and the season.

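def get_player_season(player_id, season):
    coords = []
    master_shots = []
    # Assumption: seasontype was defined outside this excerpt; this value is a placeholder.
    seasontype = 'Regular%20Season'
    # The endpoint URL was stripped from this post; it needs three format placeholders before this will run.
    nba_call_url='' % (season,seasontype, player_id)
    # Assumption: the request line was lost; requests is used as in the player movement post.
    data = requests.get(nba_call_url).json()
    for row in data['resultSets'][0]['rowSet']:
        # Flag three-point attempts so they can be symbolized separately.
        three = 1 if row[12]=='3PT Field Goal' else 0
        temp=(row[0], row[1], row[2], row[3], row[4], row[5], row[6], row[7],
                row[8], row[9], row[10],row[11], row[12], row[13], row[14],
                row[15], row[16], row[17],row[18], row[19], row[20], three)
        master_shots.append(temp)
        coord = ([row[17],row[18]])
        coords.append(coord)

    return master_shots, coords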
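Indexing rows by position (row[12], row[17], and so on) works, but it is easy to get wrong. A minimal sketch of a name-based alternative follows. It assumes each result set carries a headers list next to rowSet, and it uses a made-up, shortened sample in place of a real response; the function name is hypothetical.

```python
def rows_as_dicts(result_set):
    """Pair each row with the column names so fields can be read by name."""
    headers = result_set['headers']
    return [dict(zip(headers, row)) for row in result_set['rowSet']]


# Made-up sample with only a few of the shot chart columns.
sample = {
    'headers': ['PLAYER_NAME', 'SHOT_TYPE', 'LOC_X', 'LOC_Y', 'SHOT_MADE_FLAG'],
    'rowSet': [
        ['Example Player', '3PT Field Goal', -226, 84, 1],
        ['Example Player', '2PT Field Goal', 12, 30, 0],
    ],
}

shots = rows_as_dicts(sample)
threes = [s for s in shots if s['SHOT_TYPE'] == '3PT Field Goal']
print(len(threes), threes[0]['LOC_X'], threes[0]['LOC_Y'])
```

Reading fields by name also keeps the code working if the column order ever changes.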

Rather than plot the data straight into ggplot (R) or matplotlib (Python) or push the data into our own SQL database, we can store the data in an ArcGIS feature class in a geodatabase. Using arcpy, we can create a feature class and add the fields from the data that we want to store.

def create_feature_class(output_gdb, output_feature_class):
    feature_class = os.path.basename(output_feature_class)
    if not arcpy.Exists(output_gdb):
        # Assumed reconstruction: the body of this check was lost from the post.
        arcpy.CreateFileGDB_management(os.path.dirname(output_gdb), os.path.basename(output_gdb))
    if not arcpy.Exists(output_feature_class):
        arcpy.CreateFeatureclass_management(output_gdb,feature_class,"POINT","#","DISABLED","DISABLED", "PROJCS['WGS_1984_Web_Mercator_Auxiliary_Sphere',GEOGCS['GCS_WGS_1984',DATUM['D_WGS_1984',SPHEROID['WGS_1984',6378137.0,298.257223563]],PRIMEM['Greenwich',0.0],UNIT['Degree',0.0174532925199433]],PROJECTION['Mercator_Auxiliary_Sphere'],PARAMETER['False_Easting',0.0],PARAMETER['False_Northing',0.0],PARAMETER['Central_Meridian',0.0],PARAMETER['Standard_Parallel_1',0.0],PARAMETER['Auxiliary_Sphere_Type',0.0],UNIT['Meter',1.0]]","#","0","0","0")
        arcpy.AddField_management(output_feature_class,"GRID_TYPE","TEXT", "", "", 100)
        arcpy.AddField_management(output_feature_class,"GAME_ID","TEXT", "", "", 100)
        arcpy.AddField_management(output_feature_class,"GAME_EVENT_ID","TEXT", "", "", 100)
        arcpy.AddField_management(output_feature_class,"PLAYER_ID","TEXT", "", "", 100)
        arcpy.AddField_management(output_feature_class,"PLAYER_NAME","TEXT", "", "", 100)
        arcpy.AddField_management(output_feature_class,"TEAM_ID","TEXT", "", "", 100)
        arcpy.AddField_management(output_feature_class,"TEAM_NAME","TEXT", "", "", 100)
        arcpy.AddField_management(output_feature_class,"PERIOD","SHORT", "", "", "")
        arcpy.AddField_management(output_feature_class,"MINUTES_REMAINING","SHORT", "", "", "")
        arcpy.AddField_management(output_feature_class,"SECONDS_REMAINING","SHORT", "", "", "")
        arcpy.AddField_management(output_feature_class,"EVENT_TYPE","TEXT", "", "", 100)
        arcpy.AddField_management(output_feature_class,"ACTION_TYPE","TEXT", "", "", 100)
        arcpy.AddField_management(output_feature_class,"SHOT_TYPE","TEXT", "", "", 100)
        arcpy.AddField_management(output_feature_class,"SHOT_ZONE_BASIC","TEXT", "", "", 100)
        arcpy.AddField_management(output_feature_class,"SHOT_ZONE_AREA","TEXT", "", "", 100)
        arcpy.AddField_management(output_feature_class,"SHOT_ZONE_RANGE","TEXT", "", "", 100)
        arcpy.AddField_management(output_feature_class,"SHOT_DISTANCE","SHORT", "", "", "")
        arcpy.AddField_management(output_feature_class,"LOC_X","DOUBLE", "", "", "")
        arcpy.AddField_management(output_feature_class,"LOC_Y","DOUBLE", "", "", "")
        arcpy.AddField_management(output_feature_class,"SHOT_ATTEMPTED_FLAG","SHORT", "", "", "")
        arcpy.AddField_management(output_feature_class,"SHOT_MADE_FLAG","SHORT", "", "", "")
        arcpy.AddField_management(output_feature_class,"THREE","SHORT", "", "", "")

Once the data is in Python and I have a feature class created, all that’s left to do is insert the data into the feature class with an arcpy InsertCursor.

def populate_feature_class(rowValues, output_feature_class):
    c = arcpy.da.InsertCursor(output_feature_class,fc_fields)
    for row in rowValues:
        # Write one feature per shot.
        c.insertRow(row)
    del c

At this point, if we just wanted to work in ArcMap, we’d be done. We don’t want to do that. We want to share this data as a hosted feature service in ArcGIS Online so that we can use the functionality of ArcGIS Online. Here’s what Russell Westbrook’s 2014-2015 season shot chart looks like as a webmap.

I got the OKC Thunder court image from Zach Lowe’s NBA Court Design Power Rankings and georeferenced it to the data using:

# Replace a layer/table view name with a path to a dataset (which can be a layer file) or create the layer/table view within the script
# The following inputs are layers or table views: "okc.jpg"
arcpy.Warp_management(in_raster="okc.jpg", source_control_points="'920.296 -92.874';'920.296 -526.033';'514.595 -526.033';'514.595 -92.874'", target_control_points="'250 -40';'-250 -40';'-250 390';'250 390'", out_raster="C:/PROJECTS/R&D/NBA/OKC_Court_Scaled_39.tif", transformation_type="POLYORDER1", resampling_type="BILINEAR")

I am a few pixels off in the y-direction, so I’ll have to play around with this more to get the image to align better with the data. If you want to do this for other basketball court images, this code probably needs to be modified a little bit depending on the dimensions of the image.
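To see what those control points imply, here is a pure-Python sketch that solves an affine (first-order) transform from three of the four point pairs and checks it against the fourth. It only illustrates the idea behind a first-order warp; it is not how the Warp tool is implemented, and the helper names are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def fit_axis(src, dst):
    """Solve out = p*x + q*y + r for one output axis from three points (Cramer's rule)."""
    base = [[x, y, 1.0] for x, y in src]
    d = det3(base)
    coeffs = []
    for col in range(3):
        m = [row[:] for row in base]
        for r in range(3):
            m[r][col] = dst[r]      # swap one column for the target values
        coeffs.append(det3(m) / d)
    return coeffs


def apply_axis(coeffs, x, y):
    return coeffs[0] * x + coeffs[1] * y + coeffs[2]


# Control points copied from the Warp call above.
source = [(920.296, -92.874), (920.296, -526.033), (514.595, -526.033), (514.595, -92.874)]
target = [(250, -40), (-250, -40), (-250, 390), (250, 390)]

to_x = fit_axis(source[:3], [t[0] for t in target[:3]])
to_y = fit_axis(source[:3], [t[1] for t in target[:3]])

x4, y4 = source[3]
print(apply_axis(to_x, x4, y4), apply_axis(to_y, x4, y4))   # should land close to (250, 390)
```

For a different court image, only the source list would change, since the target points describe the court in shot chart units.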

Now we can use other ArcGIS Online capabilities to give the shot chart a heat map effect.

What’s really cool is that once you have a webmap of the shot chart data, you can ‘create a web application.’ I chose the summary viewer to visualize the shot chart data because it dynamically calculates the shooting percentage and also lets you filter on a selected attribute. Here I filter on SHOT_TYPE; however, you could choose to filter on SHOT_DISTANCE, ACTION_TYPE, PERIOD, or another attribute. Click in the search window that says “All” and you can filter the data on Russell Westbrook’s shot type.
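The same kind of summary can be sketched in plain Python. The example below groups shots by an attribute and computes the shooting percentage for each group. It is an illustration only, not how the summary viewer works internally; the function name and the sample shots are made up, while the field names match the feature class above.

```python
from collections import defaultdict


def shooting_percentage(shots, field='SHOT_TYPE'):
    """Return {attribute value: percentage of attempts made} for the chosen field."""
    made = defaultdict(int)
    attempts = defaultdict(int)
    for shot in shots:
        attempts[shot[field]] += shot['SHOT_ATTEMPTED_FLAG']
        made[shot[field]] += shot['SHOT_MADE_FLAG']
    return {key: 100.0 * made[key] / attempts[key] for key in attempts if attempts[key]}


# Made-up shots: three 2-point attempts (two made) and two 3-point attempts (one made).
shots = [
    {'SHOT_TYPE': '2PT Field Goal', 'SHOT_ATTEMPTED_FLAG': 1, 'SHOT_MADE_FLAG': 1},
    {'SHOT_TYPE': '2PT Field Goal', 'SHOT_ATTEMPTED_FLAG': 1, 'SHOT_MADE_FLAG': 1},
    {'SHOT_TYPE': '2PT Field Goal', 'SHOT_ATTEMPTED_FLAG': 1, 'SHOT_MADE_FLAG': 0},
    {'SHOT_TYPE': '3PT Field Goal', 'SHOT_ATTEMPTED_FLAG': 1, 'SHOT_MADE_FLAG': 1},
    {'SHOT_TYPE': '3PT Field Goal', 'SHOT_ATTEMPTED_FLAG': 1, 'SHOT_MADE_FLAG': 0},
]

for shot_type, pct in sorted(shooting_percentage(shots).items()):
    print(shot_type, round(pct, 1))
```

Passing field='PERIOD' or field='ACTION_TYPE' would give the same breakdown for those attributes, provided the shot records include them.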

This is really just scratching the surface of what you can do with this data in ArcGIS. You can try using the analytics, geoprocessing services, and other app templates once you get the data into ArcGIS Online. If you need help getting started, you can find my source code on my GitHub page.