Calculating Net and Adjusted Plus/Minus (with R and Excel)

In this article, I describe how to compute Net Plus/Minus and Adjusted Plus/Minus. This requires R and Excel. Normally, I use R only for data analysis (I don’t like Excel because of the “by hand” aspect). Here, if all computations are done with R, then there will be loops at some point, knowing that the files manipulated are pretty big (I am not good enough in programming to avoid loops…). And R doesn’t do well when running loops on big files. That’s the reason why we will use a little bit of Excel.

First things first, we need to get the raw data files. These files can be found on BasketballValue. Go to: “Downloads”, “2007-2008”, and download the two files: “List of each matchup of one unit against another” and “Statistics of all players across all teams played for”. I chose the 2007-08 regular season to describe my method but of course, it can be applied to any season.

Create a new folder to store these files and those we are going to create.

Next, you need to install R software if you don’t already have it on your computer. You can download the software here. I am not going to introduce you to R. You can find many very good tutorials on the internet (this one for instance). R is the best statistical software, so the amount of documentation on it is huge.

The R interface looks like this:

RConsole

Go to the tab “File”, “Change dir”, and select the folder you just created. From there, all you need to do is copy-paste the commands below in the “R console” window.

Finally, you need to install the “xlsx” package. Go to the tab “Packages”, “Install package(s)”, and select your country. In the list displayed, select the “xlsx” package.

We’re now set for the computations.

Step 1: Excel
Open the two files: “matchups20072008reg20081211.txt” and “playerstats20072008reg20081211.txt”, and save both files in csv format.

Anyway, here are the two files:
matchups20072008reg20081211
playerstats20072008reg20081211

Step 2: R
Here, we are going to work on the file “matchups20072008reg20081211”.

We start by loading the csv file in R. When you execute the command below, a box opens and you must select the file “matchups20072008reg20081211.csv” We rename the file “d”. You can have an overview of the file by executing the function “head”:

head(d)

First, we separate the different information contained in the variable “GameID” by creating four new variables: “Date”, “AwayTeam”, “HomeTeam”, and “GameName”. Again, because R doesn’t like loops, this is going to take time (about 30 minutes on my computer).

nrow_d <- nrow(d)
for (i in 1:nrow_d) {
d$Date[i] <- substr(d$GameID[i], start = 1, stop = 8)
d$AwayTeam[i] <- substr(d$GameID[i], start = 9, stop = 11)
d$HomeTeam[i] <- substr(d$GameID[i], start = 12, stop = 14)
d$GameName[i] <- toString(c(d$Date[i], d$AwayTeam[i], d$HomeTeam[i]))
}
d <- d[,-which(names(d) %in% c("GameID"))]
head(d)

Second, we number the different games by creating the variable “GameID”. We must found a total of 1230 games (the number of games in an NBA season). Again, that computation is going to take time.

d$GameID <- rep(c(0), nrow(d))
d$GameID[1] <- 1
for(i in 2:nrow_d)
if(d$GameName[i] == d$GameName[i-1]) {d$GameID[i] <- d$GameID[i-1]} else {d$GameID[i] <- d$GameID[i-1]+1}
head(d)

Third, we compute the per-possession Plus/Minus for each game segment. This is basically the point differential when controlling for the number of possessions:

(PointsScoredHome/PossessionsHome) − (PointsScoredAway/PossessionsAway)

(for convenience, the value is multiplied by 100)

If a team had no possession during a given game segment, then we substitute the league average value. For instance, the league average per-possession Plus/Minus for home teams is the sum of PointsScoredHome (125002) divided by the sum of PossessionsHome (114749). We name the relevant variable “PM_perposs”:

d$No_PossessionsHome <- ifelse(d$PossessionsHome == 0,1,0)
d$No_PossessionsAway <- ifelse(d$PossessionsAway == 0,1,0)
d$PM_perposs <- ifelse(d$PossessionsHome == 0 & d$PossessionsAway != 0, 100*((sum(d$PointsScoredHome)/sum(d$PossessionsHome))-(d$PointsScoredAway/d$PossessionsAway)), ifelse(d$PossessionsHome != 0 & d$PossessionsAway == 0, 100*((d$PointsScoredHome/d$PossessionsHome)-(sum(d$PointsScoredAway)/sum(d$PossessionsAway))), ifelse(d$PossessionsHome == 0 & d$PossessionsAway == 0, 100*((sum(d$PointsScoredHome)/sum(d$PossessionsHome))-(sum(d$PointsScoredAway)/sum(d$PossessionsAway))), 100*((d$PointsScoredHome/d$PossessionsHome)-(d$PointsScoredAway/d$PossessionsAway)))))

Fourth, we compute the number of possessions for each game segment (“Poss” variable):

d$Poss <- d$PossessionsHome + d$PossessionsAway

Finally, we create two variables for the next computations:

d$HomePlayers <- sprintf("%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s", "P", d$HomePlayer1ID, ",", "P", d$HomePlayer2ID, ",", "P", d$HomePlayer3ID, ",", "P", d$HomePlayer4ID, ",", "P", d$HomePlayer5ID, ",") 
d$AwayPlayers <- sprintf("%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s", "P", d$AwayPlayer1ID, ",", "P", d$AwayPlayer2ID, ",", "P", d$AwayPlayer3ID, ",", "P", d$AwayPlayer4ID, ",", "P", d$AwayPlayer5ID, ",")

The file “d” is now set, it must have 57 columns:

head(d)
ncol_<- ncol(d) 
ncol_d

Let’s save the file in Excel format. The relevant function in R is “write.xlsx”. For some reasons that I don’t really understand, the file must be save as .xls rather than .xlsx. This will take a couple of minutes. When it’s done, the file created is located in the file directory.

library(xlsx)
write.xlsx(x = d, file = "d.xls", sheetName = "Sheet1", col.names = TRUE, row.names = FALSE)

Here is the file: d

Step 3: R
Now we are going to work on the file “playerstats20072008reg20081211”. This file contains the information on the 467 players in the league during the 2007-08 season. We load the file in R with the following command: a box opens and you must select the file “playerstats20072008reg20081211.csv”. We rename it “adj0708_players_stats”:

adj0708_players_stats <- read.csv2 (file.choose())

First, we delete players who did not play enough minutes during the season. For the 2007-08 season, BasketballValue reports a cutoff of 388 minutes which roughly corresponds to the 25th percentile. This cutoff leaves 339 players. In order to select them, execute this:

adj0708_players_stats$AdjustedPM <- as.character(adj0708_players_stats$AdjustedPM)
adj0708_players_stats <- adj0708_players_stats[(adj0708_players_stats$AdjustedPM != "NULL"),]
head(adj0708_players_stats)
nrow(adj0708_players_stats)

Next, we create the variable “PID” which is basically the same variable as “PlayerID” but with “P” preceding the number.

adj0708_players_stats$PID <- paste("P", as.character(adj0708_players_stats$PlayerID), sep = "")
head(adj0708_players_stats)

Third, we create the variable “team1” which is relevant for the players who played in more than one team during the season. For those players, this variable indicates only the first team for which they played during the season. Actually, I do this in order to simplify the calculation of Net Plus/Minus, but note that it is not done in the official results.

adj0708_players_stats$PlayerTeams <- as.character(adj0708_players_stats$PlayerTeams)
adj0708_players_stats$SimpleMin <- as.numeric(as.character(adj0708_players_stats$SimpleMin))
adj0708_players_stats$team1 <- substr(adj0708_players_stats$PlayerTeams, start = 1, stop = 3)
head(adj0708_players_stats)

Finally, let’s create a small file containing the ID and the team of the 339 players. We simply name this file “players”:

players <- data.frame(adj0708_players_stats$PID, adj0708_players_stats$team1)
names(players) <- c("PID", "team1")
players

Then, export the file to an Excel Spreadsheet:

write.xlsx(x = players, file = "players.xlsx", sheetName = "Sheet1", col.names = TRUE, row.names = FALSE)

Here is the file: players

Step 4: Excel
In this step, we are going to create different Excel files, each file corresponding to a particular component of Net Plus/Minus. The process will be the same for all files.

Open the two files “d.xlsx” and “players.xlsx”.

Firstly, let’s create a file that codes whether a player was on court (1) or not (0) when his team was playing at home.

1. Open a new Excel Spreadsheet and copy-paste the “d” file.
2. Go to the row #1-column #58 cell and copy-paste the column “PID” of the “players” file (from row #2 to row #340). Importantly, you must paste the content as transposed (from column to row).
3. Go to the row #2-column #58 cell and fill it with the following formula:

=SI(ESTNUM(TROUVE(BF$1&",";$BD2));1;0)

Copy-paste this formula from the row #2-column #58 cell to the row #35459-column #396 cell.

4. Save the file as “adj0708_players_on_home.csv”.
Here is the file: adj0708_players_on_home

Secondly, let’s create a file that codes whether a player was on court (1) or not (0) when his team was playing away.

1. Open a new Excel Spreadsheet and copy-paste the “d” file.
2. Go to the row #1-column #58 cell and copy-paste as transposed the column “PID” of the “players” file (from row #2 to row #340).
3. Go to the row #2-column #58 cell and fill it with the following formula:

=SI(ESTNUM(TROUVE(BF$1&",";$BE2));1;0)

Copy-paste this formula from the row #2-column #58 cell to the row #35459-column #396 cell.

4. Save the file as “adj0708_players_on_away.csv”.
Here is the file: adj0708_players_on_away

Thirdly, let’s create a file that codes whether a player was off court (1) or not (0) when his team was playing at home.

1. Open a new Excel Spreadsheet and copy-paste the “d” file.
2. Go to the top of the file and insert a row.
3. Go to the row #1-column #58 cell and copy-paste as transposed the column “team1” of the “players” file (from row #2 to row #340).
4. Go to the row #2-column #58 cell and copy-paste as transposed the column “PID” of the “players” file (from row #2 to row #340).
5. Go to the row #3-column #58 cell and fill it with the following formula:

=SI(ET(ESTNUM(TROUVE(BF$2&",";$BD3))=FAUX;ESTNUM(TROUVE(BF$2&",";$BE3))=FAUX;BF$1=$AW3);1;0)

Copy-paste this formula from the row #3-column #58 cell to the row #35459-column #396 cell.

6. Delete row #1.
7. Save the file as “adj0708_players_off_home.csv”.
Here is the file: adj0708_player_off_home

Fourthly, let’s create a file that codes whether a player was off court (1) or not (0) when his team was playing away.

1. Open a new Excel Spreadsheet and copy-paste the “d” file.
2. Go to the top of the file and insert a row.
3. Go to the row #1-column #58 cell and copy-paste as transposed the column “team1” of the “players” file (from row #2 to row #340).
4. Go to the row #2-column #58 cell and copy-paste as transposed the column “PID” of the “players” file (from row #2 to row #340).
5. Go to the row #3-column #58 cell and fill it with the following formula:

=SI(ET(ESTNUM(TROUVE(BF$2&",";$BD3))=FAUX;ESTNUM(TROUVE(BF$2&",";$BE3))=FAUX;BF$1=$AV3);1;0)

Copy-paste this formula from the row #3-column #58 cell to the row #35459-column #396 cell.

6. Delete row #1.
7. Save the file as “adj0708_players_off_away.csv”.
Here is the file: adj0708_player_off_away

Finally, let’s create a file that codes whether a player’s team was playing (1) or not (0). This file will allow us to check the four previous files.

1. Open a new Excel Spreadsheet and copy-paste the “d” file.
2. Go to the row #1-column #58 cell and copy-paste as transposed the column “team1” of the “players” file (from row #2 to row #340).
3. Go to the row #2-column #58 cell and fill it with the following formula:

=SI(OU(BF$1=$AV2;BF$1=$AW2);1;0)

Copy-paste this formula from the row #2-column #58 cell to the row #35459-column #396 cell.

4. Save the file as “adj0708_players_teams.csv”.
5. Open this new file, go to the row #1-column #58 cell and copy-paste as transposed the column “PID” of the “players” file (from row #2 to row #340). This will replace the previous content. Save the file. Here is the file: adj0708_players_teams

Step 5: R
Now we are going to check that the four files we created are correct. Let’s start by loading the csv files in R. Each time, you just need to select the corresponding csv file.

adj0708_players_on_home <- read.csv2 (file.choose())
head(adj0708_players_on_home)
adj0708_players_on_away <- read.csv2 (file.choose())
head(adj0708_players_on_away)
adj0708_players_off_home <- read.csv2 (file.choose())
head(adj0708_players_off_home)
adj0708_players_off_away <- read.csv2 (file.choose())
head(adj0708_players_off_away)
adj0708_players_teams <- read.csv2 (file.choose())
head(adj0708_players_teams)

Now execute this:

ncol_total <- ncol(adj0708_players_teams)
ncol_total

v1 <- data.frame(colSums(adj0708_players_teams[(ncol_d+1):ncol_total]))
names(v1) <- c("players_teams")
v1

v2 <- data.frame(colSums(adj0708_players_on_home[(ncol_d+1):ncol_total]))
names(v2) <- c("players_on_home")
v2

v3 <- data.frame(colSums(adj0708_players_on_away[(ncol_d+1):ncol_total]))
names(v3) <- c("players_on_away")
v3

v4 <- data.frame(colSums(adj0708_players_off_home[(ncol_d+1):ncol_total]))
names(v4) <- c("players_off_home")
v4

v5 <- data.frame(colSums(adj0708_players_off_away[(ncol_d+1):ncol_total]))
names(v5) <- c("players_off_away")
v5

v <- data.frame(adj0708_players_stats$PlayerTeams,v2,v3,v4,v5,v1)
v

v$x <- v$players_on_home + v$players_on_away + v$players_off_home + v$players_off_away
v

In the “v” data frame, the “players_teams” variable shows the number of game segments of a given team. For instance, Dallas had 2659 game segments during the 2007-08 season and Phoenix had 2235 game segments. The “players_on_home” variable shows the number of game segments in which a given player was on court while his team was playing at home. For instance, there are 907 game segments in which Player #1 (Josh Howard) was on court while his team (Dallas) was playing at home. Similarly, “players_off_home” variable shows the number of game segments in which a given player was off court while his team was playing at home. There are 419 game segments in which Player #1 was off court while his team was playing at home.

For a given player, the sum number of game segments on_home + number of game segments on_away + number of game segments off_home + number of game segments off_away must be equal to the number of game segments of his team. This is how we can check our computations.

If you look at the last two variables of the “v” data.frame, you can see that there is a match for players who played for one team during the season but not for players who played for several teams. This is because for those players, we took into account only the first team for which they played during the season (cf. “team1” variable). Again, note that we did this only for the sake of computational simplicity.

Step 6: R
Now we can compute Net Plus/Minus.

First, let’s compute “net on_home”: the mean of the per-possession Plus/Minus of all game segments in which a given player was on court while this team was playing at home.

adj0708_players_stats$net_on_home <- rep(0, nrow(adj0708_players_stats))

for (i in 1:nrow(adj0708_players_stats))
adj0708_players_stats$net_on_home[i] <- mean(adj0708_players_on_home[adj0708_players_on_home[,(ncol_d+i)] == 1,]$PM_perposs) 
head(adj0708_players_stats)

Second, let’s compute “net on_away”: the mean of the per-possession Plus/Minus of all game segments in which a given player was on court while this team was playing away.

adj0708_players_stats$net_on_away <- rep(0, nrow(adj0708_players_stats))

for (i in 1:nrow(adj0708_players_stats))
adj0708_players_stats$net_on_away[i] <- (-1)*(mean(adj0708_players_on_away[adj0708_players_on_away[,(ncol_d+i)] == 1,]$PM_perposs)) 
head(adj0708_players_stats)

Third, let’s compute “net off_home”: the mean of the per-possession Plus/Minus of all game segments in which a given player was off court while this team was playing at home.

adj0708_players_stats$net_off_home <- rep(0, nrow(adj0708_players_stats))

for (i in 1:nrow(adj0708_players_stats))
adj0708_players_stats$net_off_home[i] <- mean(adj0708_players_off_home[adj0708_players_off_home[,(ncol_d+i)] == 1,]$PM_perposs) 
head(adj0708_players_stats)

Fourth, let’s compute “net off_away”: the mean of the per-possession Plus/Minus of all game segments in which a given player was off court while this team was playing away.

adj0708_players_stats$net_off_away <- rep(0, nrow(adj0708_players_stats))

for (i in 1:nrow(adj0708_players_stats))
adj0708_players_stats$net_off_away[i] <- (-1)*(mean(adj0708_players_off_away[adj0708_players_off_away[,(ncol_d+i)] == 1,]$PM_perposs)) 
head(adj0708_players_stats)

“net_on” is the mean of “net on_home” and “net on_away” while “net_off” is the mean of “net off_home” and “net off_away”. “net_PM” is the raw difference between “net_on” and “net_off”:

adj0708_players_stats$net_on <- round((adj0708_players_stats$net_on_home + adj0708_players_stats$net_on_away)/2,2)
adj0708_players_stats$net_off <- round((adj0708_players_stats$net_off_home + adj0708_players_stats$net_off_away)/2,2)
adj0708_players_stats$net_PM <- (adj0708_players_stats$net_on - adj0708_players_stats$net_off)

head(adj0708_players_stats)

Step 7: R
We have Net Plus/Minus, now let’s compute Adjusted Plus/Minus. As you may know, it is computed by running a regression.

First, let’s create the file:

1. Open a new Excel Spreadsheet and copy-paste the “d” file.
2. Go to the row #1-column #58 cell and copy-paste as transposed the column “PID” of the “players” file (from row #2 to row #340).
3. Go to the row #2-column #58 cell and fill it with the following formula:

=SI(ESTNUM(TROUVE(BF$1&",";$BD2));1;SI(ESTNUM(TROUVE(BF$1&",";$BE2));-1;0))

Copy-paste this formula from the row #2-column #58 cell to the row #35459-column #396 cell.

4. Save the file as “adj0708.csv”. Here is the file: adj0708

Load the file in R:

adj0708 <- read.csv2 (file.choose())
head(adj0708)

Execute this to put the “PM_perposs” variable into the right format:

adj0708$PM_perposs <- as.numeric(as.character(adj0708$PM_perposs))

Here is the command to perform the regression:

fit.adj0708 <- lm(adj0708$PM_perposs ~ adj0708$P1 + adj0708$P2 + adj0708$P3 + adj0708$P4 + adj0708$P5 + adj0708$P6 + adj0708$P7 + 
adj0708$P8 + adj0708$P10 + adj0708$P12 + adj0708$P14 + adj0708$P15 + adj0708$P16 + adj0708$P17 + 
adj0708$P19 + adj0708$P20 + adj0708$P21 + adj0708$P22 + adj0708$P23 + adj0708$P24 + adj0708$P25 + 
adj0708$P26 + adj0708$P28 + adj0708$P29 + adj0708$P30 + adj0708$P31 + adj0708$P32 + adj0708$P33 + 
adj0708$P35 + adj0708$P36 + adj0708$P37 + adj0708$P39 + adj0708$P40 + adj0708$P42 + adj0708$P45 + 
adj0708$P46 + adj0708$P47 + adj0708$P48 + adj0708$P49 + adj0708$P52 + adj0708$P54 + adj0708$P55 + 
adj0708$P56 + adj0708$P59 + adj0708$P60 + adj0708$P62 + adj0708$P63 + adj0708$P65 + adj0708$P67 + 
adj0708$P68 + adj0708$P69 + adj0708$P70 + adj0708$P71 + adj0708$P75 + adj0708$P76 + adj0708$P77 + 
adj0708$P78 + adj0708$P79 + adj0708$P80 + adj0708$P83 + adj0708$P85 + adj0708$P86 + adj0708$P87 + 
adj0708$P88 + adj0708$P89 + adj0708$P90 + adj0708$P91 + adj0708$P92 + adj0708$P93 + adj0708$P94 + 
adj0708$P95 + adj0708$P96 + adj0708$P97 + adj0708$P98 + adj0708$P100 + adj0708$P101 + adj0708$P103 + 
adj0708$P109 + adj0708$P110 + adj0708$P111 + adj0708$P112 + adj0708$P113 + adj0708$P114 + adj0708$P115 + 
adj0708$P116 + adj0708$P117 + adj0708$P118 + adj0708$P119 + adj0708$P121 + adj0708$P122 + adj0708$P123 + 
adj0708$P125 + adj0708$P126 + adj0708$P128 + adj0708$P130 + adj0708$P132 + adj0708$P133 + adj0708$P136 + 
adj0708$P137 + adj0708$P138 + adj0708$P139 + adj0708$P140 + adj0708$P141 + adj0708$P143 + adj0708$P145 + 
adj0708$P147 + adj0708$P148 + adj0708$P149 + adj0708$P150 + adj0708$P151 + adj0708$P152 + adj0708$P154 + 
adj0708$P155 + adj0708$P158 + adj0708$P159 + adj0708$P163 + adj0708$P164 + adj0708$P165 + adj0708$P166 + 
adj0708$P168 + adj0708$P169 + adj0708$P170 + adj0708$P171 + adj0708$P172 + adj0708$P173 + adj0708$P174 + 
adj0708$P176 + adj0708$P180 + adj0708$P181 + adj0708$P185 + adj0708$P186 + adj0708$P187 + adj0708$P189 + 
adj0708$P190 + adj0708$P191 + adj0708$P193 + adj0708$P194 + adj0708$P196 + adj0708$P197 + adj0708$P198 + 
adj0708$P200 + adj0708$P201 + adj0708$P202 + adj0708$P203 + adj0708$P204 + adj0708$P205 + adj0708$P206 + 
adj0708$P209 + adj0708$P210 + adj0708$P211 + adj0708$P214 + adj0708$P215 + adj0708$P218 + adj0708$P220 + 
adj0708$P221 + adj0708$P222 + adj0708$P223 + adj0708$P224 + adj0708$P226 + adj0708$P231 + adj0708$P235 + 
adj0708$P236 + adj0708$P237 + adj0708$P239 + adj0708$P240 + adj0708$P241 + adj0708$P243 + adj0708$P244 + 
adj0708$P245 + adj0708$P246 + adj0708$P247 + adj0708$P249 + adj0708$P252 + adj0708$P253 + adj0708$P254 + 
adj0708$P255 + adj0708$P257 + adj0708$P258 + adj0708$P260 + adj0708$P261 + adj0708$P264 + adj0708$P265 + 
adj0708$P270 + adj0708$P271 + adj0708$P272 + adj0708$P273 + adj0708$P274 + adj0708$P275 + adj0708$P276 + 
adj0708$P277 + adj0708$P278 + adj0708$P279 + adj0708$P280 + adj0708$P281 + adj0708$P283 + adj0708$P284 + 
adj0708$P285 + adj0708$P287 + adj0708$P288 + adj0708$P289 + adj0708$P292 + adj0708$P293 + adj0708$P295 + 
adj0708$P296 + adj0708$P297 + adj0708$P298 + adj0708$P302 + adj0708$P305 + adj0708$P306 + adj0708$P307 + 
adj0708$P308 + adj0708$P311 + adj0708$P312 + adj0708$P313 + adj0708$P315 + adj0708$P316 + adj0708$P318 + 
adj0708$P319 + adj0708$P321 + adj0708$P322 + adj0708$P324 + adj0708$P326 + adj0708$P327 + adj0708$P328 + 
adj0708$P329 + adj0708$P330 + adj0708$P331 + adj0708$P337 + adj0708$P344 + adj0708$P354 + adj0708$P355 + 
adj0708$P356 + adj0708$P357 + adj0708$P362 + adj0708$P366 + adj0708$P367 + adj0708$P372 + adj0708$P377 + 
adj0708$P382 + adj0708$P383 + adj0708$P389 + adj0708$P390 + adj0708$P401 + adj0708$P411 + adj0708$P413 + 
adj0708$P416 + adj0708$P417 + adj0708$P419 + adj0708$P426 + adj0708$P430 + adj0708$P444 + adj0708$P458 + 
adj0708$P459 + adj0708$P468 + adj0708$P470 + adj0708$P484 + adj0708$P486 + adj0708$P488 + adj0708$P489 + 
adj0708$P492 + adj0708$P501 + adj0708$P505 + adj0708$P508 + adj0708$P510 + adj0708$P516 + adj0708$P517 + 
adj0708$P518 + adj0708$P537 + adj0708$P541 + adj0708$P544 + adj0708$P551 + adj0708$P552 + adj0708$P553 + 
adj0708$P557 + adj0708$P558 + adj0708$P559 + adj0708$P561 + adj0708$P563 + adj0708$P565 + adj0708$P573 + 
adj0708$P575 + adj0708$P577 + adj0708$P578 + adj0708$P579 + adj0708$P581 + adj0708$P588 + adj0708$P589 + 
adj0708$P592 + adj0708$P593 + adj0708$P594 + adj0708$P597 + adj0708$P598 + adj0708$P601 + adj0708$P602 + 
adj0708$P605 + adj0708$P606 + adj0708$P607 + adj0708$P608 + adj0708$P609 + adj0708$P612 + adj0708$P614 + 
adj0708$P617 + adj0708$P622 + adj0708$P629 + adj0708$P640 + adj0708$P643 + adj0708$P665 + adj0708$P666 + 
adj0708$P667 + adj0708$P674 + adj0708$P675 + adj0708$P678 + adj0708$P679 + adj0708$P681 + adj0708$P682 + 
adj0708$P685 + adj0708$P686 + adj0708$P689 + adj0708$P691 + adj0708$P693 + adj0708$P694 + adj0708$P695 + 
adj0708$P696 + adj0708$P699 + adj0708$P700 + adj0708$P701 + adj0708$P703 + adj0708$P705 + adj0708$P707 + 
adj0708$P708 + adj0708$P709 + adj0708$P711 + adj0708$P712 + adj0708$P713 + adj0708$P714 + adj0708$P715 + 
adj0708$P716 + adj0708$P719 + adj0708$P722, weights=adj0708$Poss)

Let’s see the results of the regression:

reg <- summary(fit.adj0708)
reg

Basically, Adjusted Plus/Minus corresponds to “Estimate” (the regression coefficients). Let’s add the results to the “adj0708_players_stats” file which already contains Net Plus/Minus:

X1 <- data.frame(reg$coefficients)
adj_PM <- X1[-c(1),]
names(adj_PM) <- c("estimate","std.Error","t.value","p.value")
adj_PM

adj0708_players_stats$adj_PM <- round(adj_PM$estimate,2)
head(adj0708_players_stats)

Now we have Net Plus/Minus and Adjusted Plus/Minus in the same file.

Step 8: R and Excel
This is the last step… We are simply going to format the results.

First, let’s delete the players who played for several teams during the season. That represents 46 players out of the 339. We do this because the way we computed their Net Plus/Minus is not entirely correct (see above). But once again, note that we voluntarily simplified this computation only in order to keep this tutorial quite simple.

adj0708_players_stats$selec <- rep(0, nrow(adj0708_players_stats))

for (i in 1:nrow(adj0708_players_stats))
if (length(strsplit(adj0708_players_stats[i,]$PlayerTeams, " ")[[1]]) > 1) {adj0708_players_stats$selec[i] <- 1} else {adj0708_players_stats$selec[i] <- 0}
head(adj0708_players_stats)

sum(adj0708_players_stats$selec) ##number of players deleted

adj0708_players_stats_OK <- adj0708_players_stats[adj0708_players_stats$selec == 0,]
head(adj0708_players_stats_OK)

Second, let’s sort the players by their Net Plus/Minus:

adj0708_players_stats_net_PM_sorted <- adj0708_players_stats_OK[order(-adj0708_players_stats_OK$net_PM),]
head(adj0708_players_stats_net_PM_sorted)

Third, let’s put the relevant results into a final frame:

results <- data.frame(adj0708_players_stats_net_PM_sorted$PlayerID, adj0708_players_stats_net_PM_sorted$PlayerTrueName, adj0708_players_stats_net_PM_sorted$PlayerTeams, round(adj0708_players_stats_net_PM_sorted$SimpleMin,2), adj0708_players_stats_net_PM_sorted$net_on, adj0708_players_stats_net_PM_sorted$net_off, adj0708_players_stats_net_PM_sorted$net_PM, adj0708_players_stats_net_PM_sorted$adj_PM)

names(results) <- c("PlayerID", "PlayerTrueName", "PlayerTeams", "Min", "net_on", "net_off", "net_PM", "adj_PM")
results

and export the file to an Excel Spreadsheet:

write.xlsx(x = results, file = "results_check.xlsx", sheetName = "Sheet1", col.names = TRUE, row.names = FALSE)

That’s it! We’re done!

Here are the results:

NPM_APM tab results
Note that these results (both Net Plus/Minus and Adjusted Plus/Minus) are a bit different from those reported on BasketballValue. Regarding Net Plus/Minus, I know the reasons of the discrepancy, it’s not really important. The calculations described here are valid.

Share on FacebookTweet about this on TwitterEmail this to someone