An Econometric Analysis of Home Field Advantage in the World’s Largest Sports: Pre and Post COVID-19 Outbreak

Abstract: The purpose of this study is to measure home field advantage across sports and within sports. The study also analyzes the soccer games played in empty stadiums during the COVID-19 pandemic, measuring whether there is any difference between playing in front of fans and without fans, something that has not been captured before. We use data for the years 2016-2019 from the top five UEFA soccer leagues, the NBA and the NFL. We find that home field advantage is present in soccer; however, there is no significant difference between the top five leagues or across the four years studied. We also find that there is no home advantage for either the NBA or the NFL; however, there is a difference between their respective divisions. Finally, we show that playing without fans in the UEFA leagues because of COVID-19 causes home field advantage to dissipate.


Introduction
Sports spectators are among the most spirited people in the world, and they have undying passion for their clubs. For instance, every four years, around three million people travel to attend the Fédération Internationale de Football Association (FIFA) World Cup. Broadcasting data from FIFA.com (2018) show that 3.572 billion people, more than half of the global population aged four and over, watched the record-breaking 2018 World Cup in Russia. Audiences this large plausibly have some impact on how teams play. Players may respond to large audiences differently: some may play better in a noisy stadium, while others may buckle under the pressure of tens of thousands of spectators jeering at them. Measuring the effect of home field advantage and large crowds may give the sporting world some insight into how sports are influenced by forces outside of the playing field.
In the model set forth in this study, we integrate variables such as team salaries, referee behavior and in-game statistics that have not been tested against each other before. The study measures home field advantage not only across sports, but also within the different divisional bodies of each sport. Our analysis can therefore reveal any innate differences in home field advantage between the divisions, or even the individual teams, of a particular sport. We also apply similar statistical testing to the European soccer games played under the COVID-19 fan-free regulations. In doing so, our models determine whether there is any statistically significant difference between playing in front of fans and without fans. To make such a comparison, we first construct our own definition of home field advantage using our own variables and regression analysis. Once we have a quantitative estimate of this measure, we run the same test for the European soccer games played under their respective COVID-19 regulations and compare the two measurements of home field advantage.
The main objectives of this study are:
(i) Finding the effect of home field advantage on the outcome of a game in different sports
(ii) Identifying differences in home field advantage within individual sporting leagues as opposed to across different sports
(iii) Calculating any change in home field advantage that arises from the COVID-19 outbreak regulations in sports

Literature Review
Chatterjee et al. (1994) uses both linear and logistic regressions to calculate which elements of basketball have the largest impact on team performance in the National Basketball Association (NBA). The win-loss percentage of a given team is their dependent variable. Together accounting for more than 90% of the variation in their data, field goals, free throws, rebounds and turnovers are among the independent variables that Chatterjee et al. (1994) finds to be statistically significant. Carmichael et al. (2000) attempts to capture the performance of teams in the English Premier League (EPL) by adapting a production function used to measure team performance in Major League Baseball (MLB). They include variables such as goal differential and shot differential to measure a given team's success rate. While both of these studies highlight some of the more important elements of basketball and soccer, they fail to account for a crucial element of team performance: home field advantage.
In the research performed by Waters and Lovell (2002) and Trandel and Maxcy (2011), home advantage is proposed as a potential source of influence on many sporting competitions around the world. Waters and Lovell (2002) discusses some of the characteristics of home field advantage in English soccer with regard to players' retrospective perceptions of confidence and psychology. They claim that playing at home allows players to feel more confident when they are winning a game. However, when the home team is losing, these players feel significantly less confident than they would if they were playing on the road. They also
find that football players feel as though "they are not expected to win" when they play an away game and, thus, feel less pressure. The study concludes that players feel more positive and more confident at home than on the road, and more anxious on the road.
Trandel and Maxcy (2011) notes that sporting analysts have measured the competitive balance of sporting leagues by computing the standard deviation of the winning percentages of all the teams in a given league and comparing it to that of other leagues. One issue with this method is that leagues have differing numbers of games in their seasons. For example, the MLB plays a 162-game season whereas the National Football League (NFL) plays only 16. It would be inaccurate to compare the winning-percentage standard deviations of these leagues directly because any individual result from a short season has a greater impact on a team's record than it would in a long season. Trandel and Maxcy (2011) fixes this problem by dividing the standard deviation of team winning percentages by the standard deviation that would be expected if each team were equally likely to win each game. However, it is never the case in any professional sport that each team has the same chance of winning. In fact, Trandel and Maxcy (2011) shows that the home team tends to win more than 50% of the time on average, which contributes to the definition of home field advantage. When this effect is not taken into account, the ratio of standard deviations is smaller than it should be, so the calculation overestimates the competitive balance of a particular league or sport. Trandel and Maxcy (2011) attempts to adjust for this bias by creating a formula that calculates a "home-advantage-corrected ideal standard deviation." They find that the NBA has the largest home field advantage among North American sports. In correcting for this bias, they show
how US sports are competitively imbalanced. They also note that this adjustment modestly decreases the competitive difference between the NFL and MLB.
Given that both Waters and Lovell (2002) and Trandel and Maxcy (2011) suggest potential differences between playing home and away games, we wish to replicate and expand upon their research by measuring the performance of sports teams while accounting for the advantage that teams have when playing at home. Home field advantage has been quantified in a wide range of literature on individual sports (Clarke and Norman, 1995; Nevill et al., 1996; Omotayo, 2003; Pollard et al., 2008; García et al., 2013; Pollard and Gómez, 2013; Kotecki, 2014; Van Damme and Baert, 2019) and even across multiple sports (Trandel and Maxcy, 2011; Pollard and Gómez, 2014; Pollard et al., 2017). Given the extensive amount of research on the subject, we knew that an experiment involving home field advantage was easily replicable. While the majority of home field advantage research pertains specifically to soccer, we found that we could best set up our study by incorporating aspects of both single-sport research and multi-sport comparisons.
Clarke and Norman (1995) focuses on the EPL to analyze the impact of home field advantage on the outcome of soccer matches from 1981-1990. The authors established a dummy variable for each team that competed in the EPL during those years. They found that the difference in home field advantage between years was statistically significant. These findings may be outdated by now, since the magnitude of home field advantage may have changed since then. The paper does not omit any information needed for replication: it presents each computation we would need to reproduce something similar in our experiment.
Nevill et al.
(1996) uses the claim from Schwartz and Barsky (1977) that there was a significant relationship between home winning percentage and attendance in the MLB to motivate their study of home field advantage in English and Scottish soccer. They found that home teams won 48% of their home matches when playing in front of low-density crowds. In contrast, teams playing in front of high-density crowds won 57% of their home matches. Nevill et al. (1996) notes that this conflicts with the findings of Dowie (1982) and Pollard (1986) that absolute crowd size and crowd density, respectively, have no significant effect on home wins. Kotecki (2014) reached a conclusion similar to that of Nevill et al. (1996) using a logit regression: a one standard deviation increase in attendance results in a 2.7% greater chance that the home team will win. For the reasons set forth in both Nevill et al. (1996) and Kotecki (2014), we decided to include a regression of home win percentage on home attendance in our study.
Nevill et al. (1996) also highlighted the importance of referee behavior in the outcomes of sporting events, as suggested by Agnew and Carron (1994), given evidence that referees officiate matches in favor of the home team (Greer, 1983). The study tries to show that the home field advantage in both English and Scottish soccer is associated with attendance, as explained by refereeing behavior in the 1992 and 1993 seasons. In essence, the authors are trying to disprove the findings of Dowie (1982) and Pollard (1986). Nevill et al.
(1996) quantifies referee behavior by recording the number of sending-offs and converted penalty kicks in the 1992 and 1993 seasons. The study found, with statistical significance, that leagues with higher attendance exhibited a greater home field advantage than leagues with lower attendance, and that the home teams of these higher-attendance leagues also received fewer sending-offs and converted more penalties. These findings conflict with those of Dowie (1982) and Pollard (1986). Nevill et al. (1996) suggests this is because Dowie (1982) and Pollard (1986) did not test leagues with lower attendance. In addition, Kotecki (2014) found that referees tend to call games in favor of the home team. These findings suggest that some measure of referee behavior is necessary in our experiment. While this may be the effect that we expect to find in our paper, we believe that there is a more accurate measure of referee behavior than the one set forth in Nevill et al. (1996). It is important to note that the number of sending-offs and converted penalties may, in fact, be correlated with each other since it is very possible to receive a red card and give away a penalty kick in the same play. However, evidence from Altman (2014) suggests that incurring a red card and a penalty kick simultaneously resulted in no additional disadvantage; instead, scores were consistent with the two effects having occurred separately. Regardless, we have decided to incorporate a referee behavior variable in our experiment, as suggested by Nevill et al. (1996), but in a form of our own choosing.
In addition to our other independent variables, we wanted to control for any effects that may arise from in-game statistics. For example, Chatterjee et al. (1994) finds that field goals, free throws, rebounds and turnovers are statistically significant in their research. Their independent variables also include the assist and steal differentials. In Carmichael et al. (2000), the authors take a production function that has been used to measure MLB teams' success and transpose it to the EPL. They state that they are the first to create a production function model for the EPL. In the production function, they include variables such as goal differential and shot differential after identifying them as key metrics of a team's performance, and they find these variables to be statistically significant. The research mentioned above suggests that including in-game statistics as independent variables in our study will help reduce endogeneity from omitted variable bias in our model.
Pollard and Gómez (2014) tested which country among national soccer leagues worldwide exhibited the largest home field advantage between the years 2006 and 2012. The study found that it was greatest in Nigeria (86.82%). Bosnia-Herzegovina, Guatemala, Indonesia, Algeria, Bolivia and Ghana all exhibited home field advantages between 70 and 80%. This evidence suggests that we must control for league in our data because it supports our alternative hypothesis that different leagues have different home field advantages, such that $H_A \neq 0$.
In his experiment, Omotayo (2003) uses the number of goals scored as his measure of home field advantage instead of winning percentage. This is useful in a study that considers only soccer because it avoids the points systems used in soccer leagues around the world, which assign different point totals to the potential outcomes of a match: a win, a draw or a loss. However, for the purpose of our experiment, we find this method unsuitable because it cannot be compared across sports. A goal is the only method of scoring in soccer and it counts as one. In the NBA, a free throw is worth one point, a field goal two points and a three-pointer three points, and teams tend to score over a hundred points a game. In American football, a touchdown is worth six points, an extra point is worth one, a field goal is worth three points and a safety is worth two. The scoring systems of different sports clearly differ immensely. Thus, we must measure home field advantage in terms of a percentage rather than in points or goals as in Omotayo (2003). Reade et al.
(2020) links the COVID-19 pandemic with a significantly smaller home field advantage in sport. Via a series of natural experiments, they test whether social pressure affects referee behavior and the outcomes of European soccer matches. They observe that, in matches played in an empty stadium, referees cautioned the visiting team significantly less often, by over one third of a yellow card per match. Thus, the authors conclude that referees favored the home team less in matches played behind closed doors. Even though the spread of COVID-19 altered the myriad ways we consume sporting events, these results highlight a recent theme in sports economics: home field advantage decreases in the absence of fans. Naturally, this body of literature is fairly limited, due to the recent nature of its subject matter. Our intention is to build upon this literature and encourage further research into home field advantage and its many factors.

Conceptual Framework
In order to measure the success of teams in the UEFA leagues, NBA and NFL, we need a dependent variable that represents each sport's metric of success. In the NBA and NFL, the success of a team's season is measured by its record, the combination of wins and losses accumulated throughout the season. The teams with the most wins are then seeded into the playoffs to compete for the championship. In the UEFA leagues, success is measured by a team's position in the league table, which is determined by how many points the team has acquired over the course of the season. In European soccer, a team that wins a game is awarded three points toward its total in the league table. For a draw, each team earns one point, and a team that loses earns zero points. At the end of the season these points are added up and the team with the most points wins the league.
Given that there are two different measures of team success across our three sports, we create two dependent variables: one for UEFA and one for the NBA and NFL. Since the primary goal of NBA and NFL teams is to acquire as many wins as possible in their season, we use winning percentage (win_percent_i) as the dependent variable for those two leagues. The winning percentage of each team measures how much its record grows with each game. For UEFA, we use the natural log of average points per game (ln_avg_ppg_i) as the dependent variable, given that the primary goal of European soccer teams is to gain as many points as they can in a season. Taking the natural log lets us interpret the results as percentage changes in the points a team earns each game. By splitting our dependent variable into two different measurements, we account for the fact that success is calculated differently in UEFA leagues than it is in the NBA and NFL. That is to say, we measure the improvement in each team's season after each game by either how many points or how many wins they acquired from that game.
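The two dependent variables described above can be illustrated with a short Python sketch. The function names and the sample records are hypothetical, and the three-one-zero points scheme follows the description in the text; the paper does not show how its own data were processed, so this is only one plausible way to compute the quantities.

```python
import math


def win_percent(wins: int, games: int) -> float:
    """Share of games won (assumed here to be wins divided by games played)."""
    if games <= 0:
        raise ValueError("games must be positive")
    return wins / games


def ln_avg_ppg(wins: int, draws: int, losses: int) -> float:
    """Natural log of average league points per game (3 for a win, 1 for a draw)."""
    games = wins + draws + losses
    points = 3 * wins + 1 * draws
    if games <= 0 or points <= 0:
        # The log is undefined for a team with no games or no points.
        raise ValueError("need at least one game and at least one point")
    return math.log(points / games)


# Hypothetical records: an NFL team's 16-game season and a 38-match soccer season.
nfl_value = win_percent(wins=12, games=16)
soccer_value = ln_avg_ppg(wins=20, draws=10, losses=8)
```

In practice each function would be applied twice per team and season, once to the home record and once to the away record, so that the two rows differ only in the home dummy.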
Given that we are measuring the home field advantage of teams in our respective leagues, we use a dummy independent variable (home_i) to describe which games are home and away. If a team is playing at home, home_i equals one; when a team is away, it equals zero. Nevill et al. (1996) and Kotecki (2014) both suggest that we should include a variable for referee behavior in our models. We chose to adopt our own interpretation of such a variable because we found that the measure of referee behavior set forth in Nevill et al. (1996) could introduce bias. We measure referee behavior in soccer by the differential number of cards received per game (fouls_diff_i). A yellow card is worth one and a red card is worth two, given that a player who receives two yellow cards in a match is shown a red card and sent off the field. For example, if Arsenal's average is 2.4 per match, they typically receive slightly more than the equivalent of one red card or two yellow cards more than the other team in a given match. We measure referee behavior in the NBA by the differential number of fouls committed per contest (fouls_diff_i). Similarly, we measure referee behavior in the NFL by the differential number of penalties committed per contest (penalties_diff_i).
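The card weighting used for the soccer referee variable can be sketched as follows. This is an illustration under stated assumptions: the function names and match data are hypothetical, and the sign convention (team minus opponent) is assumed because the text does not state it.

```python
YELLOW_WEIGHT = 1  # a yellow card counts as one
RED_WEIGHT = 2     # a red card counts as two


def card_points(yellows: int, reds: int) -> int:
    """Weighted card total for one team in one match."""
    return YELLOW_WEIGHT * yellows + RED_WEIGHT * reds


def fouls_diff(team_cards, opponent_cards) -> float:
    """Average weighted cards per match for a team minus the same average
    for its opponents. Each argument is a list of (yellows, reds) tuples,
    one tuple per match, in the same match order."""
    n = len(team_cards)
    if n == 0 or n != len(opponent_cards):
        raise ValueError("need the same positive number of matches for both sides")
    team_avg = sum(card_points(y, r) for y, r in team_cards) / n
    opp_avg = sum(card_points(y, r) for y, r in opponent_cards) / n
    return team_avg - opp_avg


# Hypothetical three-match sample.
team = [(2, 0), (1, 1), (3, 0)]       # weighted totals: 2, 3, 3
opponents = [(1, 0), (2, 0), (0, 0)]  # weighted totals: 1, 2, 0
example_diff = fouls_diff(team, opponents)
```

A positive value under this convention means the team was penalized more heavily than its opponents on average.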
We include a fixed effect for the four seasons in our sample, as suggested by Clarke and Norman (1995), who find the differences in home field advantage between years to be statistically significant. Even though their research was performed thirty years ago, we still deem it necessary to account for the year in our models. Not only does this remove a potential source of endogeneity from our model, but using several seasons also gives us a larger sample. By including data from four years, 2016-2019, we can test whether the relationship found by Clarke and Norman (1995) still holds today and whether there is any statistically significant difference between these more recent years.
We include various in-game statistics from soccer, basketball and football as independent variables in our study. This idea was presented in Carmichael et al. (2000), who proposed a production function that includes variables such as goal differential and shot differential and found these differentials to be statistically significant. In our study, we create variables for both of these metrics for our UEFA regressions (goal_diff_i; shots_diff_i). Similar to Chatterjee et al. (1994), we include the differentials in score, rebounds, assists, turnovers and steals as independent variables in our NBA regressions (score_diff_i; rebounds_diff_i; assists_diff_i; turnover_diff_i; steals_diff_i). For the NFL, we came up with our own in-game statistic differentials: yards, score, red zone opportunities, passer rating and turnovers (yards_diff_i; score_diff_i; redzone_diff_i; passer_rating_diff_i; turnover_margin_i).
Similar to Pollard and Gómez (2014), we included the different leagues and divisions in each sport. In their study, they found that there was a statistically significant difference between national soccer leagues. For this reason, we tested for home field advantage in the five most prominent UEFA leagues: the Bundesliga, La Liga, the EPL, Serie A and Ligue 1. We then paralleled this practice for the NBA and the NFL, where each league is divided into multiple divisions. In the NBA, these are the Atlantic, Central, Southeast, Northwest, Pacific and Southwest divisions. The NFL is divided into two conferences, the American Football Conference (AFC) and the National Football Conference (NFC), and each conference is divided into four divisions: North, South, East and West. We included all of these in our model to check for any statistically significant difference between them.
We also include average_salary_i as an independent variable in the study. This value indicates the average player salary on a given team. The payroll each team incurs usually indicates the amount of talent on its roster. For example, players like Cristiano Ronaldo, Stephen Curry and Aaron Rodgers make significantly more than an average player. Thus, a team consisting of higher-contract players has a greater chance of winning games than a team filled with lower-contract players. We incorporate this variable into our model to control for any effect that may arise from teams with larger payrolls due to the talent on their rosters. The idea of these large salaries influencing our data is briefly expanded upon in Sharp (2019). He claims that players with large salaries welcome the chance to play away matches because, after their athletic requirements are fulfilled, these high-earning athletes can spend their money and enjoy time with their friends in the away cities. This is particularly convincing when one considers a team from a smaller city, such as the Green Bay Packers, traveling to a bustling, lavish city like Los Angeles to play the Rams.

Data Description and Analysis
We use panel data for the years 2016-2019 from UEFA, NBA and NFL statistics databases. Before running any of our main models, we examine the relationship between home or away status and team success in each sport. For UEFA, we created a scatter plot of every team's average points per game against their home or away status. For the NBA and the NFL, we created a scatter plot to observe the relationship between home and away win percentages. Keep in mind that home and away status is represented by the home_i dummy variable. As such, a home team takes a value of 1 and an away team takes a value of 0 for this independent variable. Away teams represent the reference category.
Figure 1 shows the map of country home field advantages as found in Pollard and Gómez (2014). In Fig. 2, we can observe an upward sloping fitted line, which indicates that there is a difference in the mean points a team earns when playing at home versus on the road. On average, teams playing at home earn more points per game than teams playing away. In Figs. 3 and 4, we find the same relationship; however, the trend line for the NFL is not as steep as it is for the other two sports. We also observe the relationship between a home team's attendance and the outcome of the game in question. Figures 5, 6 and 7 display the relationship between home team success and home attendance. For UEFA, we use average points per game and for the NBA and NFL, we use winning percentage. There is a distinct, positive correlation between attendance and the outcome in the case of UEFA and the NBA. However, we observe a very weak to almost non-existent correlation in the case of the NFL. We suppose this is because NFL stadiums are usually close to full, especially when compared to NBA arenas, given that fans only have sixteen chances to watch their team play each season (Goodell, 2019).
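The strength of the attendance relationship described above can be summarized with a Pearson correlation coefficient. The sketch below is illustrative only: the paper reports the relationship graphically, the function name is hypothetical, and the attendance and points figures are invented for the example.

```python
import math


def pearson_r(x, y) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical home attendance and average points per game for five teams.
attendance = [30000, 45000, 52000, 60000, 75000]
avg_ppg = [1.1, 1.3, 1.6, 1.5, 2.0]
r_attendance = pearson_r(attendance, avg_ppg)
```

A value near zero would correspond to the weak pattern described for the NFL, and a clearly positive value to the pattern described for UEFA and the NBA.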
Lastly, we analyze referee bias toward the home team. In Figs. 8, 9 and 10, we illustrate the relationship between the fouls or penalties awarded to the home and away teams in any given contest. These figures demonstrate that, for all three sports, the home team is less likely to be called for a foul. This raises an open question: do teams actually commit fewer fouls and play a more error-free game at home, or are referees biased, calling fewer fouls on the home team? Figs. 8, 9 and 10 suggest the latter, but it is certainly not proven.

Empirical Models and Estimation Methods
In our study, we ran a bivariate model, a pooled multivariate regression, a one-way fixed effect regression and a two-way fixed effect regression, with both fixed effects models using the Least Squares Dummy Variable (LSDV) method. For the bivariate model, we created a home dummy variable that represents either home or away. Our bivariate models for the five main UEFA leagues, the NFL and the NBA are:
i. UEFA: $\ln(\mathrm{avg\_ppg}_i) = \alpha + \beta_1 \mathrm{home}_i + \varepsilon_i$
ii. NFL: $\mathrm{win\_percent}_i = \alpha + \beta_1 \mathrm{home}_i + \varepsilon_i$
iii. NBA: $\mathrm{win\_percent}_i = \alpha + \beta_1 \mathrm{home}_i + \varepsilon_i$
In the NBA, game results are binary in that a team must either win or lose (NBA Rulebook, 2019). If a contest is tied at the end of regulation, successive overtimes are played until a winner is decided. The NFL rulebook (Goodell, 2019) also has an overtime rule; however, only one overtime is played and, if the score remains tied, the end result is a draw. This rule has produced very few draws, as seen in historical NFL team records. Because draws rarely happen over the course of an NFL season, the NFL, as per the rulebook, seeds its teams using their record, a measure of winning percentage, much like the NBA. Therefore, for the NFL and NBA regressions, we use winning percentage (win_percent) as the dependent variable.
Win percentage in the UEFA leagues would not capture home field advantage, given that matches end in draws quite frequently. If we want to take these draws into account, we cannot use win percentage as we do for the NBA and NFL. Given that soccer measures team success in terms of points rather than winning records, we use the natural log of average points per game (ln(avg_ppg)) as the dependent variable.
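A bivariate regression on a single dummy variable has a simple interpretation: the intercept is the mean outcome for the reference category (away) and the slope is the home mean minus the away mean. The following Python sketch checks this on invented data. The function names and observations are hypothetical; the percentage conversion for a log dependent variable is the standard 100(e^β - 1) formula, not something stated in the paper.

```python
import math


def ols_simple(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def log_dummy_pct_effect(beta: float) -> float:
    """Exact percentage change in y implied by a dummy coefficient
    when the dependent variable is ln(y)."""
    return 100.0 * (math.exp(beta) - 1.0)


# Hypothetical observations: three home rows followed by three away rows.
home = [1, 1, 1, 0, 0, 0]
win_percent = [0.60, 0.70, 0.65, 0.50, 0.45, 0.55]

alpha, beta_1 = ols_simple(home, win_percent)
home_mean = sum(w for h, w in zip(home, win_percent) if h == 1) / 3
away_mean = sum(w for h, w in zip(home, win_percent) if h == 0) / 3
```

For the UEFA specification, where the dependent variable is in logs, `log_dummy_pct_effect` converts the estimated home coefficient into a percentage difference in points per game.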
We have created a multivariate model for each sport. Since these multivariate models include other independent variables, they may avoid potential sources of endogeneity that stem from Omitted Variable Bias (OVB). In these regressions, we discern how each variable affects the percentage change in a given team's average points per game or win percentage. All of the variables in these models are measured on an average per game basis for every team in each season, net of the collective average of that team's opponents in the same season. The definitions that follow come from each sport's official rulebook.
For the UEFA regression, the goal_diff_i variable represents the goal difference between a team and their opponent on a per game basis. A goal is scored when the ball completely crosses the goal line between the two goal posts. Shots_diff_i is the difference between the number of shots on goal each team has per game. A shot is a directed attempt to score a goal. Fouls_diff_i is the differential number of yellow and red cards a team receives in a match. As per FIFA.com (2015), a yellow card is given for a foul that comes from an aggressive infraction and cautions the player not to commit another offense. A red card is given when the foul is exceedingly aggressive; the player is then sent off and the team cannot substitute for that player. If a team receives a red card, they play with 10 men against the 11 opposing players for the remainder of the game. A player can also receive a red card by earning two yellow cards in a single match. Lastly, average_salary_i is each team's average salary per player. This is calculated by dividing a team's total salary by the number of players on the roster, and it is measured in pounds for all teams.
For the NFL, yards_diff_i is the average difference between a team's total yards gained and yards allowed per game. In the NFL, yards are an offensive metric that measures how far a team travels down the 100-yard playing field. Score_diff_i is the difference in the score. In the NFL, points can come from a touchdown, an extra point, a two-point conversion, a field goal or a safety. A touchdown occurs when a team crosses the goal line with possession of the ball and is worth 6 points. An extra point is awarded if the team kicks the ball through the goalposts after scoring a touchdown. A two-point conversion is successful if, instead of kicking an extra point, the team crosses the goal line again after scoring a touchdown. A field goal, worth 3 points, is scored when a team kicks the ball through the goalposts, typically after failing to gain a first down. Redzone_diff_i is the difference in red zone opportunities. As per the NFL handbook (Goodell, 2019), a red zone opportunity occurs when a team is within twenty yards of the end zone. Passer_rating_diff_i is the difference between a team's quarterback passer rating and that of the opposing quarterback. Passer rating, as defined by NFL statisticians, measures a quarterback's performance in a game. NFL statisticians and analysts such as DaSilva (2017) have also argued that the quarterback is the most important position on the team and, thus, has the greatest influence on the outcome of a game. Therefore, we consider quarterback performance in our model. The variable turnover_margin_i is the difference in takeaways between the two teams. A takeaway occurs when a defending team intercepts the quarterback's pass or recovers the ball after it has been fumbled by the offensive team. Average_salary_i for the NFL is measured in the same way as the UEFA average salary except that it is counted in dollars. Lastly, penalties_diff_i is the difference between the penalties drawn by a team and the penalties they commit. Penalties are given to a team if they break the gameplay rules set forth in the NFL handbook (Goodell, 2019).
For the NBA, score_diff_i is the average difference between the two scores of a game. As per the NBA rulebook (NBA Rulebook, 2019), a team scores two points if the ball goes in the basket from within the three-point line. A three-pointer occurs if the player scores from beyond the three-point line. A foul shot is taken from the free throw line when a foul is committed, and each foul shot is worth one point. Rebounds_diff_i is the average difference between each team's rebounds during a game. A rebound is awarded to a player if he grabs the ball from a missed shot. Assists_diff_i is the average difference between the two teams' assist totals per game. An assist is awarded to a player if the player he passes the ball to scores. Turnover_diff_i is the average difference between the two teams' turnovers per game. As defined by the NBA, a turnover occurs when a team loses possession of the ball before a shot attempt is made. Steals_diff_i is the difference between each team's number of steals per game. A steal occurs when a team takes the ball from the opposing team without the ball going out of bounds or a basket being scored. Fouls_diff_i is the difference between the number of fouls a team draws and the number they commit per game. Lastly, average_salary_i is the average salary per player for every team, measured in the same way as for the NFL.
We also use fixed effects models (one-way and two-way) to address endogeneity for each sport, since we believe that there may be underlying fixed effects associated with various aspects of each sport. Since each league or division has a different difficulty on average, we expect that some teams may benefit from home field advantage more than others do. That is to say, as UEFA leagues or NBA/NFL divisions become more competitive, we predict that the highly anticipated, close matches featuring teams at the top of the standings may have a greater or lesser home field advantage and, thus, will affect the dependent variables. For example, the EPL often features around six teams that are good enough to compete for the league title, whereas a league like the Bundesliga may only have one or two. Thus, there are more tightly contested, high-profile matches in the EPL than there are in the Bundesliga, on average. In the NFL, teams are split up into divisions of four and each team must play each division rival twice, once at home and once on the road. Therefore, with only 16 games and 32 teams, not every team plays every other team and, thus, one team could have a much more difficult schedule than another. In addition, in the NBA, each team plays each division rival four times, but a team may only play other teams twice throughout the course of the season. Therefore, we predict that the strength of schedule matters for the NFL and NBA as well and must be accounted for appropriately. Following from this, we run the following regressions to account for differences in league and division:
vii. UEFA: ln(avg_ppg
In the UEFA model, we include a league fixed effect, indexed by i, with the Bundesliga as our reference category. This model allows us to control for any differences in league home field advantage. The corresponding fixed effect terms for the NFL and the NBA represent the differing divisions in each respective sport. For the NBA, the Northwest division is our reference category and for the NFL, the AFC East is our reference category. Nevertheless, our data may still be pooled in these regressions, which is why we also run a two-way fixed effect model for each sport. In addition to our concerns about an underlying fixed effect across the different leagues and divisions of our three sports, we also suspected that there may be another fixed effect across the different years we were testing. We run the following regressions to control for this:
x. UEFA: ln(avg_ppg
In these models, the year fixed effects (κ_t in the UEFA model, with corresponding terms in the NFL and NBA models) account for the different years in our regression: 2016, 2017, 2018 and 2019. The reference category is 2016.
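For orientation, a generic two-way fixed effects specification of the kind described above can be sketched as follows. This is an illustrative form only, not the exact regression estimated in the paper: apart from ln(avg_ppg), the home indicator, and the year effect κ_t, the symbols (β, the control vector x with coefficients γ, the league effects α_l, and the dummy notation D) are assumptions introduced here.

```latex
\ln(\text{avg\_ppg}_{it}) = \beta_0 + \beta_1\,\text{home}_{it}
  + \mathbf{x}_{it}^{\top}\boldsymbol{\gamma}
  + \sum_{l \neq \text{Bundesliga}} \alpha_l\, D^{\text{league}}_{l,i}
  + \sum_{t \neq 2016} \kappa_t\, D^{\text{year}}_{t}
  + \varepsilon_{it}
```

Dropping the year sum gives the corresponding one-way (league or division) fixed effect form, and the omitted categories (Bundesliga, 2016) are the reference categories named in the text.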
The COVID-19 pandemic has caused major sports to rethink how they are going to carry out the rest of their seasons. Serie A, the Italian soccer league, was the first sporting body to suggest playing the rest of the season's games behind closed doors with no fans, effectively nullifying any home field advantage. In light of these new developments, we also measure the causal effect of home field advantage that sports teams receive with no fans in their home stadium. For the UEFA contests played under the COVID-19 regulations, we use the same model as before. However, Ligue 1 is not included, given that Ligue 1 ended its season in March while the other leagues restarted their seasons in the summer months. We also consider a two-way fixed effect for this regression. We do not include COVID-19 regressions for the NBA and the NFL since they have yet to start their respective seasons. The NBA plans to continue its season in a bubble in Orlando, meaning there are no home games, while UEFA leagues still play games at their home stadiums. Below is our regression for the UEFA COVID-19 model:
In the above two-way fixed effects regression, we included Z_i to represent the fixed effect across only four of our leagues, since Ligue 1 was not included in the data set. The Bundesliga is the reference category. The variable ρ_t represents the fixed effect across the different months in which matches have taken place. These months are May, June and July, and May is the reference category. The only league that played in May is the Bundesliga. All four leagues played in June, and only Serie A, the EPL and La Liga have continued their seasons into July. Even though the final Serie A match took place on Sunday, August 2nd, it is grouped with the July data because the round of matches started in July.
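The league and month fixed effects described above amount to sets of dummy variables with one category left out as the reference. The sketch below shows one way to build such columns; the function name and the four example observations are hypothetical and are not taken from the data set.

```python
def dummy_encode(values, reference):
    """Return (categories, rows) with one 0/1 column per non-reference category."""
    # The reference category gets no column, so it is coded as all zeros.
    categories = sorted(set(values) - {reference})
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


# Hypothetical observations: one team-month record per entry.
leagues = ["Bundesliga", "Serie A", "EPL", "La Liga"]
months = ["May", "June", "July", "July"]

league_cols, league_rows = dummy_encode(leagues, reference="Bundesliga")
month_cols, month_rows = dummy_encode(months, reference="May")
```

With this coding, each dummy coefficient is read relative to the omitted category, here the Bundesliga for leagues and May for months.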
We also estimated some supplementary models to aid our understanding of the variable relationships at hand. These supplementary regressions are included in the appendix. We have decided to include any regressions that determine home field advantages for each division and each UEFA league. In order to find the effect of home field advantage for each league, we run a series of multivariate regressions to see the exact impact each UEFA league or division has. In addition, we have included extra regressions that show relationships between two given variables, such as wins and attendance, which can also be found in the appendix.

Empirical Results and Analysis
For our models, we used a linear model because we are looking at season totals rather than the probability of winning a single game. If we were looking at the probability of winning one game, we would use a tobit model. The first regression we ran in our experiment was a simple bivariate regression depicting the relationship between a team's average points per game or win percentage and its home/away status. The results of this regression can be found in Table 1 for the UEFA, Table 2 for the NFL and Table 3 for the NBA. These three tables show that, in the bivariate model, playing at home carries a significant advantage over playing on the road. Table 1 shows that one additional UEFA home game leads to a 41.6% increase in the number of points compared to a game on the road. For the NFL (Table 2), the bivariate model shows a 13.3% better chance of winning an additional game at home than on the road. Lastly, for the NBA (Table 3), there is a 17.2% better chance of winning an additional game at home than on the road. These models indicate that there is a substantial advantage when playing at home. We also ran multivariate regression models for all three of the sports.
We see a drastic change in the home status variable's coefficient in the multivariate models. For every additional UEFA home game, ceteris paribus, the home team earns 5.28% more points than in an away game. For the NFL, each additional home game is associated with a 3.2% increase in the chance of winning relative to an away game. Both coefficients are significant at the 95% confidence level. However, for the NBA, there is only a 1.5% increase in the chance of winning an additional home game versus a road game. This estimate is not significant at the 95% confidence level, so there is no statistically significant difference between playing at home and on the road for the NBA. However, the data in these regressions are pooled within each sport, since all the UEFA leagues, and all the divisions of the NBA and NFL, are grouped together.
In order to account for any fixed effects amongst the different leagues and divisions of our sports and to address endogeneity, we need to incorporate the five UEFA leagues and the NBA and NFL divisions we have defined into our model. To do so, we estimate a one-way fixed effect model using the least squares dummy variable (LSDV) approach for all three sports. These regressions are reported in all three of the tables. In the fixed effect model for the UEFA leagues, our reference league is the Bundesliga. For the NFL, our reference division is the AFC East and for the NBA, our reference division is the Northwest.
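The idea behind the one-way fixed effect model can be illustrated with the within (group-demeaning) estimator, which yields the same slope on the home variable as an LSDV regression with one dummy per league or division. The sketch below is a simplification under stated assumptions: it handles a single regressor with no control variables, and the data are synthetic values constructed for illustration rather than values from the paper.

```python
from collections import defaultdict


def within_slope(groups, x, y):
    """Slope of y on x after subtracting each group's mean from both variables."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # group -> [sum of x, sum of y, count]
    for g, xi, yi in zip(groups, x, y):
        entry = sums[g]
        entry[0] += xi
        entry[1] += yi
        entry[2] += 1

    num = 0.0
    den = 0.0
    for g, xi, yi in zip(groups, x, y):
        sum_x, sum_y, n = sums[g]
        dx = xi - sum_x / n  # deviation from the group mean of x
        dy = yi - sum_y / n  # deviation from the group mean of y
        num += dx * dy
        den += dx * dx
    if den == 0:
        raise ValueError("x does not vary within any group")
    return num / den


# Synthetic data: y = group effect + 0.05 * home, with no noise.
groups = ["A", "A", "B", "B"]
home = [1, 0, 1, 0]
y = [0.35, 0.30, 0.15, 0.10]
slope = within_slope(groups, home, y)
```

Because the group means absorb the group-level intercepts, the different baseline levels of groups A and B do not affect the estimated home effect.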
In our one-way fixed effect model for the UEFA, the coefficient of home field advantage is 0.0473 and is statistically significant at the 95% confidence level. This implies that one additional UEFA home game leads to a 4.7% increase in the number of points earned as opposed to an away game. However, we see no significant differences in home field advantage between the different leagues. While the NBA and NFL do not display a significant home field advantage as entire leagues, we found that there is nonetheless a significant difference between the individual divisions in the NFL and the NBA. For the NBA, there is a significant difference between the Northwest division and the Central division at the 95% confidence level. For the NFL, there is a significant difference between the AFC East and the AFC North. These examples illustrate that there are different levels of competitiveness for each division.
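Reading the UEFA coefficient of 0.0473 as a 4.7% increase relies on the usual small-coefficient approximation for a log dependent variable. As a hedged illustration, assuming the dependent variable is ln(avg_ppg) and home is a 0/1 indicator, the exact percentage change implied by a coefficient β is 100(exp(β) - 1); the helper name below is hypothetical.

```python
import math


def log_coef_to_percent(beta):
    """Exact percentage change in y implied by a coefficient on a dummy when y is in logs."""
    return (math.exp(beta) - 1.0) * 100.0


beta_home = 0.0473                        # one-way fixed effect estimate reported for the UEFA
approx = beta_home * 100                  # small-coefficient approximation, 4.73
exact = log_coef_to_percent(beta_home)    # roughly 4.84
```

The two readings differ by about a tenth of a percentage point here, so the approximation is harmless for coefficients of this size; the gap widens as coefficients grow.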
We also tested for the year in which each season was played, since Clarke and Norman (1995) observed significant change in home field advantage over time. Therefore, we extended our model to a two-way fixed effect regression that includes year dummy variables, with 2016 as the reference category. The last column in Tables 1, 2 and 3 reports the results of the two-way fixed effect models.
In controlling for the year, our main explanatory variable coefficients for all three sports exhibited very minimal changes. For example, the UEFA coefficient changed from 0.0473 to 0.0472. Nevertheless, this estimate is still statistically significant at the 95% confidence level. The home advantage coefficients for the NBA and the NFL changed similarly little when controlling for each individual year. Neither of these coefficients was statistically significant. Because there was no noticeable change in our main explanatory variable (home_i) for any of the sports, we conclude that there is no significant difference in home field advantage across years for any of the three sports. After these four main regressions, we tested which UEFA league or NBA/NFL division had the largest home field advantage. To do so, we ran five regressions for the five UEFA leagues, eight regressions for the NFL divisions and six regressions for the NBA divisions. We found that controlling for the different leagues or divisions does have a significant impact on home field advantage. In testing by division within the NFL especially, we observed that the home coefficient was statistically significant at the 95% confidence level for some divisions but not for others. These results can be found in Appendices A, B, C and F.
Table 4 highlights our COVID-19 regression results. We ran bivariate, multivariate, one-way fixed effect and two-way fixed effect models for any contests played under some form of regulation arising from the novel coronavirus outbreak. The only sport that has done so is soccer. Four of the five UEFA leagues resumed play without spectators present at matches, while Ligue 1 discontinued its season. By running these regressions, we can test how important spectators are in the quantification of home field advantage.
Table 4 demonstrates that there is no significant home field advantage when there are no fans present. In fact, there is a slight negative effect when playing at home, given that the coefficient on our main explanatory variable is -0.00232, -0.00109 and -0.000412 for our multivariate, one-way fixed effect and two-way fixed effect models, respectively. In the two-way fixed effect model, it is practically zero, suggesting no home field advantage at all. However, there is a limitation, as each team has played only eight to eleven matches under COVID-19 regulations instead of the full 38-game season schedule. Nonetheless, since there is no statistically significant effect from playing at home during these games, we fail to reject the null hypothesis. Therefore, we find that spectators must be in attendance in order for there to be a significant advantage for the home team.
Lastly, since we used winning percentage as the dependent variable for both the NBA and NFL, we ran a regression with winning percentage instead of average points per game for the UEFA leagues to test whether it changed our findings. This analysis is located in the appendix. Since we had to exclude any tied matches from this data set, we do not discuss its results in our paper, as we believe this test is not as accurate as our preferred model that uses average points per game as the dependent variable.
We hypothesize that these results are due to several factors, two of which are crowd density and increased TV viewership. The literature has found that dense crowds increase home field advantage. This, in tandem with the fact that NBA crowds have historically been less densely populated than European soccer crowds, is consistent with our results. TV viewership of sports has increased dramatically in the US given its recent increases in HDTV sales, which far exceed those of the UK. Perhaps the shift from in-person spectating to television spectating has decreased home field advantage in America relative to Europe. Again, our findings are consistent with this.
When running the models without goal or score differential, like Single Reade et al. (2020), we saw no changes in the significance of our main coefficients, as shown in Appendices O-R. Thus, the models we originally ran (Tables 1-4) hold up.
When analyzing the results of the tables, the fixed-effect model can be used rather than the random-effects model because the Hausman test does not give a significant result for any of the tables. As Appendices K-N show, there is no difference between the two models. Because of this, the fixed effect model can be used. When running the random effects model, we used cross-sectional data across the different years. However, we do not see a difference from the fixed effects model.
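For a single coefficient such as the one on home, the Hausman comparison of fixed and random effects estimates reduces to a simple ratio that is referred to a chi-squared distribution with one degree of freedom. The sketch below shows that scalar case only; the function name is hypothetical and the numbers are made up for illustration, not taken from Appendices K-N.

```python
import math


def hausman_scalar(b_fe, b_re, var_fe, var_re):
    """Hausman statistic and p-value for one coefficient (chi-squared, 1 degree of freedom)."""
    diff_var = var_fe - var_re
    if diff_var <= 0:
        # In finite samples the variance difference can be non-positive,
        # in which case the statistic is not defined in this simple form.
        raise ValueError("var_fe must exceed var_re for the scalar statistic")
    stat = (b_fe - b_re) ** 2 / diff_var
    # For one degree of freedom, P(chi2 > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical estimates and variances, for illustration only.
stat, p_value = hausman_scalar(b_fe=0.05, b_re=0.04, var_fe=0.0002, var_re=0.0001)
```

A large p-value, as in this made-up example, means the two estimators are not systematically different, which matches the situation the text describes.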
For the tobit model, we saw similar results, with the coefficient on our main variable of interest, home, not changing either. This can be seen in Appendices G-J. Therefore, with the coefficient staying the same across all the models, we see no reason to use the tobit model over the OLS fixed-effect model.

Conclusion
In regard to our findings, we have found that strength of schedule is a determining factor for home field advantage in American sports, as both the NBA and NFL have divisions that do have a home field advantage. In the one-way fixed effect model in Table 2, NFL home field advantage is no longer statistically significant, unlike in the multivariate regression model. The reason is that only one of the divisions in the NFL rejects the null hypothesis and has a home field advantage, while the other seven divisions do not display significant home field advantage; this result can be seen in Appendix Tables B and C. We believe that this division has home field advantage due to its strength of schedule and the rivalries within it. Pollard and Gómez (2013) suggest that passionate fans could lead to their team winning more at home. Therefore, in the AFC North, home field advantage would be more prevalent, as Stein (2012) reports that the Steelers and the Ravens have one of the biggest rivalries in the entire NFL, and these two teams have been known as top-tier teams for the past few years. These two elements equate to more passionate fans at each home stadium because, even when the two teams do not play against one another, winning can put one team ahead of the other in the standings. Although neither the NFL nor the NBA statistically has a home field advantage as a league, individual divisions still have one.
We also believe that the reason there is home field advantage in the UEFA leagues but not in the NBA and the NFL is that a high-density crowd causes teams to play better, which is supported by Nevill et al. (1996) and Kotecki (2014). As shown in Fig. 5, a multitude of NBA teams have low attendance numbers even though NBA stadiums have almost the same capacity. Also, USA Today (2019) reports that the 2018-2019 season set a record for sellouts at NBA games, with 760 out of 2460 games. This translates to low-density crowds for NBA teams. However, in the EPL, Hoskin (2020) reports that each team averages 90% of stadium capacity and the top seven teams average well above 99%, virtually a sellout every game. Given the larger crowd densities in the EPL, we can potentially see why European leagues exhibit a home field advantage while the basketball teams do not. For the NFL, Fig. 6 shows that almost every team has the same number of game attendees, suggesting a high-density crowd at every game, as every NFL stadium is roughly the same size. However, with only 16 games, we hypothesize that teams might be more desperate to win on the road, which keeps home field advantage from being prevalent in most divisions. UEFA leagues have a high-density crowd at almost every game and also play 38 games, more than the NFL, meaning the road team is not as desperate to win, which allows home field advantage to emerge. In addition, the COVID-19 games are played with no fans, which means there is a low-density crowd present at each game. Our COVID-19 regression results support the importance of having a high-density crowd for a home game, as home field advantage is not statistically significant in those games. Sharp (2019) states that increased TV viewership because of cheaper HDTVs and increased access to internet streaming services may incentivize sports fans to watch their favorite teams play from home. This technological factor may play a role in home field advantage. This assumption makes even more sense with our data when considering the distribution of HDTVs in the US and the UK. While the Benton Institute for Broadband & Society reports that, as of 2015, over 80% of American households own an HDTV (Frayer, 2015), the Statista Research Department claims that only 57% of UK households own an HDTV (Vailshery, 2021). Given the wide distribution of HDTVs in the US, it would make sense that American sports fans watch more games from home than fans in the UK. Perhaps this is why we do not observe a statistically significant home field advantage in American sporting leagues such as the NBA and NFL but do in European soccer.
Nevertheless, after running numerous tests that included many independent variables and fixed effects, we conclude that there is statistically significant evidence that playing at home results in more average points per game than playing away for UEFA soccer teams. However, the same cannot be said for the NFL and the NBA. For our UEFA model, both our one-way and two-way fixed-effect models had a coefficient of 0.047 and a p-value of less than 0.05. This suggests that there is, in fact, an innate advantage to playing in one's own stadium, as a team earns 4.7% more points per game at home than on the road. However, for the NBA and NFL, we found that neither of the fixed effect models provided statistically significant evidence to reject the null hypothesis, indicating that home field advantage is not present in those two leagues. We are confident in these results because, in our experiment, we ran multiple regressions to reduce possible sources of endogeneity in order to obtain an unbiased estimator for each model.
In addition to establishing the existence of home field advantage, we were also able to draw other, supplementary conclusions from our experiment. We found that each UEFA league exhibits its own home field advantage. That is to say, the utility each team receives from playing at home differs from league to league, as our main explanatory variable, home_i, changed from 0.0477 to 0.0529. Therefore, we conclude that there is a difference between the home field advantage of UEFA's top leagues. Additionally, we found that some divisions in the NFL and NBA exhibit home field advantage while others do not, as the AFC North and the Central division both have a p-value of less than 5%. The rest of the divisions in the NBA and NFL do not and consequently fail to reject the null hypothesis, which indicates that there is no home field advantage. Therefore, home field advantage may not be present in all the divisions or in each league as a whole, but it is still present in some divisions. We also found that playing in front of no fans for the UEFA leagues because of COVID-19 causes home field advantage to dissipate, as it is no longer statistically significant.
We believe that this research contributes a model to the current literature that not only measures the home field advantage across various sports, but also within the divisional sectors of each sport. In testing across sports, like Gómez et al. (2011), we are able to deduce which sports rely more on their fans for home field advantage. What's more, these observed trends in home field advantage indicate room for potential new coaching strategies to be implemented in match preparation. For example, English soccer teams may need to prepare more for away matches, and American basketball teams may not need to practice as hard before an away game and can spare some time for rest and recovery.
In addition to these strategic recommendations, our paper also suggests potential innate differences between sports that have yet to be recognized. For example, Gómez et al. (2011) conclude that Spanish rugby exhibited a relatively high home field advantage that is likely due to "the continuous, aggressive and intense nature of the sport." More specifically, our model captures the innate differences in home field advantage within divisions as described above. This has not been tested before. Differences in home field advantage with respect to divisional rivalry matches would be of the utmost importance to sporting teams and coaches because they stress the need to prepare more for divisional contests than for out-of-division games. For example, the Philadelphia Eagles would benefit from extended practice sessions before playing the Dallas Cowboys more so than they would before playing the San Diego Chargers, regardless of team record, because they share a division with the Cowboys. The regressions that were tested to find this relationship incorporate independent variables such as team salaries, referee behavior and in-game statistics that have not been used in the same analysis before. Our study also includes an analysis of the soccer games played under the empty stadium regulations during the COVID-19 pandemic. Thus, our experiment can measure any statistically significant difference between playing in front of fans and without fans.
In gathering research and performing our study, we came across some potential limitations of our experiment. First, we did not account for the distance between home and away teams as Pollard et al. (2008) and Van Damme and Baert (2019) did in their respective studies. They calculated the distance that away teams travel to play in a match and various other measures of distance such as language, culture and climate. Including a variable representing some category of distance in our model would potentially increase the legitimacy of our experiment by limiting potential sources of endogeneity. However, we chose not to do so because we found that there are not many distances, other than geographic location, that we could incorporate into our model for the NBA and NFL, given that both leagues take place within a single country. Second, we did not account for any historical or political effect surrounding the games. Upon controlling for competitive balance, crowd size and distance traveled, Pollard and Gómez (2013) found that home field advantage is the largest in the Balkan regions of Europe. They suggest that the heightened sense of territoriality among these historically conflict-stricken countries may have an effect on the athletic gameplay of the region. Perhaps the increased passion for their region drives the fans to attend more matches or even interact with the players more during the games. Political and historical influences were not controlled for in our study, which could lead to potential endogeneity. However, we think this would only have a very small, if not negligible, effect in the NBA and NFL, given that there has not been much conflict within the USA since the Civil War.

Fig. 10: UEFA differences in fouls between home and away team

Table 3. NBA regression results