# Daily Fantasy Sports – Lineup Optimization 101

Lineup Optimization is something that we get asked about frequently, so we thought an overview of it would be a good topic for our first post.  In the following we will share some of our thoughts and go through a real world example using NFL data from the Week 14 games.

Regardless of the sport, we would define Lineup Optimization as the use of some process to generate projected points per game (PPG) values that are used to identify a pool of players, who are then analyzed to create the best possible unique lineups.  But it has been our experience that Lineup Optimization means different things to different people, and largely that depends on the type of player (casual or serious), bankroll, number of lineups played, and perhaps most importantly what predictive modeling they may be using.

Points Per Game – PPG

Simply put, PPG is a value that attempts to project a player’s performance for an upcoming game.  The projections are not only specific to a sport, but to the DFS site that you play on.  There are many methods that are used to calculate this and they range from very simple to very complex.  In addition to player level data, some of the other variables that can be used in PPG calculations include team level, head to head, scheduling, playing surface, and line/total data.

There are essentially three classifications of DFS players when it comes to PPG:

1. Those who utilize PPG data that has been calculated by someone else, whether that be a free or subscription based service.
2. Those who use their own methodologies to calculate PPG.
3. Those who outsource the creation of a projected PPG algorithm for their own use.

Many people who don’t have the time, desire, or expertise to generate their own PPG data simply source it from the internet.  There are countless sites, both free and subscription based, from which you can obtain “pre-calculated” PPG values.  A simple copy/paste or download provides the user the information they need to short list players for their lineups.  This approach works very well for many people who are happy to have someone else do the heavy lifting.  The downside is that you are locked into whatever methodology your source is using.

Some DFS players create their own custom algorithms.  For these types of players, using their own experience and expertise is a big part of the enjoyment, and the challenge.  They are the backbone of the industry and our core clients.  Typically these types of players use spreadsheet driven formulas to track and model their projections.  It is important to understand that this approach requires a commitment of time, and possibly money.  In addition to setting up systems to capture and process current and future data, it may also be necessary to procure historic data for back testing purposes.  The upside here is that you have the ability to implement your own strategies, and tweak them over time.

Other DFS players look to outsource the creation of PPG algorithms utilizing more complex mathematics such as Integer Linear Programming or the Knapsack Problem.  It goes without saying that these methodologies are somewhat more complicated than A + B = C.  For example, the question posed in the Knapsack Problem is this: you have a knapsack that holds a certain amount of weight, and you need to decide what is best to put in it, based on the weights of the individual items and their values.

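The formula is as follows:

$$\text{maximize } \sum_{i=1}^{n} v_i x_i$$

$$\text{subject to } \sum_{i=1}^{n} w_i x_i \leq W \text{ and } x_i \in \{0,1\}$$

Here $n$ is the number of items, $v_i$ and $w_i$ are the value and weight of item $i$, $W$ is the weight the knapsack can hold, and $x_i$ is 1 if item $i$ is packed and 0 otherwise.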
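To make the DFS connection concrete, read each item as a player: the value is the projected PPG, the weight is the salary, and the knapsack capacity is the salary cap. The sketch below solves a small 0/1 knapsack with dynamic programming. The function name, the four players, and their numbers are made up for illustration, salaries are expressed in units of $100 so they are integers, and roster position rules are ignored, so treat it as an illustration of the formula rather than a lineup builder.

```python
def knapsack(values, weights, capacity):
    """0/1 knapsack by dynamic programming.

    weights and capacity must be non-negative integers.
    Returns (best total value, indices of the chosen items).
    """
    # best[c] holds the best (value, items) found so far for capacity c
    best = [(0, [])] * (capacity + 1)
    for i, (v, w) in enumerate(zip(values, weights)):
        # Walk capacity downward so each item is used at most once
        for c in range(capacity, w - 1, -1):
            candidate = best[c - w][0] + v
            if candidate > best[c][0]:
                best[c] = (candidate, best[c - w][1] + [i])
    return best[capacity]


# Hypothetical players: projected PPG and salary in units of $100
ppg = [18.0, 14.0, 11.5, 20.0]
salary = [65, 50, 40, 75]
cap = 120  # i.e. $12,000

best_ppg, picked = knapsack(ppg, salary, cap)
print(best_ppg, picked)
```

With these numbers the first two players are chosen, since together they fit under the cap and project for more than any other affordable combination.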

While the applicability to DFS is easy to see, being able to leverage this level of mathematics is not for everyone.  Some of our clients who use these methodologies have actually partnered with PhD-level mathematicians to define, back test, model data and create their own high level proprietary algorithms.  Obviously this approach is for the more serious DFS player and the cost can be significant.  You need to keep in mind that when outsourcing this, you are asking someone to create something that, if viable, is a very valuable asset.  And someone smart enough to create such a valuable asset is also going to be smart enough to know its worth.  On more than one occasion I have seen deals where the “brains” behind the algorithms not only got substantial upfront fees, but also ongoing royalties based on their success.

Some final thoughts on PPG.

Using solid projection methodology is what separates winners from losers in DFS.  It is what makes it a game of skill.  No matter what approach you use, take the time to understand what is going into your projections and revisit how they hold up over time.  If you source your PPG player data from a site, email them and ask questions about their methodology; in many cases you will find that administrators are happy to interact with their users.  If you employ your own algorithms, play with them from time to time and back test new variables.  Go to sites such as RotoGrinders where you will find a wealth of resources and a community of informed DFS players.  One more thing: don’t forget to incorporate your own experience.  Believe me, we value stats, but we value experience even more.  You have probably watched thousands of hours of games; you don’t need regression analysis to tell you that turnovers are a bad thing.  You know what to look for, so just figure out how best to extract that from whatever data you have on hand.

Creating Lineups

OK, so you are happy with your projections and you are ready to create your lineups.  Now what?

There are many things to consider when creating lineups, not the least of which is what site you play on.  One site may use a Flex and another won’t, player positions may vary between sites, there are salary cap differences; it suffices to say that you need to handle each sport for each site separately, though the basic approach is the same.

Whether manually or through some automated process you will need to supply the following:

• Player
• Position (Except for golf and NASCAR)
• Projected PPG for each player
• Current Salary for each player
• Percent of lineups that player can be or must be in (optional)
• Total number of lineups requested

For each sport there are billions of possible unique lineups.  For most of our applications we limit the number of players by position to some reasonable threshold where the user still gets to include a large sample but the process remains manageable.  For example, with our standard DraftKings NFL application we provide for 5 quarterbacks, 8 running backs, 12 wide receivers, 5 tight ends, and 3 defense/special teams.

There are four basic approaches that can be taken when generating lineups:

1. Generate all possible unique lineup combinations.
2. Generate the best Top N number of lineups based on projected aggregate PPG.
3. Generate the best Top N number of lineups based on projected aggregate PPG, but limit the number of times a player can appear based on a percentage of the total lineups.  For example, if generating 100 lineups and Player X was set to a maximum of 25% he could appear in 0 to 25 lineups.  The practice of limiting how often players can appear in your lineups is commonly known as “setting exposure limits”.
4. Generate the best Top N number of lineups based on projected aggregate PPG, but a player must appear in a defined percentage of the total lineups.  For example, if generating 100 lineups and Player X was set to be included in 25% of them, he would appear in 25 lineups.  This practice is sometimes called “keying on players”.

Generating All Possible Lineups
This is as simple as it sounds, but it comes with two caveats.  First, you need to understand that in doing this you are not using your player PPG in any way to rank the lineups; you are simply getting all possible combinations that fall under the salary cap.  Second, unless you plan on submitting 1,000,000 lineups you need to use a small group of players for each position.  You can use a tool such as this calculator to get an idea of how many unique combinations to expect based on your selected players and positions to fill.
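If you would rather do the counting yourself, the arithmetic is a product of combinations. The sketch below assumes the DraftKings NFL roster shape of 1 QB, 2 RB, 3 WR, 1 TE, 1 FLEX (RB, WR, or TE), and 1 DST, which is not spelled out in this post, and uses the 5/8/12/5/3 pool mentioned earlier. The function name is made up for illustration. It counts lineups before the salary cap is applied and treats two lineups with the same nine players as identical no matter who sits in the FLEX slot.

```python
from math import comb

# Pool sizes per position (the 5/8/12/5/3 pool used in this post)
pool = {"QB": 5, "RB": 8, "WR": 12, "TE": 5, "DST": 3}


def count_dk_nfl_lineups(pool):
    """Count unique lineups for QB, 2 RB, 3 WR, TE, FLEX, DST, ignoring salary."""
    fixed = comb(pool["QB"], 1) * comb(pool["DST"], 1)
    # The FLEX slot turns the lineup into 3 RB, 4 WR, or 2 TE;
    # counting each shape once avoids double counting the same nine players.
    flex_rb = comb(pool["RB"], 3) * comb(pool["WR"], 3) * comb(pool["TE"], 1)
    flex_wr = comb(pool["RB"], 2) * comb(pool["WR"], 4) * comb(pool["TE"], 1)
    flex_te = comb(pool["RB"], 2) * comb(pool["WR"], 3) * comb(pool["TE"], 2)
    return fixed * (flex_rb + flex_wr + flex_te)


total = count_dk_nfl_lineups(pool)
print(f"{total:,} possible lineups before applying the salary cap")
```

Under those assumptions the pool yields 2,887,500 combinations, and only some of them will fit under the cap.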

Generating the Top N Number of Lineups
This process involves creating all possible lineups from your selected player list that fall under the salary cap, then ranking them based on the projected aggregated PPG.  The system then selects the Top N number of lineups as requested by the user.
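As a rough sketch of that process, the code below enumerates every combination from a tiny pool, drops the ones over the cap, and keeps the N with the highest total projected PPG. The pool, the three-position roster, the cap, and the function name are all hypothetical and deliberately much smaller than a real site format; brute force like this only stays practical when the player pool is limited as described above.

```python
import heapq
from itertools import combinations, product

# Hypothetical pool: (name, salary, projected PPG) per position
POOL = {
    "QB": [("QB_A", 7000, 22.0), ("QB_B", 6000, 19.5)],
    "RB": [("RB_A", 6500, 18.0), ("RB_B", 5000, 14.0), ("RB_C", 4000, 11.5)],
    "WR": [("WR_A", 7500, 20.0), ("WR_B", 5500, 15.5)],
}
SLOTS = {"QB": 1, "RB": 2, "WR": 1}  # simplified roster, not a real site format
SALARY_CAP = 25000


def top_n_lineups(pool, slots, cap, n):
    """Return the n best (ppg, salary, players) tuples that fit under the cap."""
    # All ways to fill each position independently
    per_position = [combinations(pool[pos], k) for pos, k in slots.items()]
    valid = []
    for groups in product(*per_position):
        players = [p for group in groups for p in group]
        salary = sum(p[1] for p in players)
        if salary <= cap:  # keep only lineups under the salary cap
            ppg = sum(p[2] for p in players)
            valid.append((ppg, salary, tuple(p[0] for p in players)))
    # Rank by aggregate projected PPG, best first
    return heapq.nlargest(n, valid)


for ppg, salary, names in top_n_lineups(POOL, SLOTS, SALARY_CAP, 3):
    print(ppg, salary, names)
```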

Exposure Limits
Exposure Limits are a way of diversifying lineups so that players can only appear in up to the user defined X% of the total lineups.  It is worth noting that when using Exposure Limits it is possible that a player may not appear in any of your top rated lineups. It is also possible to set global limits so that no one player can appear more than the defined maximum.

Keying On a Player

Keying On a Player means setting requirements that a player must appear in X% of the total lineups.  There are various reasons why someone may want to do this, ranging from some intangible outside of their PPG calculation to just wanting to ensure that they have live players in their lineups going into Monday Night Football.  When Keying on players, a user must make sure that the sum of the percentages allocated by position does not exceed 100%, since doing so would generate more than the requested number of lineups.
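The percentage rule above is easy to check before any lineups are generated. The sketch below applies the rule exactly as stated (per-position totals must not exceed 100%) and converts each percentage into a required lineup count. The function name and data layout are assumptions for illustration; the percentages are the ones used in the example later in this post.

```python
def check_keyed_percentages(keyed, total_lineups):
    """keyed maps (position, player) -> required percent of lineups (integer).

    Returns (percent total per position, positions over 100%,
             required lineup count per player).
    """
    totals = {}
    for (position, _player), pct in keyed.items():
        totals[position] = totals.get(position, 0) + pct
    over = sorted(pos for pos, pct in totals.items() if pct > 100)
    required = {
        player: total_lineups * pct // 100
        for (_pos, player), pct in keyed.items()
    }
    return totals, over, required


keyed = {
    ("DST", "Seahawks"): 60,
    ("QB", "Cam Newton"): 20,
    ("QB", "Ben Roethlisberger"): 25,
    ("RB", "Doug Martin"): 15,
    ("RB", "Todd Gurley"): 15,
    ("RB", "DeAngelo Williams"): 15,
    ("WR", "Antonio Brown"): 25,
    ("WR", "Brandon Marshall"): 20,
    ("WR", "AJ Green"): 20,
}

totals, over, required = check_keyed_percentages(keyed, 100)
print(totals)  # percent allocated per position
print(over)    # positions that break the 100% rule, if any
```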

The Downside to Exposure Limits and Keying

On the surface these sound like good ideas, but there can be drawbacks to using them.

In the following I will focus on Exposure Limits, but the same holds true for Keying.

In this example let’s assume you want to create your Top 100 lineups.  Without Exposure Limits you would get your top 100 unique lineups irrespective of how many times any player might appear collectively in the group of 100 lineups.

Now let’s assume that you want to set exposure limits on three quarterbacks, four running backs, and five wide receivers.  What you will find is that the system may have to go hundreds, thousands, tens of thousands, or even more lineups deep to meet your diversity criteria.  It is important to understand that when a lineup is flagged as a “keeper” based on, say, a quarterback, that same lineup may also include running backs and wide receivers that are being tested, and if so it is counted against their maximums as well.  So by the time you reach the best possible lineup for quarterback #3 (which in this example includes running back #1 and wide receiver #1), all slots for running back #1 and wide receiver #1 have been filled, and therefore the best lineup for quarterback #3 must be omitted.  The system may then need to go fifty more lineups deep in order to find the first qualified lineup for quarterback #3, so you wind up with the 51st best lineup specific to that quarterback.
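The effect is easy to reproduce with a toy example. The sketch below walks a ranked list from best to worst and keeps a lineup only if none of its capped players has already hit his maximum, recording how deep into the ranking each keeper was found. This is one simple greedy way to apply caps, not necessarily how any particular optimizer does it, and the function name, two-player lineups, and caps are all hypothetical.

```python
def select_with_exposure(ranked_lineups, max_counts, n):
    """Pick up to n lineups, best first, honoring per-player caps.

    ranked_lineups: sequence of player-name tuples, best lineup first.
    max_counts: player -> maximum appearances; unlisted players are uncapped.
    Returns a list of (overall rank, lineup) pairs.
    """
    counts = {}
    chosen = []
    for rank, lineup in enumerate(ranked_lineups, start=1):
        # Skip the lineup if any capped player is already at his maximum
        if any(counts.get(p, 0) >= max_counts[p] for p in lineup if p in max_counts):
            continue
        for p in lineup:
            counts[p] = counts.get(p, 0) + 1
        chosen.append((rank, lineup))
        if len(chosen) == n:
            break
    return chosen


# Hypothetical ranked lineups (best first), two players each for brevity
ranked = [
    ("QB1", "RB1"),
    ("QB1", "RB2"),
    ("QB2", "RB1"),
    ("QB3", "RB1"),  # best lineup for QB3, but RB1 will be capped by now
    ("QB3", "RB2"),
    ("QB2", "RB2"),
]
caps = {"QB1": 1, "RB1": 2}

for rank, lineup in select_with_exposure(ranked, caps, 3):
    print(rank, lineup)
```

Here the three keepers come from overall ranks 1, 3, and 5: the best lineup for QB3 is skipped because RB1 is already at his cap, which is the same pattern described above on a much smaller scale.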

Examples of Exposure Limits and Keying

Below is an example which shows the impact of utilizing different lineup generation methods using actual DraftKings Week 14 NFL data.

*** The players for this example were chosen randomly.  As luck would have it, this group performed very well in the context of generating 100 lineups.  The purpose of this example is only to highlight the differences between methods used to generate lineups, and it is not in any way intended to exaggerate outcome expectations. ***

A few notes on the example data:

• The PPG data used in this example was sourced from the weekly DraftKings data.
• For each player their actual Week 14 DraftKings scoring has been included for the purpose of tracking what the actual outcomes would have been.
• The lineups were created specific to the DraftKings NFL format.
• For the purposes of defining performance and ROI we used the DraftKings NFL $2.5M Millionaire Maker Special, which had a $3 entry fee, so our collective cost to enter 100 lineups was $300.  The contest paid out through the 217,000th best lineup.
• 1,374,056 unique possible lineups fell under the salary cap and qualified for the analysis.
• For the purposes of this example we chose to apply limits to selected players only.
• Three types of lineup creation methods were used:

1. Top 100 – The Top 100 lineups based solely on projected PPG
2. Up To (Top 100 with players capped) – The Best 100 lineups in which selected players can only appear up to a defined maximum number of times
3. Must Have (Top 100 with player requirements) – The Best 100 lineups in which selected players must appear a defined number of times

List of Players With Usage Counts By Lineup Type

• All players and Exposure Limits were randomly chosen; for the example we selected 5 QBs, 8 RBs, 12 WRs, 5 TEs, and 3 DSTs.
• Up To and Must Have percentages were applied to the following players:

1. DST – Seahawks – 60%
2. QB – Cam Newton – 20%
3. QB – Ben Roethlisberger – 25%
4. RB – Doug Martin – 15%
5. RB – Todd Gurley – 15%
6. RB – DeAngelo Williams – 15%
7. WR – Antonio Brown – 25%
8. WR – Brandon Marshall – 20%
9. WR – AJ Green – 20%

Notes on the above worksheet:

• The most important thing to note is the difference in how many times certain players appear in the collection of 100 lineups based on the method used.
For example:

Players With Limits

• DST – Seahawks – Appears 96 times in Top 100, 60 times in Up To, and 60 times in Must Have
• QB – Cam Newton – Appears 2 times in Top 100, 20 times in Up To, and 20 times in Must Have
• RB – Todd Gurley – Appears 0 times in Top 100, 3 times in Up To, and 15 times in Must Have
• WR – AJ Green – Appears 1 time in Top 100, 5 times in Up To, and 20 times in Must Have

Players Without Limits

• DST – Packers – Appears 3 times in Top 100, 25 times in Up To, and 25 times in Must Have
• QB – Russell Wilson – Appears 3 times in Top 100, 0 times in Up To, and 2 times in Must Have
• RB – Matt Forte – Appears 5 times in Top 100, 19 times in Up To, and 16 times in Must Have
• TE – Rob Gronkowski – Appears 8 times in Top 100, 29 times in Up To, and 22 times in Must Have
• WR – Doug Baldwin – Appears 97 times in Top 100, 100 times in Up To, and 100 times in Must Have

Stats By Lineup Type

Notes on the above worksheet:

• The most important thing to note is the difference in the lineup rankings for each method.  For Top 100 the best lineup rank is 1 and the worst is 100; for Up To the best lineup rank is 1 and the worst is 15,965; for Must Have the best lineup rank is 39 and the worst is 16,664.  To be clear, in this example when utilizing Up To player caps our 100th lineup is the 15,965th best overall lineup.  The aggregate lineup PPG difference between the rankings will be driven by whatever player level calculations you are utilizing and may or may not be substantial.
• When applying the actual results for the Week 14 contest to this example we find that for Top 100, 73 lineups would have been in the money, with the best finish being 171st and total winnings of $384; for Up To, 62 lineups would have been in the money, with the best finish being 398th and total winnings of $68; and for Must Have, 80 lineups would have been in the money, with the best finish being 890th and total winnings of $193.
• A large percentage of the lineups generated in this sample wound up being in the money.  I want to reiterate that this was random luck.  The real point here is to see that different lineup generation methods provide substantially different results.
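To put those winnings in ROI terms, the arithmetic is simply net profit divided by the $300 total entry cost. The short sketch below uses only the figures quoted above; the variable names are ours.

```python
ENTRY_FEE = 3   # dollars per lineup
LINEUPS = 100   # lineups entered per method

# Total winnings per lineup generation method, from the example above
winnings = {"Top 100": 384, "Up To": 68, "Must Have": 193}

cost = ENTRY_FEE * LINEUPS
# ROI = (winnings - cost) / cost
roi = {method: (won - cost) / cost for method, won in winnings.items()}

for method, won in winnings.items():
    print(f"{method}: net {won - cost:+d} USD, ROI {roi[method]:+.1%}")
```

Only Top 100 finished ahead of its entry cost in this sample, at 28%, even though Must Have put more lineups in the money, which is a reminder that the number of cashing lineups and the return are not the same thing.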

I hope you have found this overview informative.