Community Input

Advanced Lobbies



344 replies to this topic

#201 Foxy

Foxy

    former dev

  • Members
  • 995 posts


Posted 23 May 2015 - 09:51 AM

I've read the original article. Firstly yes, I'm not a statistician.

 

In the original article, you point out that some systems are bad and that the final implementation now works, but you show none of the working behind this, as most scientific papers would (in order to be accepted by the community). You also state that an individual has a very low impact on the team result, which suggests win/loss isn't taken into consideration much. But the point remains, and I feel it's the key one that makes this debate continue on and on: you hide all the details of the implementation, including all the fields that are taken into consideration.

 

As an outsider, to me this is a random ranking system. I can see every player's position, but there are so many variables in flight that I have no idea how to improve my rank. If this is just a system for balancing things, why give the public information on their rank and raise these questions? As soon as you list players, some will want to know how to get higher. No offence to S@GA, but he is currently listed as the 15th best spy, 11th best sniper and 19th best engineer on TF2Center, which I just do not believe. According to your rankings, an ETF2L mid-level sniper is better than Tseini, who came third in ETF2L prem this season. I understand that you don't want people to game the system, I really do. But as it stands right now, people may be playing in perfectly legitimate ways that are not considered at all, and likewise the inverse.

 

As examples:

1. Pyro on payload defence. I spend the entire game with homewrecker protecting sentries, and the spy gets our medic a few times. Am I punished as pyro?

2. Pyro on payload defence. I spend the entire game protecting medic, and the spy gets our sentry a few times. Am I punished as pyro?

3. Pyro on payload defence. I spend the entire game roaming with flamethrower racking up damage and kills and their spy continuously gets our medic/engie. Am I punished?

4. Pyro on payload defence. I spend the entire game roaming with flamethrower racking up damage and kills, and their spy is not very good so doesn't get any kills. Am I punished?

5. Pyro on payload defence. I spend the entire game roaming with flamethrower but get no kills, and their spy is not very good so doesn't get any kills. Am I punished?

 

Most of these could be considered valid play styles if the team is aware of the pyro's plans. But by playing a perfectly valid play style, I may be being punished and *have no way of verifying*. You say that to become a better player work on DM, game sense and team work which I completely agree with, but by having a published ranking system players are always going to want to know how to improve.

 

Some further examples:

Scout suiciding on payload attack to push the cart. Am I punished for directly leading my team to victory (through deaths)? Am I rewarded for suiciding stupidly repeatedly?

Engie on payload defence using mini-sentries. I get kills but their team wins as we have no level 3. Am I punished?

Engie on payload defence using level 3s. I get no kills because the opponents are good and avoid the sentry. Am I rewarded?

Engie on payload defence using level 3s. We have a terrible pyro and I continuously get taken out by ubers / spies. Am I punished?

Soldier on anything. I spend all game denying an area getting no kills but no deaths. Am I punished?

Demo on anything. I use sword/shield and do loads of damage. Am I rewarded?

 

My main point is that there is no verification. You don't want people to game the system, but people *do* want to improve and right now they don't know how. If I play well but we lose does that count? If I play badly but we win does that? Do people have a higher rank because they go more aggressive into games, causing the team to lose but their stats to look good?



#202 lafc

lafc

    Member

  • Users
  • 16 posts

Posted 23 May 2015 - 10:22 AM

it seems like you get mad, when someone criticizes your work. try to prevent that and keep calm, then you can see some errors in your logic, and avoid sounding like someone with a superiority complex. 

 

 
Not even trying to engage with the methodology
like i said, the method doesn't matter if you use the wrong factors. an easily understandable example: a ranking based on airshots and damage taken per minute is nonsense/useless no matter how good the mathematical techniques are.
 
(which you don't need any detail to criticise)
that would be poor criticism.
 
not engaging with the results which aren't obfuscated in any way
the result is the ranking. and since you don't make the formula for it public, it is obfuscated.
 
just one line answers
as response to one line. and as you can see one liners can make more sense than a longer text, when it isn't thought out.
 
sticking to a single point
yes, an essential point.
 
because apparently they're entitled to a full explanation.
i am as entitled to a full explanation, as you're entitled to not get criticised.
 
 
The only possible use for a ranking is for them to know how it works out what it says about them. It couldn't possibly be used anonymously to alter the environment they play in for the better, it doesn't even cross lafc's mind that there is a potential benefit there - it's just useless (despite there being a several thousand word long article pointing out the benefits). At the risk of speculating to the negative, there's a very good chance he's exactly the kind of player who would ruin his team's chances thinking he's gaming the system when he's just making things worse for everyone including himself, because as far as I can tell he's completely centred on himself.
the only use for a ranking in a game is to show either how well one player does per se, or how much he does to win the game [for his team]. it's already been mentioned that winning makes up a big part of the ranking. also, considering this, you just admitted that your formula isn't good. if the formula were good, it couldn't be exploited, it could only be fulfilled, because "exploiting" a good formula means you play exactly as you should.
 
 
Here's a hint lafc - the ranking contains zero information about how to get better, improve your play or otherwise be classified as a better player. Even if you knew everything about it then for an individual player it's still "useless". Improving your death matching, game sense and team work are what make you a better player. Work on those and win more games and you move up.
if people could know the formula and still not exploit it, keeping it secret is even more of a contradiction.
 


#203 GentlemanJon

GentlemanJon

    Member

  • Members
  • 17 posts

Posted 23 May 2015 - 11:13 AM

I've read the original article. Firstly yes, I'm not a statistician.

...

My main point is that there is no verification. You don't want people to game the system, but people *do* want to improve and right now they don't know how. If I play well but we lose does that count? If I play badly but we win does that? Do people have a higher rank because they go more aggressive into games, causing the team to lose but their stats to look good?

 

You don't have to be a statistician. The article clearly explains the goal: can lobbies be better balanced, can they be made less predictable, can they allow delineation between skill levels with data that is strictly applicable to the behaviour of the population within those specific games and not an external and potentially irrelevant data set like league play levels.

 

Picking stereotyped models of play and saying these are the appropriate methods and only they will be rewarded is a mistake, as it punishes innovation and freedom. If someone plays in an unorthodox way but is effective they should be rewarded, if they play in an orthodox way but have no impact they shouldn't be rewarded. I have pointed this out numerous times, probably to you, and here I am doing it again. It's irrelevant.

 

The only way to rate the population of the size and diversity of Center is to examine the stats in a large scale study and on a long term trend. That automatically means that the niceties of specific tactics or situational examples cannot be part of the process. You're making assumptions about what you think the system might be measuring and then trying to imagine situations where it hurts instead of helps you individually. The examples show exactly the type of short sighted thinking I alluded to before, the stats of the individual strongly reflect the overall team success. Help the team and you will be rewarded.

 

As for the individual positions of players on the website, you're getting caught up in minutiae. It doesn't matter if a player is a few places higher than another, they're being measured out of thousands. Given there's an inactivity penalty that the balancer ignores, there are numerous strong players whose true rated strength isn't shown accurately either; this is in the FAQ. You're also not taking into account the difference between behaviour in league play and behaviour on TF2 Center. It makes no difference what level a player is at in the league, it matters what they do in games on this site.

 

What tends to happen is that mid to high mid competitive players try hard in lobbies and pickups up to a point, then leave to play mixes in high level competitive groups. That means that the mid to high mid players tend to be highest rated in lobbies because they tend to have the best mixture of effort and skill. Exceptions might be players like Byte who rarely plays at less than maximum, or talented players whose relaxed 30 minutes is still better than everyone else.

 

People know exactly how to improve. Grind DM until you hit every shot, read guides, watch demos, improve your gamesense and situational awareness, use comms appropriately. Get better at those things and your rank will improve, I use an abstract model that seems to effectively reflect that.


 

it seems like you get mad, when someone criticizes your work. try to prevent that and keep calm, then you can see some errors in your logic, and avoid sounding like someone with a superiority complex.
 

 

Nope, I find useful criticism a pleasure. You obviously haven't read (or at least understood) the covering article or probably the site FAQ because your answers don't make any sense in the context of what has been explained. You also appear to be attempting some kind of sophistry in something that is probably your second language, good luck with that.

 

The purpose is to make it possible to balance lobbies more effectively with an independent data set, make them less predictable, make it possible to group people by broad skill groups. No claims are made that this is 100% accurate. Simples.


Edited by GentlemanJon, 23 May 2015 - 11:15 AM.

  • MasterNoob likes this

#204 Foxy

Foxy

    former dev

  • Members
  • 995 posts


Posted 23 May 2015 - 12:12 PM

 

If someone plays in an unorthodox way but is effective they should be rewarded, if they play in an orthodox way but have no impact they shouldn't be rewarded. I have pointed this out numerous times, probably to you, and here I am doing it again. 

 

Effective to who? Yourself? The team? Other members of your team? Effective how? In winning the game? In racking up kills for themselves? What if the orthodox way is helping your team but isn't shown in stats?

 

 

You're making assumptions about what you think the system might be measuring and then trying to imagine situations where it hurts instead of helps you individually. 

 

Yes I am, because this is all I have to go on. I don't know what the system is measuring. 

 

--

I'm aware that a lot of this is hypothetical at the moment, and the proof will be in the pudding as they say. But the situation we're in now means that people have a chance to see/judge the results via the playerrankings website with little to no information on how it's calculated, or how to improve there.



#205 lafc

lafc

    Member

  • Users
  • 16 posts

Posted 23 May 2015 - 12:13 PM

You obviously haven't read (or at least understood) the covering article or probably the site FAQ because your answers don't make any sense in the context of what has been explained.

we are arguing about dota.

see? simply claiming things doesn't make them true.

it's nowhere explained which factors from the match itself are used, or how. factors are the numbers which you fill into your formula so it gives an actual result.

 

 

The purpose is to make it possible to balance lobbies more effectively with an independent data set, make them less predictable, make it possible to group people by broad skill groups. No claims are made that this is 100% accurate. Simples.

i have never said that it's inaccurate. i've said that it's useless. if you keep the complete mechanics hidden (paragraph above), a ranking only based on kills has more validity than yours, simply because people can't know what it's based on.



#206 quintosh

quintosh

    Member

  • Users
  • 29 posts

Posted 23 May 2015 - 12:19 PM

in pug2 anyone who played div2 and above could play, people that played div2 in the most recent season were given a tryout over a week/month - worked pretty well



#207 GentlemanJon

GentlemanJon

    Member

  • Members
  • 17 posts

Posted 23 May 2015 - 12:44 PM



i have never said that it's inaccurate. i've said that it's useless. if you keep the complete mechanics hidden (paragraph above), a ranking only based on kills has more validity than yours, simply because people can't know what it's based on.

 

 

How can a ranking system be both accurate and useless? I'm going to stop replying to you now, nothing you say makes any sense to me. I'm sure it makes sense in your head but whatever point you're trying to make just isn't coming across. Maybe I'm too thick to get it.


in pug2 anyone who played div2 and above could play, people that played div2 in the most recent season were given a tryout over a week/month - worked pretty well

 

If you want to personally trial tens of thousands of players to establish their skill and review their progress over time, be my guest.


 

Effective to who? Yourself? The team? Other members of your team? Effective how? In winning the game? In racking up kills for themselves? What if the orthodox way is helping your team but isn't shown in stats?

 

 

You see this is the problem, to me it's not personal, there is no prejudice whatsoever in my approach. I have tried a lot of novel pet stats that I thought would be good but they weren't. I discarded them. I tried a lot of stats that were suggested to me that other people wanted to work out, the results weren't good so to their personal disappointment I had to discard them.

 

And this is the other problem; I'm dealing with things that fall out of linear regressions and factor analysis that work when simulated (not everything does). I don't know what they "mean", I just know that when simulated in certain ways they work. I can speculate, but what do you want my opinion for? I'm just the curator of statistics that certain methods produce. There's no big idea of mine in there (well, I've come up with a couple of tweaks I'm quietly pleased with but those are just technical satisfactions) that has me thinking "my vision of the game is justified".

 

I've applied it against league games and it matches winners, and identifies star players, and there aren't many disagreements between anecdotal assessments and what the system finds, and when there are I think there's usually a very good case to be made that anecdotal is flawed.

 

As for "what if you're doing well but it doesn't show up in the stats", maybe the regressions haven't spotted it? They're not perfect, and corner cases tend to be discarded by the technique. I simply don't know how you'd simulate it, the only thing you've got to assess thousands and thousands of players are stats. You can't classify them individually.

 

Quintosh mentions pickup 2, what he doesn't mention is that there were strict limits on classes people can play. League level played data doesn't tell you about that and one thing I'm confident of is that skill on particular classes can radically differ, and of course a huge section of the Center player base has no league record and some of them are actually pretty good. They deserve recognition.


Edited by GentlemanJon, 23 May 2015 - 01:11 PM.


#208 lafc

lafc

    Member

  • Users
  • 16 posts

Posted 23 May 2015 - 01:12 PM

How can a ranking system be both accurate and useless? I'm going to stop replying to you now, nothing you say makes any sense to me. I'm sure it makes sense in your head but whatever point you're trying to make just isn't coming across. Maybe I'm too thick to get it.

i've explained this: i don't say it's inaccurate, because i can't know, which is the point. maybe it's accurate, but nobody can know that it is without the relevant information being given.

a ranking is only as useful as people want it to be (at least when it's not tied to something like sorting people into lobby types, which isn't the case (at the moment)). and why should someone care what the ranking says, if he doesn't know anything relevant about it?



#209 fraac

fraac

    Advanced Member

  • Users
  • 144 posts

Posted 23 May 2015 - 05:04 PM

Which anecdotal cases are flawed? (Who's overrated?)

 

Foxy, how are you an "outsider" when you're staff?



#210 quintosh

quintosh

    Member

  • Users
  • 29 posts

Posted 23 May 2015 - 05:28 PM

GentlemanJon:

 

you dont have to personally "establish their skill and review their progress over time", some players just shine out by being communicative, some don't, people that are good are granted permanent access to advanced lobbies, others should probably try again after next season. I'm sure there's plenty established community members including TF2C staff capable of judging if a player fits or not.



#211 b33p

b33p

    Advanced Member

  • Members
  • 156 posts

Posted 23 May 2015 - 06:03 PM

The rankings are up the left, no offence to the incredible amount of work GJ or anybody has put into them. Even classes such as scout and soldier, which you'd expect to be the easiest to measure, couldn't be measured accurately. When you get to classes like pyro and engineer, you can forget about it. No ranking system is going to be able to accurately rank these classes and, indeed, the only way for these classes to stick out statistically is with a high K/D and if you know anything about highlander, there's qualitative benefits being missed by leaps and bounds.

 

in pug2 anyone who played div2 and above could play, people that played div2 in the most recent season were given a tryout over a week/month - worked pretty well

 

I have to agree with my ex-teammate and best friend Quintosh on this one. Let players who have played any matches in div 2 and above in, and randomly shuffle teams.

 

/thread



#212 Saga

Saga

    Advanced Member

  • Users
  • 42 posts

Posted 23 May 2015 - 06:36 PM

As an outsider, to me this is a random ranking system. I can see every player's position, but there are so many variables in flight that I have no idea how to improve my rank. If this is just a system for balancing things, why give the public information on their rank and raise these questions? As soon as you list players, some will want to know how to get higher. No offence to S@GA, but he is currently listed as the 15th best spy, 11th best sniper and 19th best engineer on TF2Center, which I just do not believe. According to your rankings, an ETF2L mid-level sniper is better than Tseini, who came third in ETF2L prem this season. I understand that you don't want people to game the system, I really do. But as it stands right now, people may be playing in perfectly legitimate ways that are not considered at all, and likewise the inverse.

 

Ouch ;(



#213 Mother Tereza

Mother Tereza

    Developer

  • Members
  • 1714 posts
  • LocationRussia, Moscow


Posted 23 May 2015 - 07:29 PM

Let players who have played any matches in div 2 and above in, and randomly shuffle teams.

 

O-ho-ho-ho-ho, why do you need TF2C then? No offence, but you can already play exclusive private "div2" parties with approximately 200 members in total by using the mixes spreadsheet or whatever else you are using to organise PUGs. This has already been said before: TF2C's playerbase is over 100k players. Potential candidates for ALs among them are around 5-10% (this includes EU + NA). Guess how many of them are div2 and higher?

 

ALs are an experiment. Firstly we want to see how popular they will become. If there are enough requests for ALs per day, we could keep developing the AL concept further by adding div restrictions.



#214 GentlemanJon

GentlemanJon

    Member

  • Members
  • 17 posts

Posted 23 May 2015 - 09:44 PM

GentlemanJon:

 

you dont have to personally "establish their skill and review their progress over time", some players just shine out by being communicative, some don't, people that are good are granted permanent access to advanced lobbies, others should probably try again after next season. I'm sure there's plenty established community members including TF2C staff capable of judging if a player fits or not.

 

If you're talking about pickup2 you do need to personally establish everything. The bot had/has a curated database of rankings of players on each class. It used these ranking to restrict classes people can play and produce balanced games. In terms of existing pickup balancing technologies it's the Rolls Royce option, but it's totally impractical for the entire TF2 Center population. That's why my project exists, it's for everybody.

 

It's interesting to note that my research indicates that where my system and the pickup2 system agree that a game is balanced, it doubles its effectiveness in producing close games, which would indicate that the best possible world is a system where human judgement and objective stats are combined.

 

Additionally there's no reason for TF2C to devote time to serving the needs of a minuscule section of its player base. As pointed out above, improving players destined for high levels tend to stop playing lobbies seriously about the time they get to enter the existing elite mix groups. There's no reason why those groups can't continue to be organised manually; the number of players can be handled by individual discretion at that point.

 

tldr - it ain't for small elite groups


... the only way for these classes to stick out statistically is with a high K/D ...

 

I'll just repeat myself on this subject for the hundredth time, K/D isn't a statistic I have found useful. This is Reddit level criticism. Also pickup2 was not a random shuffle.


 

Ouch ;(

 

 

Really? I saw War insisting on taking an uber while playing scout wielding the Boston Basher the other day. He should be rated high though right? Prem players who aren't playing seriously should not and will not rank well, and rightly so because if you created a team expecting prem quality you wouldn't get it.

 

As explained above, largely in response to this specific point, players trying their hardest are usually up and coming and stop being able to learn and improve at TF2C or Pickup around the time they enter the potential sphere of the top level mix groups. I've seen it happen several times already that up and coming players arrive at the top of the league a couple of months after hitting the top of the rankings, which by then have started to decline.

 

There are Engies who use lobbies to try experimental stuff and weird positions, most of which fail horribly. Their rankings are bad, but if they tried seriously they would be good. It's better to record sub optimal play for what it is because players will only repeat it. There are hundreds of examples on every class, the point of the system isn't to artificially recreate the league standings. It's to catalog the way people play in a neutral manner, warts and all, in this specific environment.

 

The fact is it's a different place to the league and people play differently. There is also a certain amount of chaotic fluctuation, not everybody plays everyone else. These are simply the problems inherent in ranking 4,230,526 player performances. Of course most of this is pointed out on my site's FAQ.

 

Also, this is the same Tseini whose stream title is currently "Drunken game shenanigans". He must be giving it everything


Which anecdotal cases are flawed? (Who's overrated?)

 

Biggest one was probably Ryb, partly caused by his team baiting him out a bit but there was a groundswell of "isn't Ryb fantastic!" opinion one season, like it was his turn for a lifetime achievement award. He's playing exactly the same this season, same stats, same approach, same Ryb and it's "isn't Ryb doing poorly". Politics mate, and sticky nerfs.

 


Edited by GentlemanJon, 23 May 2015 - 09:45 PM.


#215 Saga

Saga

    Advanced Member

  • Users
  • 42 posts

Posted 23 May 2015 - 10:38 PM

 

 

Ouch ;(

 

 

Really? I saw War insisting on taking an uber while playing scout wielding the Boston Basher the other day. He should be rated high though right? Prem players who aren't playing seriously should not and will not rank well, and rightly so because if you created a team expecting prem quality you wouldn't get it.

 

Nay Jon, I said "Ouch ;(" cause Foxy was bashing on my mad skillz. I am that S@GA ;o


Edited by Saga, 23 May 2015 - 10:38 PM.


#216 GentlemanJon

GentlemanJon

    Member

  • Members
  • 17 posts

Posted 23 May 2015 - 11:12 PM

 

 

 

Nay Jon, I said "Ouch ;(" cause Foxy was bashing on my mad skillz. I am that S@GA ;o

 

 

Oops, sorry for the rant. Doesn't seem out of bounds that a player with 2 seasons in Div 1 try-harding for a couple of weeks on Center should rate well. Theoretically there shouldn't be that many players on your class(es) that have played higher, and even fewer who are playing seriously in lobbies. There's just no respect for talent applied well.


Edited by GentlemanJon, 23 May 2015 - 11:12 PM.


#217 naknak

naknak

    Advanced Member

  • Users
  • 175 posts

Posted 23 May 2015 - 11:24 PM

Keeping the algo a secret doesn't prevent gaming the rankings. Anyone can look at the top players, review their tens of games played, and figure out a workable min-max from there.

I'll just repeat myself on this subject for the hundredth time, K/D isn't a statistic I have found useful.

You've touted the predictive capacity of your approach. I think it would be useful to compare it to useless statistics, like K/D, or to simple statistics whose usefulness you haven't commented on, like map-normalized DPM and win percentage.

"In a sample of 10000 matches, historical win percentage predicts the outcome correctly X% of the time, and TPR predicts correctly Y% of the time."
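A comparison like the one proposed here could be sketched roughly as below. This is only a minimal illustration, not the site's method: the rule "the team with the higher mean stat is predicted to win", the field names `win_pct` and `tpr`, and every number in the toy data are assumptions invented for the example.

```python
from statistics import mean

def prediction_accuracy(matches, stat):
    """Share of matches where the team with the higher mean `stat` won.

    Each match is (team_a, team_b, a_won); each team is a list of
    per-player dicts. Matches where the two means tie are skipped.
    """
    correct = decided = 0
    for team_a, team_b, a_won in matches:
        a = mean(p[stat] for p in team_a)
        b = mean(p[stat] for p in team_b)
        if a == b:
            continue  # no prediction is made on a tie
        decided += 1
        correct += (a > b) == a_won
    return correct / decided if decided else float("nan")

# Invented toy data (hypothetical values): two players per team, three matches.
matches = [
    ([{"win_pct": 0.60, "tpr": 1500}, {"win_pct": 0.50, "tpr": 1400}],
     [{"win_pct": 0.55, "tpr": 1600}, {"win_pct": 0.50, "tpr": 1500}], False),
    ([{"win_pct": 0.40, "tpr": 1300}, {"win_pct": 0.50, "tpr": 1350}],
     [{"win_pct": 0.45, "tpr": 1200}, {"win_pct": 0.40, "tpr": 1250}], True),
    ([{"win_pct": 0.50, "tpr": 1450}, {"win_pct": 0.50, "tpr": 1400}],
     [{"win_pct": 0.50, "tpr": 1500}, {"win_pct": 0.60, "tpr": 1500}], False),
]

for stat in ("win_pct", "tpr"):
    print(stat, prediction_accuracy(matches, stat))
```

Run over a real match history, the same function would give the X% and Y% figures for any pair of stats; the toy numbers above say nothing about which stat actually predicts better.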

#218 GentlemanJon

GentlemanJon

    Member

  • Members
  • 17 posts

Posted 23 May 2015 - 11:39 PM

DPM

 

Another questionable stat. The predictive capability isn't much better than div balancing (although I think it is by a modest margin), hence the lack of likely gains for TF2 Pickup, but it will apply to any player of any background which is a big advantage for Center. I also think that arriving at something as effective as comparing league achievements through purely abstract means is quite good, might just be me though.

 

K/D was a total car crash in terms of prediction and balance. I'm not sure how many of the original figures I've got to hand, I may take a look through but much of what you suggest would be original research that I'm unlikely to have time for.

 

The interesting thing about DPM is that as yours goes up, so does your opponent's.


Edited by GentlemanJon, 23 May 2015 - 11:40 PM.


#219 Luop90

Luop90

    >implying I have a title

  • +Admins
  • 1919 posts
  • Location127.0.0.1


Posted 24 May 2015 - 03:40 AM

@GJ

 

How would the ranking system balance out a lobby when the majority (to use HL as an example, 17 players) are roughly the same in the rankings (which by itself would probably make for a fairly balanced lobby), and then there is one player who is much, much higher in the rankings (say, in the top 10) in the class that they're playing (such as sniper)? Would it be able to compensate for such a large deviation from the mean?
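The thread never says how the balancer handles this, and its method isn't public. Purely as a hypothetical illustration of the problem, here is a brute-force sketch for a one-player-per-class format: for each class, decide which of the two queued players goes to which team so that the team totals are as close as possible. The `balance` function, the rating scale and all the numbers are made up for the example.

```python
from itertools import product

def balance(pairs):
    """Brute-force the fairest split for a one-per-class format.

    `pairs` maps class name -> (rating_x, rating_y) for the two players
    queued on that class. Returns (gap, team_a, team_b), where each team
    maps class -> rating and gap is |sum(team_a) - sum(team_b)|.
    """
    classes = list(pairs)
    best = None
    # Try every way of assigning each class pair to the two teams (2**9 for HL).
    for flips in product((False, True), repeat=len(classes)):
        team_a, team_b = {}, {}
        for cls, flip in zip(classes, flips):
            x, y = pairs[cls]
            team_a[cls], team_b[cls] = (y, x) if flip else (x, y)
        gap = abs(sum(team_a.values()) - sum(team_b.values()))
        if best is None or gap < best[0]:
            best = (gap, team_a, team_b)
    return best

# Hypothetical lobby: 16 near-identical players plus one outlier sniper.
pairs = {cls: (1000, 1020) for cls in
         ("scout", "soldier", "pyro", "demoman", "heavy",
          "engineer", "medic", "spy")}
pairs["sniper"] = (1400, 1000)

gap, team_a, team_b = balance(pairs)
print(gap)
```

With these made-up numbers the best split still leaves a gap of 240: the outlier's team gets the slightly weaker player on every other class, which recovers 8 x 20 = 160 of the 400-point sniper difference. A swap-only scheme like this can compensate only as far as the spread among the other players allows.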


Why do mathematicians confuse Halloween and Christmas? Because 31 Oct = 25 Dec!

#220 lafc

lafc

    Member

  • Users
  • 16 posts

Posted 24 May 2015 - 02:20 PM

Keeping the algo a secret doesn't prevent gaming the rankings. Anyone can look at the top players, review their tens of games played, and figure out a workable min-max from there.

 

you also don't have to use tf2c to play. just ask every player you know if he wants to play and then you get a game - maybe.






