Your technology needs help... lots of help... lots and LOTS of help. :(


GlowstickSwinger

TL;CR Breakdown:

 

Problems

  • Technical needs are lowest priority, which causes the following:
  • Network I/O is abusive. Send/read bloat.
  • Network I/O may have blocking operations. Send/read bloat.
  • Client has blocking operations. Read bloat.
  • Client loading screen isn't always a state separate from the play state. Read bloat.
  • Client physics engine is authoritative regarding movement. Send bloat.
  • The more people cluster together, the more packets and data they put out, so your bandwidth costs soar and your resources dwindle as you expand.

 

Solutions

  • Don't let marketing run the show... they never have any idea what they are doing. It's Star Wars. It markets itself for God's sake.
  • Set aside time to fix technical debt.
  • Prioritize performance issues over all else until things stabilize, primarily, the network I/O and the asset loading strategy.
  • Reduce your poly count on the client.
  • Back off any 3D engine features for a bit. You're not going to outshine where Xbox was three years ago and it's an MMO. You can get away with some crappy graphics. :D
  • Devs, leads, PMs, and the CTO need more spinal fortitude to push back against other departments' priorities instead of being driven to cut corners and create more technical debt.
  • Make HeroScript execution preserve lexical scope for async callbacks. (LOL good luck with that!)

 

THIS IS WHERE TL;CR ENDS. YOU CAN NOW GO BACK TO YOUR 140 CHARACTER POEMS.

 

To see more about the programmer in question, check him out! http://www.twitter.com/12dcode

 

Packet analysis of the network I/O:

 

 

 

So, some breakdown on the numbers behind why I think their synchronous network I/O and synchronous asset loading on the client are the cause of all the lag in SW:TOR. I'm comparing WoW (which seamlessly creates the illusion of one collective area) with SW:TOR (which uses isolated instances, the more traditional MQ room compartmentalization).

 

The tool I'm using is SmartSniff on Windows 7.

 

A few observations:

 

  • SW:TOR pushes down packets according to area of sight (AOS). Stand near a crowd and you get more; stand away from a crowd and you get less. This active push prioritization would lead to false positives in QA environments, simply because they could not properly recreate an environment that would tax the discerning algorithm.
  • WoW appears not to care where you are or what you are doing, no matter how many people exist in your AOS, which is interesting, especially considering the dramatically smaller amount of traffic you send to and get from them. WoW data is also surprisingly uniform.
  • The physics engine in the SW:TOR client appears to be authoritative regarding movement. This means that if I jump, the client broadcasts data every single video frame until I land on the ground again. They appear to be trying to run an FPS setup that typically uses UDP on a massive scale using TCP. This is proof positive that SW:TOR network i/o strat is not good and was poorly conceived.
  • They both use TCP.
  • SW:TOR is pulling down data the entire time during any loading screen. WoW only pulls data down at around the 90% mark of its initial loading screen. In many cases, the SW:TOR loading screens are just overlays, not actual states separate from the play state.
  • We have a report of a Midwest player accessing an East Coast server and getting -half- of the data sent in the 10 second time span. Lots of factors can go into that and I'm trying to figure out what that means...
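
The per-frame movement broadcast described in the list above is usually avoided by sending position updates on a fixed tick, and only when the position has changed enough to matter. Here is a minimal sketch of that idea. `MovementThrottle`, the 10 Hz send rate, and the 0.05 distance threshold are made-up names and numbers for illustration, not anything taken from the SW:TOR or WoW clients.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class MovementThrottle:
    """Hypothetical helper: decides whether a movement update goes out this frame."""
    min_interval_ms: int = 100      # assumed 10 Hz send rate
    min_distance: float = 0.05      # assumed threshold; smaller moves are not sent
    last_sent_ms: Optional[int] = None
    last_sent_pos: Optional[Vec3] = None

    def should_send(self, now_ms: int, pos: Vec3) -> bool:
        if self.last_sent_ms is not None and self.last_sent_pos is not None:
            if now_ms - self.last_sent_ms < self.min_interval_ms:
                return False        # too soon since the last update
            moved = sum((a - b) ** 2 for a, b in zip(pos, self.last_sent_pos)) ** 0.5
            if moved < self.min_distance:
                return False        # has not moved enough to matter
        self.last_sent_ms, self.last_sent_pos = now_ms, pos
        return True


def count_updates(frames: Iterable[Tuple[int, Vec3]]) -> int:
    """Count how many rendered frames would actually produce a network update."""
    throttle = MovementThrottle()
    return sum(1 for now_ms, pos in frames if throttle.should_send(now_ms, pos))


# One second of a moving character rendered at 60 frames per second.
one_second_at_60fps = [(i * 1000 // 60, (i * 0.1, 0.0, 0.0)) for i in range(60)]
```

With these made-up numbers, `count_updates(one_second_at_60fps)` collapses 60 rendered frames into ten updates, and a character standing still sends only the first one. The trade-off is that other clients have to interpolate between updates.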

 

Without further ado, here are the numbers.

 

SW:TOR Data usage:

 

Standing in a populated area (Combat Training area) of a single instance of the Imperial Fleet for 10 seconds with 152 people in the instance:

 

Packets: 110

Data Sent: 426 bytes

Data Received: 4,877 Bytes

 

Standing in a sparse area (Far north of the Supplies area) of a single instance of the Imperial Fleet for 10 seconds with 152 people in the instance:

 

Packets: 36

Data Sent: 244 Bytes

Data Received: 1,783 Bytes

 

Standing in an impossible to populate instance (your ship):

 

Packets: 16

Data Sent: 166 Bytes

Data Received: 857 Bytes

 

Jumping.

 

Packets: 12

Data Sent: 307 Bytes

Data Received: 854 Bytes

 

WoW Data usage:

 

Standing in a populated area (near the AH) of the collective instance of Stormwind for 10 seconds with hundreds of people in the zone:

 

Packets: 36

Data Sent: 42 Bytes

Data Received: 1,536 Bytes

 

Standing in a sparse area (far north @ Wollerton Steed) of a single instance of Stormwind for 10 seconds with hundreds of people in the zone:

 

Packets: 33

Data Sent: 191 Bytes (I jumped.. oops)

Data Received: 1,611 Bytes

 

Standing in an impossible to populate instance (Stockades in my own group with only me in it):

 

Packets: 27

Data Sent: 56 bytes

Data Received: 1,190 bytes

 

Jumping.

 

Packets: 5

Data Sent: 163 bytes

Data Received: 463 bytes
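
To make the comparison easier to eyeball, the 10-second captures above can be turned into bytes per second and into a "crowd scaling" ratio (populated area versus empty instance). This is only a small sketch over the numbers already listed. The jumping captures are left out because no capture window is given for them, and the dictionary layout and function names are my own.

```python
# (packets, bytes_sent, bytes_received) for each 10-second capture listed above
SWTOR = {
    "populated": (110, 426, 4877),
    "sparse": (36, 244, 1783),
    "solo": (16, 166, 857),
}
WOW = {
    "populated": (36, 42, 1536),
    "sparse": (33, 191, 1611),
    "solo": (27, 56, 1190),
}
WINDOW_SECONDS = 10


def bytes_per_second(sample):
    """Total traffic (sent + received) per second for one capture."""
    _, sent, received = sample
    return (sent + received) / WINDOW_SECONDS


def crowd_scaling(game):
    """How much more traffic a crowded area costs than an empty instance."""
    return bytes_per_second(game["populated"]) / bytes_per_second(game["solo"])


if __name__ == "__main__":
    for name, game in (("SW:TOR", SWTOR), ("WoW", WOW)):
        rates = {k: bytes_per_second(v) for k, v in game.items()}
        print(name, rates, "crowd scaling: %.2f" % crowd_scaling(game))
```

On these samples the crowded-to-empty ratio comes out at roughly 5.2 for SW:TOR and roughly 1.3 for WoW. These are single captures, so treat the ratios as a rough indication only.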

 

 

 

Drill-down into the HeroEngine network optimization itself:

 

 

 

HeroEngine Wiki regarding Network Optimization

 

 

This confirms everything I believe to be happening:

 

  • "Designers still need to architect systems that are asynchronous, parrallelizable, perform caching, lazy evaluation, etc as appropriate (designers think of area server instance as geometry with npcs, shops, quests, etc. As programmers we know an area server instance is a unit of simulation and we have control over spinning up additional processes called areas each of which can be used to run any code we need http://hewiki.heroengine.com/wiki/System_areas.)"
  • "No engine, no programming language, no technology can make a bad N^2 algorithm anything other than a bad N^2 algorithm."
  • "I might pump out more data in movement messages than can be pushed through the a physical server's network interface(s) because of everyone being aware of everyone"
  • "A battle ground area instance where PvP combat events run might only handle a few hundred (though here probably the issue will be more of a client one depending on the topology represented and number of characters a given client simultaneously observes)."
  • "The frequency of the movement updates is also based on distance, as managed by priority and the bandwidth manager"
  • "On the client, each time it starts up, it gets the latest DOM definitions from the Repository. Then, when a character enters an Area, their Client loads .DAT files from the repository which generate a subset version of the GOM -- nodes on the Client that correspond to the Area." (NOTE: These file reads are actually async, but the HeroScript parent operation running that disk I/O is not!)
  • "The HeroMachine (virtual machine in charge of running HeroScript) processes HeroScript synchronously where one instruction must complete its operation before the next one may be started. The synchronous nature of the average HSL script simplifies the learning processes allowing relatively unsophositicated programmers to work comfortably in the environment."
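
Two of the quoted points lend themselves to quick arithmetic: the N^2 cost of "everyone being aware of everyone", and movement update frequency that drops with distance. The sketch below is illustrative only. The linear falloff and the function names are assumptions of mine, not HeroEngine's actual priority or bandwidth-manager logic.

```python
def full_broadcast_messages(n_players: int) -> int:
    """Movement messages per tick if every player is told about every other player."""
    return n_players * (n_players - 1)


def update_interval(distance: float,
                    base_interval: float = 0.1,
                    falloff: float = 20.0) -> float:
    """Seconds between updates about an entity `distance` units away.

    Assumed rule: nearby entities update every `base_interval` seconds and the
    interval grows linearly with distance (doubling at `falloff` units).
    """
    return base_interval * (1.0 + distance / falloff)


if __name__ == "__main__":
    # 152 is the instance population from the fleet captures above.
    for n in (152, 304):
        print(n, "players ->", full_broadcast_messages(n), "messages per tick")
    for d in (0.0, 20.0, 100.0):
        print("distance", d, "-> one update every", update_interval(d), "seconds")
```

For the 152-player fleet instance measured above, a naive everyone-to-everyone broadcast is 152 × 151 = 22,952 messages per tick, and doubling the population roughly quadruples that. This is why the distance-based throttling quoted above matters so much.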

 

See that last point? That's what I saw happening in the game and I called it. I was able to deduce all of this just from the lag that occurs when encountering someone new in an open field. Please stop calling me an "armchair dev" you crazy trolls. It hurts my feeling. >:U

 

Your Client GOMS are a significant part of the problem, BioWare. They are an RPCesque solution, which explains why the client physics engine can act as an authority regarding network i/o. That is -exactly- the kind of setup we had when I was building a game and it was a resource monster. That also, sadly, explains the packet bloat and the difficulty of fixing this problem through software-based initiatives alone. :(

 

3D engines are a solved problem with known or at least expected outcomes, imo. It's the game logic and the RPCs/io where bad practices can take hold and turn all other resources into bottlenecks.

 

 

 

The original scathing rant that kick-started this whole shebang follows. Free bunnies!

 

 

 

The Original Post

 

Former game developer here. (No gigs with Lucas, BioWare, or the other companies associated with this... although I did contract for someone who did help with the marketing push lulz) I left the industry because too many non-gamers were entering it. Something about the dream of accruing abusive amounts of money from a single title seems to attract the parasites, the wolves, and the incompetent.

 

I like your game. I really do. The focus on the single-player aspect is amazing. The voice acting is from top notch talent (I personally know many of the voice actors in the game, actually) and it is delivered as well as it can be given the -extreme- amount of disconnected copy that had to be read. My kudos to your producers. I'm sure many, many long hours were spent on that aspect of the game alone.

 

Let's move on to the problem. Notice I said “problem” and not “problems”. You only suffer from one problem and all of these ailments stem from it.

 

The Client

 

I've only researched the client packet i/o. I'm assuming you purchased licenses for the 3D engine and that the lead devs on the project had a solid streak of 3D game programming on their resumes... but only a few of them had any experience with the engine itself. This shouldn't come as a shock, as there are many, many 3D engines on the market and having one dev know them all would price that dev right out of the market entirely. The logic behind my guess is that if the entire dev team were deeply intimate with the engine, they would have been able to successfully push back against the art and marketing departments' insistence on using high poly count models everywhere.

 

Yes, the creativity of these departments is paramount because it's the pretty pixels we consumers froth over. However, if this factor is the foundation of developing a game (it is often the go-to for artists, marketers, and MBAs alike), then it will become obvious in the final product. That being said, your leadership has long since crossed over the “golden ratio” between pretty pixels and performance. It is obvious BioWare's entire objective regarding client development is to wage a holy crusade against GPUs everywhere and crush them. It's one thing to write Crysis. It's another to write something that blows up a GPU like Crysis and has absolutely none of the visual effects.

 

I'm sure the politics in your company, the silverback chest-thumping, and the e-peen flopfests all take turns stifling any attempt to rebalance this pixel-vs-performance ratio. (SSDD) When the flaws of your product broadcast to the world that the devs are not included in any actual decision making regarding priority or direction, it's time to replace your CTO for being spineless, and probably all of his reports.

 

The Servers

 

Your networking infrastructure, however, is not as easily purchasable as an out-of-the-box 3D engine that you just drop in and start hacking away at. It is woefully taxed. I have nothing but pity for your back-end and IT department for one reason: your asset loading strategy is completely untenable in relation to your resource demand.

 

These loading screens between “zones” (See: MQ channels/rooms/subscriptions) are reminiscent of installing an operating system from a pack of floppy disks. You do know that broadcasting client data to other clients shouldn't issue a blocking resource pull from the HD -every- -single- -time-, right?

 

Right?

 

I mean, I like the little pause when on a PvP server because it lets me know that I'm about to be ganked. But I know why that pause is happening. And I know that when you tested it on your local machines that were well-tuned for this sort of testing, and on staging server setups that didn't have even 1% of the load of a live environment, this issue never cropped up.

 

It happens to everyone.

 

Don't fire off blocking calls as a response to time-sensitive events that are pushed down from the server. Your client will fall apart because those events are random. Have a default placeholder asset primed in memory and ready to instantiate until the rest of the local resources (model, textures, etc.) are pulled asynchronously. Spinning plates, man. Spinning plates.
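
The placeholder approach can be sketched as follows. This is a toy illustration using Python's `asyncio`, not how any real game client is structured. `Entity`, `load_model_from_disk`, `on_player_entered_view`, and the asset names are hypothetical, and the `sleep` merely stands in for slow disk I/O.

```python
import asyncio

PLACEHOLDER = "placeholder_model"   # hypothetical asset kept primed in memory


class Entity:
    def __init__(self, entity_id: str) -> None:
        self.entity_id = entity_id
        self.model = PLACEHOLDER     # renderable immediately


async def load_model_from_disk(entity_id: str) -> str:
    """Stand-in for a slow disk read; the sleep simulates I/O latency."""
    await asyncio.sleep(0.05)
    return f"model_for_{entity_id}"


async def _swap_in_real_model(entity: Entity) -> None:
    entity.model = await load_model_from_disk(entity.entity_id)


def on_player_entered_view(entity_id: str, pending: set) -> Entity:
    """Handler for a server push: returns right away and never waits on disk."""
    entity = Entity(entity_id)
    task = asyncio.create_task(_swap_in_real_model(entity))
    pending.add(task)                        # keep a reference until the task finishes
    task.add_done_callback(pending.discard)
    return entity


async def demo() -> tuple:
    pending: set = set()
    entity = on_player_entered_view("sith_42", pending)
    model_at_spawn = entity.model            # still the placeholder here
    await asyncio.gather(*pending)           # let the background load finish
    return model_at_spawn, entity.model


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The point of the design is that the handler for the server push does no I/O itself. It hands back something drawable and lets the real assets arrive whenever they arrive.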

 

This blocking call is only part of the problem. I am almost positive the vast majority of these load screens are caused by the servers trying to poll live information regarding other connections... and that you've made this a blocking call as well. Network I/O is, literally, the most expensive time operation you could perform and you've made a mass polling operation a blocking call. Going from a planet to a warzone to planet is an experience similar to participating in hurdling... except each hurdle is a two hundred foot tall stone wall. This is because you've relied on blocking calls and no one caught it.
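
To show why a blocking mass poll hurts, here is a small sketch that polls a batch of simulated connections one at a time and then all at once. The 20 ms latency, the connection count, and the function names are arbitrary assumptions. The sequential version stands in for a chain of blocking calls, and nothing here reflects the actual server code.

```python
import asyncio
import time


async def poll_connection(conn_id: int, latency: float = 0.02) -> dict:
    """Simulated network round trip to one connection."""
    await asyncio.sleep(latency)
    return {"id": conn_id, "alive": True}


async def poll_sequentially(conn_ids):
    """One poll at a time: total time grows with the number of connections."""
    return [await poll_connection(c) for c in conn_ids]


async def poll_concurrently(conn_ids):
    """All polls in flight at once: total time is roughly one round trip."""
    return await asyncio.gather(*(poll_connection(c) for c in conn_ids))


async def compare(n: int = 20):
    ids = list(range(n))
    start = time.perf_counter()
    await poll_sequentially(ids)
    sequential = time.perf_counter() - start
    start = time.perf_counter()
    await poll_concurrently(ids)
    concurrent = time.perf_counter() - start
    return sequential, concurrent


if __name__ == "__main__":
    seq, conc = asyncio.run(compare())
    print(f"sequential: {seq:.3f}s, concurrent: {conc:.3f}s")
```

The sequential version pays the round-trip latency once per connection, while the concurrent version pays it roughly once in total. That is the difference between a hurdle and a two hundred foot wall.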

 

And no one's going to fix it. It's very easy to undo and yet... it will never be fixed. Why...?

 

...and back to your organizational problems we go.

 

Somewhere, an investor is having ******s (I can say "******s", right? Americans aren't still squeamish about that word, are they? Does that put me on a terrorist list or something?) about the revenue and ROI. Yay, capitalism! All is well... except your growth is in jeopardy and both you and I know it. Anyone watching this pure marketing-driven priority of pushing out pointless doodads and trinkets instead of solving fundamental performance problems can spot it a mile away. You have too many non-devs leading the effort.

 

You will not survive your own success because your leadership will not take the ten minutes required to have someone explain to them that people won't play a game where 30% of the time is spent at loading screens due to a failure to prioritize and solve an extremely common problem in an industry-standard fashion. What they will listen to is some non-tech failed actress say that little cute tauntauns will solve these problems. B*tches love little cute tauntauns. (The * is for 'o')

 

Say it with me, BioWare:

 

YOU DO NOT HAVE A CONTENT PROBLEM. YOU HAVE A PERFORMANCE PROBLEM.

YOU DO NOT HAVE A CONTENT PROBLEM. YOU HAVE A PERFORMANCE PROBLEM.

YOU DO NOT HAVE A CONTENT PROBLEM. YOU HAVE A PERFORMANCE PROBLEM.

YOU DO NOT HAVE A CONTENT PROBLEM. YOU HAVE A PERFORMANCE PROBLEM.

YOU DO NOT HAVE A CONTENT PROBLEM. YOU HAVE A PERFORMANCE PROBLEM.

YOU DO NOT HAVE A CONTENT PROBLEM. YOU HAVE A PERFORMANCE PROBLEM.

 

Say that in your next stand-up, loud and proud. E-mail it every fifteen minutes to the CEO. You can fix these problems. It's solvable. You just have to spill some blood and not be afraid of it.

 

 

:mon_trap:

Edited by GlowstickSwinger

This isn't some big secret. Yes for some reason, SW:TOR uses a lot of resources, and works my AMD graphics card to 80C, as if I was playing Skyrim at max settings. But it doesn't have the quality graphics of Skyrim. In fact, Skyrim doesn't even push my graphics card that hard at max settings.

 

The animations are also not that smooth. An example would be if I used an ability on a mount, there would be a brief screen freeze. WoW did not have such problems 7 years ago when I played, so for some reason this engine is clunky despite not having high-end quality graphics.

 

The game is playable now, but I hope in future they will fix all performance problems.

Edited by ConradLionhart


WoW did have these problems years ago and they solved them by making everything async. Their reliance on blocking calls damn near killed them, too.

 

WoW intentionally uses low poly models to solve most of its performance problems and resource usage. It's also the most important factor that allows players to move from zone to zone seamlessly. Low poly = less client RAM.


Well, yeah, it's obviously not a big secret, and probably, like you said, their biggest problem.

 

I know I don't have a good computer, but I am able to run WoW well with high graphics without lagging. It takes me a few seconds to finish a loading bar.

 

In SWTOR, some planets like Corellia literally take 5 minutes to download with everything at very low. So pretty much, when I am on these planets, I cannot queue for a WZ (even if I want to) because it takes WAY too long to download the map. This problem actually stopped me from leveling alts, because I love to PvP and level at the same time and I am not patient enough to wait 5 minutes between each WZ.

 

I agree with your post, even though I don't think they are going to change anything soon.

Edited by sindorella


They certainly won't fix it. CEOs gonna listen to marketing 17 times out of 10.

 

So screw it, these loading screens are MARKETING OPPORTUNITIES. Think of all the paid advertising you could put up there. Guaranteed eyeballs. Rolling in the dough. That idea is free of charge, BioWare. More ideas will cost you $150k a year + benes.


Only ever takes me about 5 seconds to get past a loading screen. I have a 3-year-old GTX295 and spent some additional cash on extra RAM (12GB of DDR3). I use the RAMdisk software/method. GF uses a 560GTi and only has about 6GB RAM; it takes her about 10-15 seconds, tops. When you spend 30 mins to 1 hr on most planets at a time, those seconds aren't so bad. Could be a little more annoying for constant WZ loading, but considering you'll have to wait 2 mins for everyone to get in before it starts anyway, it's no big deal.

 

Nice go at pointing out the more technical aspects and the reasons for the performance hit, which is much better than most do. I'm just not sure that those seconds mean that much in the grand scheme of things. I would rather other issues were fixed or more things added as time goes on.


You make good points and you sound like you know what you're talking about on the technical side (can't really say for sure, since I didn't understand more than the basic gist).

 

For me, personally, performance is good. The only annoyances I have are the grass and NPC pop-in (the very high setting introduced in 1.2 didn't help with this) and of course the loading times. This is definitely not so bad that I would define it as "the main problem" of the game (sound issues and the game world seeming a bit dead are more pressing issues for me), but it's a problem and if it can be easily fixed as you say then I really wish they would.

 

I don't think we'll get a dev response here, most likely because they don't want to share and talk about the inner workings of the game/engine, but they might make the excuse of your post not being worded as "constructive criticism" as the reason they don't respond...


I agree wholeheartedly with the OP. The loading times are abominable and the CPU and memory allocation sucks. This is not what you would expect from a 9-figure MMO production. These problems need to be addressed and solved instead of focusing on bribing subscribers with vanity pets, one month of free game time, and such. The HeroEngine per se sucks, but it's even suckier that its problems are ignored.

Meh, I'm going to enjoy what's left of my subscription then forget TOR.

 

Pre 1.2 - GPU's temp is 86 degrees on the "highest" settings indoor and ~92 degrees in confined areas.

Post 1.2 - 90 degrees on medium settings which I assume are equivalent now that they've added high res textures, and 92-104 degrees GPU and CPU in confined areas.

 

I can't even enjoy the game anymore on the same graphics settings without worrying it'll kill my comp. Not to mention the tripled loading times (enter area -> freeze/program has stopped responding -> Start loading to ~25% -> freeze -> continue loading to ~80% -> program has stopped responding? -> finish loading! -> black screen?).



read the title, read half of the first sentence , then decided ........

 

 

GO MAKE YOUR OWN *********** GAME BRO

Edited by Yvin
edited quote

I don't have performance problems. Or, at least, the performance problems I do have are far from what I would call 'the main problem' of SWTOR.

 

It is true that on max settings, the game is putting more stress on the video card in my primary computer (the good one) than I would expect, but apart from a bit more GPU fan noise, this does not bother me at all. On the secondary computer (which barely meets the minimum requirements) I am able to run the game on minimum settings without any issues (not even the fan noise; the GPU does not have a fan).

 

It would be nice to have faster load times (especially considering I have the game installed on an SSD), but that's a relatively minor thing as well.


~~

My load times were long to begin with, but seems like they've gotten significantly longer after a month or two of not playing. Little hiccups all over the place still suck too (mount/dismount, open character pane, etc).

 

Shall we add in the fact that if i close the game from the task manager, it closes immediately, but if i quit via in-game methods, it stops responding for like 2 minutes before it finally closes?

~~

Edited by Tristik

I've often wondered why, when loading onto a new planet, I usually have time to get up, make a sandwich, and watch a season of Breaking Bad.

 

 

wow , exaggeration for effect nice......but if that is the worst thing that happens to you on that particular day......

 

your day has been better than 3 billion people

