I know this came up a few weeks ago, but was it ever resolved? The page load times seem to be getting slower and slower.
I can't imagine that you need a super high horsepower server or connection to run this site... Just playing armchair system administrator from the sidelines, it seems some database tuning may be needed. I'd be happy to contribute to a fund for a faster server or some DB consultant's time or... ?
-Steve
I have noticed it. It seems to fly when there aren't so many users online. Like now...
Yes. I've noticed it for the last 2-3 weeks.
I notice it more after I have been to the shop talk forums. That place is really fast, but that's probably because Jake runs it (I think).
dr
Yes, it has been noted, and the admins say it's not the site, but I have a routine of about 10 sites I check every morning. 914club always has big delays in loading the pages that I never experience at the other sites.
It is really annoying, and absolutely restricted to only this site. I often open another browser and surf another site while waiting for a page to come up.
I load five different sites in tabs as my "homepage". The 914club is always the last to come up.
I don't think that the problem is with bandwidth or server response. From my cursory investigation, those seem adequate. Has the database been optimized using the Invision mySQL Toolbox? (not trying to be a prick, just trying to help)
I was seeing the slowness a few weeks ago but now it's pretty fast for me. Try a traceroute to www.914world.com and see what it turns up.
I ain't no computer whiz, but I regularly check from 3 different computers, all on high speed, all of which are new Pentium 4s and are very fast in general, but the 914 club site is always slow....
I just pushed the "back" button from this thread and it took 40 seconds to respond. Came back to this thread to make this post and it took 20 seconds to load.
Ken
The page loads have been slow for me also (and I'm only 10 hops or so from the server). I do think it is a database issue, since each page is generated dynamically. The box appears to be fine and the connection is fine.
MySQL hasn't been touched, as far as I know, since it was configured and installed.
B
One reason is that it reloads every time you go back or click on a new page... most sites will not fetch the newest data when you hit BACK... this one does.
I am noticing the slowness too. This morning the 914 Tech BBS took so long I just gave up (and we are using T1 at my office).
With dial-up at home the site seems fine, but sometimes it's incredibly slow with the T1.
Ben
> i've been thinking about moving the site back onto one of my compaq boxes that has dual CPU, 4 Gig RAM, raid-array, blah blah. right now, the current box and the fairly large mySql DB could be a bottleneck.
Again, I really don't know what I'm talking about so I shouldn't be offering sys admin advice, but I doubt it's the hardware. Anything better than a P2 should run this site fine, it's probably how the db is configured. Might need an index on a table somewhere or something...
I'd offer to have one of our guys at work look at it, but we're primarily a Microsoft shop, so we'd be fumbling around a lot. It's probably a 2 hour project for a person who knows what they are doing.
-Steve
I knew Andy was in trouble when he said that... I have stood next to OC48 and OC12 cabinets... the number of T1s splitting off that cabinet requires a physical pipe the size of his arm...LOL
I used to do T1 card testing in those cabinets...
B
I've actually thought about this for a long time, and here's my educated guess:
It takes a long time for the BBS to calculate how many pages belong in a thread.
That "Member 914 Pictures" thread now has about 80 pages. It has about 1600 replies, and my guess is that every time you load that forum or redisplay that page it's going to go and calculate how many pages it will take you to view all 1600 of them.
On other BBSes they kill threads (make it so you can't post to them) and start a new one once they are so many pages long, and I'm guessing it's for this very reason. I've done a lot of work with database reporting, and one of the biggest performance killers for a report is displaying "Page X of Y" in the footer, since it has to retrieve all the records for the report and calculate how many pages long the report will be.
Is there an option so that the number of pages in a thread is not displayed? Maybe move that thread to its own section, or create a whole new section for member pictures?
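To put rough numbers on that guess, here is a minimal sketch of the page arithmetic in Python. The 20-posts-per-page figure is an assumption (it is what 1600 replies over about 80 pages works out to), and `page_count` is a made-up helper, not anything from the board software. The division itself is trivial; the expensive part, if the guess is right, would be counting the posts on every page view instead of reading a stored counter.

```python
import math


def page_count(total_posts: int, posts_per_page: int) -> int:
    """Pages needed to display total_posts; an empty thread still has one page."""
    if posts_per_page <= 0:
        raise ValueError("posts_per_page must be positive")
    return max(1, math.ceil(total_posts / posts_per_page))


# 1600 replies plus the opening post, at an assumed 20 posts per page.
print(page_count(1601, 20))  # 81
```

Whether the board recounts the posts each time or keeps a running total is not something this thread establishes, so this only shows what "Page X of Y" has to know, not what it costs.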
It just gives me something to look forward to if it doesn't load right away.
I've suspected the DB for some time now. But I'm no DBA...
I've temporarily moved the "Members 914 pictures" and the "What the heck do you look like?" threads to the Classic Message Threads forum. Let's see what happens now. There is an option to "split" threads. If this temp move helps, maybe we can try that with the biggies.
It's moving even slower for me now than it was before.
OK. I moved the monster threads, backed up the db and rebooted the box. Let's see how it works out.
Really slow today again... What else can be tried to speed things up?
-Steve
As
long
as
it
loads
at
all,
I'm
okay
with
it!
Look!
it's
going
faster now! It really seems okay if you have something to sip while you wait
Watching the 'top' statistics shows that the CPU is regularly maxed out. As well, the physical memory seems to be maxed out and 125 MB of swap space is being used. Sounds like the box is being tapped out.
125 MB of swap isn't a lot. How much memory is in the box?
Looks like 256 MB. The load average was up to 5 a few minutes ago with 150 MB of swap being used.
On Linux, using ANY swap kills performance. This goes double if it's a 2.2 or early 2.4 kernel. Linux is very aggressive about keeping stuff in RAM, and it performs very badly (relative to, say, Solaris) when forced to swap. More than any other flavor of Un*x, adding memory to a Linux box is a big win.
Is this thing all in PHP? You running Zend?
If any of the admins (Andy) are going to be at the breakfast tomorrow, I'd be happy to chat about the setup. I've run some very high traffic sites on a shoestring before (> 1M/day on < $1000). PHP and MySQL. Or just email me.
Anything would help.
The thing that bothers me is that the slowdown happened rather suddenly several months ago. It was all fine and dandy... and then something happened.
It wasn't like it was a gradual decline in performance.
I think something was tweaked wrong or some code is unnecessarily running some loops it shouldn't be doing.
My suggestion is that we buy something like this or better:
http://cgi.ebay.com/ws/eBayISAPI.dll?ViewItem&category=51227&item=5717352418&rd=1
Since the ecommerce site is close at hand we could do a limited edition club tshirt for say $30/each instead of the usual $20/each. I'm sure the pent up demand for t-shirts is huge.
500 tshirts x $10 = $5000
Andy,
is there a way to turn off Avatars at the server?
Based on your Avatar post, there seem to be quite a few large pics being used as avatars.
So either the server is spending time shrinking them before sending, or it is sending the whole pic and letting the browser shrink it. Either way, it takes time.
(or I have no idea what I am talking about )
I have my personal settings to not show avatars, and the pages generally load within 2 to 5 seconds.
I'll secure more ram for the box this week. The question will be WHEN we can install it.
B
Seems ever so slightly better right now...
Did you add more memory?
Maybe no one's logged on...
it's normal speed at "OFF" hours. I think we're tapping out the resources of the box when we have 100 people checking it out.
I'm trying to talk the guy who built the box for me 5 years ago into "spotting" us some RAM. He is a Corvette guy... I asked him to set aside two 512 MB sticks for us "if" the motherboard will take them. Worst case scenario, it gets four 256 MB sticks.
B
Technically speaking the bits move just as fast through an "OC48" as they do through a "T1" - an OC48 can just move more of them at the same time (and don't say light is faster than copper because while that is true - Telecom doesn't use light for "speed", they use it for "width").
:finger2:
This is fun.
Now...why is it slow? Is it software (db, os, etc?) or do we need to throw some hardware at it and take another collection?
If so - I'm in to throw some cash at it if that's what it needs but if it's software - I'll be glad to help with that too.
Maybe someone should check the server's ground strap? I've had that cause slow cranking.
-Steve
I had to bail out for a couple of hours earlier.
SSLLLUUUUGGGISSSHHHH.
It's OK now.
KT
Let me let you guys in on a little secret:
We know there is an issue.
It will get taken care of.
Constant reminders do nothing but PISS OFF everyone involved with the site.
Remember... this is ALL FREE. We pay for NOTHING.
Hang in there... or go away.
B
Well Sir Brad... the first step is admittance. Thank you for fessing up... ha ha!
Up until now, it's all been denial. (except for the recent thoughts on memory and swap disk).
Andy had been insisting it was a routing issue from the east coast to the west coast.
If I understand correctly, there have been offers to help from non admins.
Heck, I may even have a stick of 512 MB memory laying around I could offer up.
Let us all know how to help (besides not acting concerned).
And as always.... thanks!
Regards,
Qarl
Problem started when you swapped from Red to Orange Font....prior to that all was good... Us Tech types always go sherlock holmes..... Throwing memory at it may/may not solve.... a little troubleshooting will uncover root of the problem... Thanks for looking into ... I have broadband, so not too bad, but dial-up guys must be real patient!
can we quantify slow?
is it 2 seconds or 20 seconds?
Is it every page? including the Home page?
I had a problem when this thread first came up. It was taking 30 - 50 seconds just to load the Home page. Finally, after checking with neighbors who are using the same cable service (Time Warner) and were not having trouble loading this page, I decided my IP port was hosed somehow. I shut down overnight and got a new IP address; now most pages load for me in 2 - 5 seconds.
But it does sound a little low on memory for this type of board and the number of simultaneous users.
It has nothing to do with your computer. It has everything to do with the fact that the box the board is running on is likely getting overtaxed at peak times.
Andy, I disagree it's a network problem. If I pull up several sites at once (nice thing, tabbed browsers), all of them come up 2-3x quicker than 914world.com. Doesn't matter what the sites are (well, within reason). I have no routing problems to the colo that I've noticed, either from home over my lousy ISDN connection through SBC or at work through a T1 hooked straight to Alternet.
The very fact that site performance changes based on the time of day (it's slower early in the day and later afternoons, when most people appear to be "on"), and the fact that Mark indicated the box itself was swapping, tells me there's a simple lack of memory on the box. If this were a mostly static site, I'd agree that the current box should be plenty for the level of traffic it sees. But since it's entirely dynamic, it's very likely showing some stress at peak times. 256MB is really not a lot of memory for something trying to service 50-100 simultaneous users on a fully dynamic site, particularly when you're running the DB server on the same box. If you're at all interested, I can send you some basic tests that can be done at peak times (remotely!) to see if I'm right.
I'm not complaining at all, btw. I find the site performance to be acceptable. I've been the ONLY sysadmin at much bigger sites before while doing other jobs, so I know how hard it is to get around to doing anything that requires me to physically visit the box and do something to it.
Complaining? EVERYONE HERE SUCKS! (except for Aaron, he blows) :finger2:
The club sometimes times out when I try to send a PM, or search, but I don't mind, I didn't want to say anything to you assholes anyhow!
M
5 out of 4 doctors agree, Crest tastes like crap, but it actually cleans your teeth, because it's like jeweler's rouge, and Close-Up is like jello, with a very medical tasting cinnamon kind of flavor that kids like.
Where am I?
Attached image(s)
Andy, didn't Marc say that the cpu was maxed and that 100% of swap was used? How can you say it's not a hardware issue?
Nobody commented on my idea of a t-shirt run to fund new servers. Everybody pays $5 extra or whatever is needed and we have new servers.
Again, I maintain that the slowdown occurred rather suddenly, around the same time as some of the layout changes and other changes to the board...
If it were a gradual slowdown due to slow growth, most of us wouldn't have noticed it.
But what do I know... I'm an idiot...
ping and traceroute to www.verilegal.com:
ping is a steady 30, traceroute has no major hiccups, site runs fine
Attached thumbnail(s)
ping and traceroute to www.914world.com:
ping is a steady 29, traceroute has no major hiccups, site runs fine
Attached thumbnail(s)
Andy, Marc said that the cpu was maxed and 100% of swap space was being used. Is that not true?
I don't see the point of doing pings and traceroutes if the problem is cpu and memory.
I agree. Getting to the server is not the problem. Something is occurring after you arrive.
I can go to http://914world.com and the page loads rather quickly. However, go to http://www.914world.com/bbs2/index.php?act=SF&f=2 and the system is dog slow.
The first URL loads a pretty simple php-derived page. The second URL interacts with the database.
I would tend to agree that it is most likely the swap issue.
Here you go...
Attached image(s)
here's a shot of the system monitor ...
as you can see the monitor itself and VNC eat up most of the CPU, but obviously only when i'm remotely connected and run the monitor.
the same is true for the CPU bar-chart. it's almost maxed out when i look at it but when i minimize it and come back later, it's fine except an occasional spike.
the monitor itself eats up 1/4 of the CPU!!!!
however, the machine will clearly benefit from more RAM, the physical ram is almost maxed out ...
so that will be the next step ...
<_< Andy
Attached thumbnail(s)
Andy,
I pinged the box at 10 second intervals for five minutes. The results are below.
I got responses between 82 and 96, with 96 being the norm. Oh, I'm on the East coast.
Attached thumbnail(s)
Yes. It's running slow. About 20 to 25 seconds to load a page on the forum.
Not surprisingly, it loads slightly faster if I am not logged in. (by about 5 to 10 seconds)
ok, so your ping is steady (although on the high side) but the site is slow. as is jeroen's connection ...
one last thing
can the two of you post a traceroute to the club server?
thanks!
Andy
Here's my tracert
Attached image(s)
Whoa. I think this is a simple case of Andy's juggling too many things at once, feeling harassed, Anthony misspelled Mark's name as Marc, and general confusion...No need for everyone to get their knickers in a twist.
Andy, I did a traceroute from my home ISDN connection, and everything looks fine to me. 20-50ms times, which is basically about what I'd expect to see over this connection to anywhere. Someone's reverse DNS is choked up, as I halt somewhere inside cogentco unless I used -n to turn off name lookups (209.17.64.166 is the address it's choking on). I'm 12 hops from the server here, and the last hop has roughly the same ping time as the first hop. Looks pretty clean to me. I get essentially the same results with verilegal. Indeed, pretty much the same ping times to www.yahoo.com (lord knows where that actually goes).
Do this for me: while logged in to the box itself, do 'vmstat 5' and let it run for a minute or so (10 - 12 lines). Ignore the first line. If you see anything other than 0 in the si or so columns, it's swapping. If it is swapping, it's badly in need of memory.
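If eyeballing the columns gets tedious, the same check can be sketched in a few lines of Python. This is only an illustration: `swap_samples` and `is_swapping` are made-up names, the sample numbers are invented, and the parser assumes the usual vmstat layout where a header row names the si and so columns.

```python
def swap_samples(vmstat_text: str, skip_first: bool = True):
    """Return (si, so) pairs from vmstat output, locating columns by header name."""
    columns = None
    samples = []
    for line in vmstat_text.splitlines():
        fields = line.split()
        if "si" in fields and "so" in fields:
            columns = fields  # header row that names every column
            continue
        if columns is None or len(fields) != len(columns):
            continue  # banner line or anything that is not a data row
        if not all(field.isdigit() for field in fields):
            continue
        row = dict(zip(columns, map(int, fields)))
        samples.append((row["si"], row["so"]))
    # The first data row is an average since boot, so it is ignored by default.
    return samples[1:] if skip_first else samples


def is_swapping(vmstat_text: str) -> bool:
    """True if any sample after the first shows swap-in or swap-out activity."""
    return any(si or so for si, so in swap_samples(vmstat_text))


# Invented sample in the old 16-column layout, for illustration only.
SAMPLE = """\
procs memory swap io system cpu
r b w swpd free buff cache si so bi bo in cs us sy id
0 0 0 100 50 10 200 5 9 1 1 1 1 1 1 98
0 0 0 100 50 10 200 0 0 1 1 1 1 1 1 98
0 0 0 100 50 10 200 3 0 1 1 1 1 1 1 98
"""
print(swap_samples(SAMPLE))
print(is_swapping(SAMPLE))
```

Reading columns by name rather than by position is deliberate, since newer vmstat versions print a slightly different set of columns.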
James,
That's good info, but it's not in the proper spirit of things around here. You forgot to give them the finger.
:finger2:
I only have a problem with people telling me I don't know anything about what I'm talking about. That's bullshit.
procs memory swap io system cpu
r b w swpd free buff cache si so bi bo in cs us sy id
0 0 0 93480 9932 7256 129464 6 35 121 174 196 130 36 9 55
0 0 0 93480 6912 7276 129484 0 14 7 44 144 104 22 3 75
0 0 0 93492 6916 7284 129496 2 43 2 83 180 70 18 5 77
0 0 0 93492 6844 7292 129492 0 14 0 27 121 45 0 1 99
1 0 0 93576 3424 7296 129252 0 42 8 420 167 112 69 17 15
0 0 1 94360 3616 7340 127296 14 182 264 282 518 275 44 11 45
0 0 0 94928 3552 7276 125468 4 114 6 177 230 128 20 3 77
0 0 0 95092 3692 7252 125520 3 65 7 76 124 49 0 1 99
0 0 0 95392 3660 7248 125048 0 72 7 420 145 83 70 16 14
2 0 0 95392 4020 7272 125284 5 0 54 42 220 170 31 4 66
0 0 0 95392 4008 7284 125296 2 0 6 52 234 102 8 2 90
0 0 0 95392 4016 7292 125296 0 0 0 33 149 70 23 5 73
0 0 0 95392 4180 7300 124768 2 0 14 412 142 70 65 14 21
0 0 0 95392 3904 7320 124800 0 10 10 38 167 102 16 3 81
0 0 0 95392 3916 7332 124800 9 6 10 476 253 128 64 16 19
1 0 0 95496 3928 7344 124848 69 33 77 96 257 191 38 6 56
1 0 0 95600 3680 7344 124560 0 43 0 133 271 205 38 3 59
0 0 0 95600 4224 7352 123216 0 0 1 399 180 96 75 18 7
I notice that the traceroute is taking a different route across the country. Last time I provided a traceroute (about 4-6 weeks ago, on another thread), there was a different router somewhere that we thought was causing all the hangups.
Oh... and one other point brought up a couple of posts back, and I think this is important in helping diagnose issues...
I too can load the 914club home page really quickly... which supports the fact that a ping or traceroute is going to respond quickly.
It's the loading of the technical BBS.... err I mean, forum that crawls. And I know this is where all the dynamic mumbo jumbo crazy magic stuff happens...
So to repeat myself, and several others... accessing the homepage is quick, dynamic pages is worse, dynamic pages at peak times is worser.... accessing pages and replying with 100 users is the mostest worstest.
Oh shit, I almost forgot...
:finger2: :finger2:
OK, that vmstat dump (thanks, Mark), shows:
there aren't many processes waiting to do anything (first column is mostly 0)
the CPU is reasonably busy (20-50% idle)
it's swapping some (mostly single digit numbers in si so, but some double digits)
relatively little disk activity (bi bo numbers, anything under 400 regularly is low)
We're not really correlating this to board "slowness" (seems fine to me right now), but I'm sticking by my guns on this. If one of the admins can repeat the vmstat exactly when the board is definitely running slowly, we'll see what it's like then. If it regularly goes into double digits on si/so, it's hurting. Disk and CPU don't appear to be a bottleneck, at least from what I've seen so far. The fact that the CPU isn't idle yet it's not running user processes means it's busy doing housekeeping tasks (like swapping).
Oh, and :finger2:
it's not a network issue. you can see you get a fast response when setting up the tcp connection with your http request, but a long delay waiting for content ("waiting for www.914world.com"). when the content is ready, it loads quickly. if it were packet loss, a congested network, etc., you would get a slow-loading page rather than a slow-responding page. someone should run tcpdump on the server and watch the connections being made; it might give some more clues.
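One rough way to check this from the client side is to time the phases of a single request separately. The sketch below is in Python; `time_request` and `likely_bottleneck` are hypothetical helper names, and it assumes the server sends its response headers only once the page has started generating, so treat the numbers as a hint rather than proof.

```python
import http.client
import time


def time_request(host: str, path: str = "/", port: int = 80, timeout: float = 60.0):
    """Return (connect_s, wait_s, transfer_s) for one plain-HTTP GET."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    t0 = time.perf_counter()
    conn.connect()             # name lookup plus TCP handshake
    t1 = time.perf_counter()
    conn.request("GET", path)
    resp = conn.getresponse()  # returns once the status line and headers arrive
    t2 = time.perf_counter()
    resp.read()                # pull the whole body
    t3 = time.perf_counter()
    conn.close()
    return t1 - t0, t2 - t1, t3 - t2


def likely_bottleneck(connect_s: float, wait_s: float, transfer_s: float) -> str:
    """Crude label for whichever phase took the longest."""
    phases = {
        "network setup": connect_s,
        "server-side page generation": wait_s,
        "transfer": transfer_s,
    }
    return max(phases, key=phases.get)


# Example call (needs network access, so it is left commented out):
# print(likely_bottleneck(*time_request("www.914world.com", "/bbs2/index.php")))
```

A short connect time and a long wait before the first byte would match the "waiting for www.914world.com" behaviour described above; a long transfer phase would point back at the network.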
I don't work with Apache, but does it use bandwidth throttling? Seems I remember similar circumstances with a *nix box on a network at Sprint (the box was on a 1GB ethernet switch that connected to a couple OC3's). Engineers sat around scratching their heads and throwing parts at it (upgraded CPU/more ram, new NICs, etc) until they found that "someone" had set the max throughput per session to some unrealistic setting in an effort to "tune" the server.
Do you have any large files people could download from the server to see what average KB/s throughput they're getting? Most broadband connections get 400-500 (or higher), whereas something in the 130-150 range usually indicates a saturated T1 or some type of restriction taking place.
(Not bitching at all, I love this site!)
You guys are something else...
You're all correct to varying degrees. There are network bottlenecks, peak usage times for the internet as a whole, high db access, and a lot of swapping going on. All of these things together cause slowdowns at peak periods.
I created a shell script that runs date, uptime, and a vmstat at 5 second intervals for 12 iterations. It's been cron'd to run 33 minutes after every hour. We'll see what happens over the next full day or so. Of course the acid test would be Tuesday.
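For anyone who wants to collect the same numbers on their own box, here is roughly the same idea sketched in Python. It is not the script described above; the log path, the crontab line and the function names are assumptions for illustration.

```python
import subprocess

# Hypothetical crontab entry: run at minute 33 of every hour.
CRON_LINE = "33 * * * * /usr/bin/python3 /path/to/snapshot.py"


def snapshot_commands(interval_s: int = 5, samples: int = 12):
    """Commands for one snapshot: timestamp, load average, then vmstat samples."""
    return [["date"], ["uptime"], ["vmstat", str(interval_s), str(samples)]]


def take_snapshot(log_path: str, interval_s: int = 5, samples: int = 12) -> None:
    """Append the output of each command to log_path (takes about a minute)."""
    with open(log_path, "a") as log:
        for cmd in snapshot_commands(interval_s, samples):
            result = subprocess.run(cmd, capture_output=True, text=True, check=False)
            log.write(result.stdout)


# Example call, left commented out because it needs vmstat installed:
# take_snapshot("/tmp/vmstat_hourly.log")
```

Keeping the timestamp and the user count next to each dump is what makes it possible to line the numbers up against when the board actually felt slow.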
Right now (pretty quiet and the board is fairly speedy):
Fri Sep 3 22:33:00 PDT 2004
22:33:00 up 6:07, 1 user, load average: 0.69, 0.46, 0.45
procs memory swap io system cpu
r b w swpd free inact active si so bi bo in cs us sy id
3 0 1 94932 5232 4488 180880 4 20 118 156 206 115 36 8 56
0 0 0 94932 5260 4488 180308 0 0 0 171 141 97 10 2 88
0 1 1 94932 4936 4492 182068 0 0 2 70 184 107 16 3 82
2 0 0 95124 3508 5240 182324 0 38 0 434 255 227 54 15 31
1 0 0 95124 4220 4792 181692 0 0 0 425 201 102 53 11 36
0 0 0 95124 4244 4784 181736 0 0 0 42 133 61 6 1 94
0 0 0 95124 3812 6304 181952 0 0 0 33 131 80 59 16 25
0 0 0 95124 3932 4784 182040 0 0 0 444 137 50 15 2 83
1 1 0 95124 3908 4784 183088 0 0 0 18 114 58 5 1 94
0 0 0 95708 3748 4660 184436 0 117 5 157 157 119 23 4 73
Here's this hour's results (even quieter than the previous hour):
Fri Sep 3 23:33:01 PDT 2004
23:33:01 up 7:07, 1 user, load average: 0.44, 0.56, 0.53
procs memory swap io system cpu
r b w swpd free inact active si so bi bo in cs us sy id
2 1 1 95004 12148 4092 184604 4 18 103 152 198 110 34 8 58
0 0 0 95004 11496 4084 185080 0 0 0 135 156 111 31 4 65
0 0 0 95004 11240 5496 184264 0 0 0 349 218 76 53 14 33
0 0 0 95004 11620 4084 184176 0 0 0 21 145 66 18 3 79
0 0 0 95004 11620 4084 184200 0 0 3 38 172 80 1 1 99
0 0 0 95004 11620 4084 184724 0 0 0 27 126 62 8 2 90
0 0 0 95004 11620 4084 184404 0 0 0 26 186 75 4 1 94
0 0 0 95004 11628 4084 184420 0 0 4 56 205 112 14 3 83
0 0 0 95004 11628 4084 184424 0 0 0 10 116 43 0 0 100
0 0 0 95004 11620 4084 184848 0 0 0 31 138 69 9 1 90
Users at this hour:
7 guests, 19 members 2 Anonymous Members
I'll post more in the morning. The results up until then should provide a nice quiet baseline. We'll be able to see it go up as the morning goes on. I could script a traceroute via ssh to my mail server so I could include it in the log file for the vmstat script output, but the ROI for that work ain't worth it.
Here's a traceroute from my mail server. It's interesting to note that there is a lag of 50 seconds between hop 7 and 8. The response times don't show it, but it's there. I don't know what it means since it obviously doesn't take that long to contact the server. Anywho...
> traceroute 914world.com
traceroute to 914world.com (66.250.97.205), 64 hops max, 44 byte packets
1 access01-fe6-0-18.ftc.frii.net (216.17.222.6) 0.448 ms 0.422 ms 0.296 ms
2 core01-fe6-0-701.ftc.frii.net (216.17.230.17) 0.749 ms 1.019 ms 0.928 ms
3 core01-atm3-0-32.den.frii.net (216.17.230.42) 3.698 ms 3.817 ms 4.784 ms
4 f29.ba01.b006467-1.den01.atlas.cogentco.com (66.250.5.253) 3.879 ms 3.803 ms 3.805 ms
5 g9-2.core01.den01.atlas.cogentco.com (66.28.5.21) 3.927 ms 3.829 ms 3.808 ms
6 p4-0.core02.sfo01.atlas.cogentco.com (66.28.4.130) 238.887 ms 205.652 ms 218.214 ms
7 g50.ba01.b003070-1.sfo01.atlas.cogentco.com (66.28.5.182) 28.611 ms 27.725 ms 28.216 ms
8 209.17.64.166 (209.17.64.166) 28.969 ms 29.106 ms 29.114 ms
9 64.237.0.250 (64.237.0.250) 29.632 ms 29.475 ms 29.830 ms
10 mail.914world.com (66.250.97.205) 29.281 ms 29.258 ms 29.638 ms
> ping 914world.com
PING 914world.com (66.250.97.205): 56 data bytes
64 bytes from 66.250.97.205: icmp_seq=0 ttl=48 time=29.748 ms
64 bytes from 66.250.97.205: icmp_seq=1 ttl=48 time=29.098 ms
64 bytes from 66.250.97.205: icmp_seq=2 ttl=48 time=29.645 ms
64 bytes from 66.250.97.205: icmp_seq=3 ttl=48 time=29.377 ms
64 bytes from 66.250.97.205: icmp_seq=4 ttl=48 time=29.523 ms
64 bytes from 66.250.97.205: icmp_seq=5 ttl=48 time=29.124 ms
64 bytes from 66.250.97.205: icmp_seq=6 ttl=48 time=29.343 ms
64 bytes from 66.250.97.205: icmp_seq=7 ttl=48 time=29.070 ms
Deader than a doornail now:
4 guests, 13 members 3 Anonymous Members
Sat Sep 4 00:33:00 PDT 2004
00:33:00 up 8:07, 1 user, load average: 0.21, 0.22, 0.27
procs memory swap io system cpu
r b w swpd free inact active si so bi bo in cs us sy id
2 0 0 99384 4152 4092 189356 3 17 90 143 191 105 32 7 60
0 0 0 99384 4348 4064 188552 0 2 10 116 126 87 1 1 98
0 0 0 99384 4368 4064 188560 0 0 0 15 113 46 0 0 100
0 0 0 99384 4016 4200 188524 0 45 0 242 134 62 65 16 18
0 0 0 99384 4336 4192 188068 0 0 0 37 148 64 10 1 89
0 0 0 99384 4308 4192 187892 1 2 2 15 165 82 7 1 92
0 0 0 99384 4316 4192 188020 0 0 2 38 156 52 0 1 99
1 0 0 99384 4316 4192 188020 0 0 0 10 117 45 1 0 99
0 0 0 99384 4204 4720 188020 0 0 0 184 133 58 62 19 19
1 0 0 99384 4188 4716 188036 0 0 0 19 117 58 4 0 96
0 0 0 99388 4020 4720 187792 0 40 0 71 138 75 15 2 83
0 0 0 99388 4020 4716 187792 0 0 0 25 118 46 0 1 99
/usr/sbin/traceroute www.914world.com
1 access01-fe6-0-18.ftc.frii.net (216.17.222.6) 0.530 ms 0.449 ms 0.415 ms
2 core01-fe6-0-701.ftc.frii.net (216.17.230.17) 1.374 ms 0.557 ms 0.646 ms
3 core01-atm3-0-32.den.frii.net (216.17.230.42) 4.273 ms 3.582 ms 4.074 ms
4 f29.ba01.b006467-1.den01.atlas.cogentco.com (66.250.5.253) 3.668 ms 3.909 ms 3.750 ms
5 g9-2.core01.den01.atlas.cogentco.com (66.28.5.21) 4.034 ms 4.133 ms 4.884 ms
6 p4-0.core02.sfo01.atlas.cogentco.com (66.28.4.130) 27.587 ms 28.403 ms 28.783 ms
7 g50.ba01.b003070-1.sfo01.atlas.cogentco.com (66.28.5.182) 28.071 ms 28.048 ms 27.662 ms
8 209.17.64.166 (209.17.64.166) 29.317 ms 29.904 ms 30.597 ms
9 64.237.0.250 (64.237.0.250) 29.540 ms 29.564 ms 29.594 ms
10 914world.com (66.250.97.205) 29.850 ms 29.430 ms 29.588 ms
> ping www.914world.com
PING www.914world.com (66.250.97.205): 56 data bytes
64 bytes from 66.250.97.205: icmp_seq=0 ttl=48 time=29.776 ms
64 bytes from 66.250.97.205: icmp_seq=1 ttl=48 time=29.579 ms
64 bytes from 66.250.97.205: icmp_seq=2 ttl=48 time=29.306 ms
64 bytes from 66.250.97.205: icmp_seq=3 ttl=48 time=30.819 ms
64 bytes from 66.250.97.205: icmp_seq=4 ttl=48 time=29.729 ms
64 bytes from 66.250.97.205: icmp_seq=5 ttl=48 time=29.264 ms
Really, really, really quiet now.
2 guests, 7 members 1 Anonymous Members
Sat Sep 4 01:32:59 PDT 2004
01:32:59 up 9:07, 1 user, load average: 0.12, 0.15, 0.10
procs memory swap io system cpu
r b w swpd free inact active si so bi bo in cs us sy id
3 0 0 99676 4560 4340 188828 3 15 81 133 183 100 30 7 64
0 0 0 99832 4348 4336 187952 0 39 0 209 136 85 1 1 98
0 0 0 99832 4348 4336 187952 0 0 0 10 129 49 0 0 100
0 0 0 99832 4348 4336 187952 0 0 0 10 128 46 0 0 100
0 0 0 99832 4348 4336 187960 0 0 1 7 131 53 0 0 100
0 0 0 99832 4348 4336 187964 0 0 0 14 129 53 0 0 100
0 0 0 99832 4348 4828 188116 0 0 0 18 122 59 25 6 69
0 0 0 99832 4244 4336 187972 0 0 0 38 128 47 41 10 49
0 0 0 99832 4260 4336 187972 0 0 0 8 120 62 6 1 93
0 0 0 99832 4256 4336 187972 0 0 0 30 110 43 0 0 100
0 0 0 99832 4184 5588 188124 0 0 0 22 114 57 47 11 42
1 0 0 99832 4172 4544 187724 0 1 0 21 134 46 21 3 76
traceroute www.914world.com
1 access01-fe6-0-18.ftc.frii.net (216.17.222.6) 0.404 ms 0.424 ms 0.457 ms
2 core01-fe6-0-701.ftc.frii.net (216.17.230.17) 0.669 ms 0.557 ms 0.587 ms
3 core01-atm3-0-32.den.frii.net (216.17.230.42) 4.226 ms 4.102 ms 3.595 ms
4 f29.ba01.b006467-1.den01.atlas.cogentco.com (66.250.5.253) 4.106 ms 4.489 ms 4.007 ms
5 g9-2.core01.den01.atlas.cogentco.com (66.28.5.21) 4.314 ms 3.762 ms 3.877 ms
6 p4-0.core02.sfo01.atlas.cogentco.com (66.28.4.130) 27.558 ms 58.140 ms 28.216 ms
7 g50.ba01.b003070-1.sfo01.atlas.cogentco.com (66.28.5.182) 27.871 ms 27.773 ms 28.242 ms
8 209.17.64.166 (209.17.64.166) 29.123 ms 29.019 ms 29.554 ms
9 64.237.0.250 (64.237.0.250) 29.880 ms 29.672 ms 30.002 ms
10 www.914world.com (66.250.97.205) 29.429 ms 30.066 ms 29.213 ms
ping -c 7 www.914world.com
PING www.914world.com (66.250.97.205): 56 data bytes
64 bytes from 66.250.97.205: icmp_seq=0 ttl=48 time=29.910 ms
64 bytes from 66.250.97.205: icmp_seq=1 ttl=48 time=29.596 ms
64 bytes from 66.250.97.205: icmp_seq=2 ttl=48 time=29.345 ms
64 bytes from 66.250.97.205: icmp_seq=3 ttl=48 time=29.355 ms
64 bytes from 66.250.97.205: icmp_seq=4 ttl=48 time=29.073 ms
64 bytes from 66.250.97.205: icmp_seq=5 ttl=48 time=29.923 ms
64 bytes from 66.250.97.205: icmp_seq=6 ttl=48 time=29.339 ms
--- www.914world.com ping statistics ---
7 packets transmitted, 7 packets received, 0% packet loss
round-trip min/avg/max/stddev = 29.073/29.506/29.923/0.295 ms
Gint, that lag between hops 7 & 8 on your traceroute is the name lookup for that hop timing out. Use -n to skip the name lookup, and it will sail right past that. The reverse DNS for that 209 hop is having problems. You can't look it up with dig -x, either.
Interesting. Good data. Note that disk I/O (bi bo) is up substantially (over 600 regularly), and the CPU is, indeed, pegged. Not swapping as much as in earlier dumps.
This is probably entirely moot, since Andy is talking about moving the whole thing to a much better box, but...
My guess is the bottleneck is split between the DB server and Apache competing for CPU time, and a small amount of thrashing on RAM. I'd guess now that just adding RAM wouldn't make a huge difference. The disks are starting to get a bit busy. If sticking with the existing machine was a limitation, I'd next investigate the following:
What's the average query rate for MySQL? (Run mysqladmin stat; sleep 5; mysqladmin stat. Subtract the two "Questions" numbers and divide by 5; there's your queries per second.) What percentage of the HTTP requests are for images (would require a quick Perl script to parse some access logs)? What's the split between MySQL and Apache in CPU usage (top will tell you this)?
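The subtraction for the query rate is easy to script. A minimal sketch in Python, assuming the usual one-line mysqladmin status output that contains a "Questions: N" field; the function names and the sample counters are made up for illustration.

```python
import re


def questions(status_line: str) -> int:
    """Extract the cumulative Questions counter from mysqladmin status output."""
    match = re.search(r"Questions:\s*(\d+)", status_line)
    if match is None:
        raise ValueError("no Questions counter found")
    return int(match.group(1))


def queries_per_second(before: str, after: str, interval_s: float = 5.0) -> float:
    """Average query rate between two status readings taken interval_s apart."""
    return (questions(after) - questions(before)) / interval_s


# Invented readings, five seconds apart.
FIRST = "Uptime: 8734  Threads: 3  Questions: 1000000  Slow queries: 4"
SECOND = "Uptime: 8739  Threads: 3  Questions: 1000150  Slow queries: 4"
print(queries_per_second(FIRST, SECOND))  # 30.0
```

Sampling this at a quiet hour and again at a peak hour would show whether the database load actually tracks the slowdowns.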
After answering these questions, there are various configuration changes that could be made, most of them "free". However, throwing hardware at the problem is easier, and sounds like it's going to happen, anyway. If I had the luxury, what I'd probably do first is move the MySQL DB to a different box, and leave the site where it is.