
Decreasing throughput & increasing CPU when writing a huge file 20


Recommended Posts

Guest CrashHunter
Posted

I have a Windows Server 2003 x64 Enterprise Edition box with SP2 and 4 GB RAM, and an application writing a huge file.

The write throughput is quite good in the beginning (~33 MB/s), but it keeps decreasing while the CPU usage keeps increasing. In the beginning, the kernel CPU (both in Task Manager and the Processor\% Privileged Time counter in Performance Monitor) is pretty low, but it keeps increasing. Total CPU is ~50%, with the kernel taking ~8% in the beginning; later on, the total CPU reaches 80-90% with the kernel using almost all of it, and at that point the throughput is very low.

Other numbers: the System Cache (in Task Manager) quickly reaches 3.5 GB and stays at that value, but the counter that seems to be the problem is the Paged Pool, which keeps increasing.

In poolmon, I can see that the Mmst tag is the one that keeps increasing, and it does not free the memory unless I stop writing to the file. Before starting the process, Mmst uses 1.8 MB, while after 1h:20min it uses 200 MB (at that point, the average total CPU is 74%, with 48% in privileged mode).

I read some information about the paged pool (including KB304101), but most of it applies to the x86 version, which has a limited maximum pool size (around 460 MB). On my computer (x64) the size should not be a problem (120 GB is the maximum), and I do not get errors, but the performance steadily goes down even with the Paged Pool under 100 MB!

I do not have anything else running on this computer and the behavior is

reproducible every time. As soon as I stop the process, the System Cache and

Paged Pool memory usage go down, so there is no memory leak.

My application writes data to disk using the regular WriteFile API with an OVERLAPPED structure to write asynchronously. It writes a buffer, processes the next one, and then waits for the previous write to complete before issuing another write request.
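
To make the pattern concrete, here is a stripped-down sketch of that loop (the file name, buffer size, and iteration count are placeholders rather than the real values, and error handling is removed):

#include <windows.h>

int main()
{
    const DWORD kBufSize = 1 << 20;                   // 1 MB per write (placeholder size)
    static char buffers[2][1 << 20];                  // one buffer in flight, one being filled

    // FILE_FLAG_OVERLAPPED enables asynchronous WriteFile on this handle.
    HANDLE hFile = CreateFileW(L"D:\\huge.dat", GENERIC_WRITE, 0, NULL,
                               CREATE_ALWAYS, FILE_FLAG_OVERLAPPED, NULL);
    if (hFile == INVALID_HANDLE_VALUE) return 1;

    OVERLAPPED ov = {0};
    ov.hEvent = CreateEventW(NULL, TRUE, FALSE, NULL);
    ULONGLONG offset = 0;
    BOOL pending = FALSE;
    int current = 0;

    for (int i = 0; i < 1024; ++i)                    // ~1 GB in this sketch
    {
        FillMemory(buffers[current], kBufSize, 0xAB); // stand-in for "process the next buffer"

        if (pending)                                  // wait for the previous write to complete
        {
            DWORD written = 0;
            GetOverlappedResult(hFile, &ov, &written, TRUE);
        }

        ov.Offset     = (DWORD)(offset & 0xFFFFFFFF);
        ov.OffsetHigh = (DWORD)(offset >> 32);
        WriteFile(hFile, buffers[current], kBufSize, NULL, &ov);
        pending = TRUE;

        offset += kBufSize;
        current ^= 1;                                 // switch to the other buffer
    }

    if (pending)                                      // wait for the final write
    {
        DWORD written = 0;
        GetOverlappedResult(hFile, &ov, &written, TRUE);
    }

    CloseHandle(ov.hEvent);
    CloseHandle(hFile);
    return 0;
}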

I also tried using the FILE_FLAG_WRITE_THROUGH flag when opening the file; the general behavior is similar (increasing Paged Pool, increasing CPU usage in kernel mode, and decreasing throughput), with some differences: the starting throughput is much lower (~6 MB/s), and it goes down slightly more slowly than in the other scenario.
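
The only difference in that test is the flag combination passed to CreateFile (again just a sketch, with the same placeholder path as above):

// Write-through variant: WriteFile completes only once the data has been
// written through to the disk, not merely accepted by the system cache.
HANDLE hFile = CreateFileW(L"D:\\huge.dat", GENERIC_WRITE, 0, NULL,
                           CREATE_ALWAYS,
                           FILE_FLAG_OVERLAPPED | FILE_FLAG_WRITE_THROUGH,
                           NULL);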

Guest CrashHunter
Posted

RE: Decreasing throughput & increasing CPU when writing a huge file 20

 

I can reproduce the same behavior even with a simple tool that just writes random data to a file (256 GB). This tool only uses synchronous WriteFile. It takes a little longer than with the async one, but the behavior follows the same pattern: after running it for ~3h, the speed is ~7.5 MB/s, the processor time is 77% (of which 60% is in privileged mode), the total kernel memory is 384 MB (351 MB of it paged), and the Mmst pool uses 317 MB.

 

Any suggestion would be greatly appreciated.

Guest CrashHunter
Posted

RE: Decreasing throughput & increasing CPU when writing a huge fil

 


Another update (the simple app writing random data): after 6 hours, the average speed is ~5.5 MB/s, the processor time is 81.8% (of which 69% is in privileged mode), the total kernel memory is 510 MB (474 MB of it paged), and the Mmst pool uses 443.5 MB.

Guest Tony Sperling
Posted

Re: Decreasing throughput & increasing CPU when writing a huge file 20

 

I'm not really qualified to make assumptions - just guesses. If your HD(s) and subsystem are relatively modern and properly configured (it is many years since I've seen figures as low as 33 MB/s!) then I would suspect your application.

I don't like all this buffer shuffling; you shouldn't have to do that. If you know the size of the file, in my day you just created a file and filled it up (the system serves you the buffer it needs); if you don't know the size, you employ some recursive programming (do this - do that - do it all again until you're finished). I'm sorry, but it looks like you have been working hard to make a simple job infinitely more complicated, and succeeded. ;0)

 

But, I really don't feel qualified to pass judgements.

 

(What is your Hardware?)

 

What figures do you get if you run a HD benchmark like HDTach or HD Tune?

 

 

Tony. . .

Guest CrashHunter
Posted

Re: Decreasing throughput & increasing CPU when writing a huge fil

 


The hardware is pretty old; the specs are:
- one dynamic volume striped over 2 x 900 GB RAID5 disk subsystems
- HBA: QLogic QLA2340 PCI Fibre Channel adapter
- SAN: Metastore (emulating IBM 3526 0401) with 2 x LSI (INF-01-00) controllers with a total of 30 x 72 GB Seagate 10K SCSI disks

 

The HDTach results (for read, since it is the trial version):

- Random access: 10.5 ms

- CPU utilization: 1%

- Avg speed: 44.1 MB/s

 

The HD Tune results:
- Transfer rate (Min/Max/Avg): 13.7/24.2/20.6 MB/s
- Burst speed: 51.2 MB/s
- Access time: 10.2 ms

 

The results were similar for each of the two disk subsystems.

 

Regarding making a simple job complicated: I started from the existing implementation of our application and ended up with a simple application with just a loop generating random data and writing it to disk (plain synchronous WriteFile calls, no extra buffering or anything else).

 

I restarted the test writing to the local 250 GB WDC WD2500KS-00MJB0 SATA disk to rule out a potentially un-optimized SAN configuration and, again, the behavior follows the same pattern: the write started at ~22.5 MB/s with ~1% CPU in privileged mode; after writing ~50 GB, the speed dropped to 20.2 MB/s, with 7.9% CPU in privileged mode and a paged pool of 150 MB (~123 MB in the Mmst pool).

 

Again, the tool does something like:

while (size written < target size) {
    generate a buffer of random bytes
    write the buffer to the file
}
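
A self-contained equivalent of that loop (the file name, buffer size, and the cheap pseudo-random generator are placeholders, not the real tool) would be roughly:

#include <windows.h>

int main()
{
    const DWORD     kBufSize = 1 << 20;            // 1 MB per WriteFile call (placeholder)
    const ULONGLONG kTarget  = 256ULL << 30;       // roughly the 256 GB target of the test
    static unsigned char buffer[1 << 20];

    // Plain buffered, synchronous writes - no overlapped I/O, no special flags.
    HANDLE hFile = CreateFileW(L"D:\\random.dat", GENERIC_WRITE, 0, NULL,
                               CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (hFile == INVALID_HANDLE_VALUE) return 1;

    ULONGLONG total = 0;
    unsigned int seed = 1;
    while (total < kTarget)
    {
        // Fill the buffer with pseudo-random bytes (a simple LCG stands in
        // for whatever generator the real tool uses).
        for (DWORD i = 0; i < kBufSize; ++i)
        {
            seed = seed * 1664525u + 1013904223u;
            buffer[i] = (unsigned char)(seed >> 24);
        }

        DWORD written = 0;
        if (!WriteFile(hFile, buffer, kBufSize, &written, NULL))
            break;
        total += written;
    }

    CloseHandle(hFile);
    return 0;
}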

 

The only thing I can point to now is the OS; it either needs some fine-tuning or it has a problem...

 

 

 

"Tony Sperling" wrote:

> I'm not really qualified to make assumptions - just guesses. If your HD(s)

> and subsystem are relatively modern and properly configured ( it is many

> years since I've seen figures as low as 33MB/s!) then I would suspect your

> application.

>

> I don't like all this buffer shuffling, you shouldn't have to do that. If

> you know the size of the file, in my day you just created a file and filled

> it up (the system serves you the buffer it needs) if you don't know the size

> you employ some recursive programing (do this - do that - do it all again

> untill you're finished). I'm sorry, but it looks like you have been working

> hard to make a simple job infinitely more complicated, and succeded. ;0)

>

> But, I really don't feel qualified to pass judgements.

>

> (What is your Hardware?)

>

> What figures do you get if you run a HD benchmark like HDTach or HD Tune?

>

>

> Tony. . .

>

>

>

Guest CrashHunter
Posted

Re: Decreasing throughput & increasing CPU when writing a huge fil

 


I have some more news:
- I changed the PagedPoolSize to 2 GB and the PoolUsageMaximum to 5 (therefore a 100 MB ceiling), hoping to see a difference. Although these changes took effect (the Mmst pool was kept under 100 MB - about 98 MB), the trend was exactly the same! After writing 160 GB in about 3h:20m, the CPU utilization in privileged mode is over 60% and the speed has been constantly going down.
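
For reference, both values live under the Memory Management key described in KB304101; as a .reg file, the settings I used would look roughly like this (PagedPoolSize is in bytes, PoolUsageMaximum is a percentage, and a reboot is needed for them to take effect):

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management]
"PagedPoolSize"=dword:80000000
"PoolUsageMaximum"=dword:00000005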

The write is still going to the local SATA drive (an empty volume), with no other applications running (except the monitoring ones).

 

I am out of ideas here. Can anybody give me some (constructive) suggestions?

 

"CrashHunter" wrote:

> The hardware is pretty old; the specs are:

> - one Dynamic volume stripped over 2 x 900 GB RAID5 disk subsystems

> - HBA: QLogic QLA2340 PCI Fibre Channel Adapter

> - SAN: Metastore (emulating IBM 3526 0401) with 2 x LSI (INF-01-00)

> controllers with a total of 30 x 72GB Seagate 10K SCSI disks

>

> The HDTach results (for read, since it is the trial version):

> - Random access: 10.5 ms

> - CPU utilization: 1%

> - Avg speed: 44.1 MB/s

>

> The HDTune:

> - Transfer rate (Min/Max/Avg): 13.7/24.2/20.6

> - Burst speed: 51.2

> - Access Time: 10.2 ms

>

> The results were similar for each of the 2 disk subsystems

>

> In regards to making a simple job complicated, I started from the existing

> implementation of our application and ending using a simple application with

> just a loop, generating random data and writing it to the disk (just simple

> synchronous WriteFile calls, no extra buffering, or anything else)

>

> I restarted the test writing to the local 250GB WDC-WD2500KS-00MJB0 SATA

> disk to eliminate the potential un-optimized SAN configuration and, again,

> the behavior follows the same pattern: the write started with ~22.5 GB/s and

> ~1% CPU in privileged mode; after writing ~50GB, the speed dropped to 20.2

> MB/s, 7.9% CPU in privileged mode and paged pool of 150MB (~123MB in the Mmst

> pool).

>

> again the tools does something like

> while (size less than targeted one) {

> generate buffer with random bytes

> write buffer to file

> }

>

> The only thing I can point to now is the OS; it either needs some fine

> tunning or it has a problem...

>

>

Guest Tony Sperling
Posted

Re: Decreasing throughput & increasing CPU when writing a huge fil

 


As I said, I am no good in a server environment, but I would certainly expect an older HD system to be more of a bottleneck on a faster machine. The phenomenon of the data throughput slowing down is pretty much standard, I believe. I've never run a benchmark for the length of time you are employing, but a 30-40% slowdown over a few minutes is what I would expect on a standard IDE system. My own current SATA/RAID0 shows a nearly flat curve over a few minutes, hovering around 100 MB/s.

 

You might consider tweaking your machine's use of resources depending on whether you are running those tests in the foreground or background. I would make sure I had plenty of swap, and I would check the signal cables to those disks if they are of the same generation. Temperature might also be an issue with HDs working hard over long periods.

 

I also believe that servers have an option to tweak the system cache far more than the Pro editions I'm used to.

 

In short, what you are seeing may be quite natural - but you may be able to

beat more performance out of it.

 

(I suggest paying a visit to the Knowledge Base - search there for "system cache"; there are quite a few hits. Something might lead you further?)

 

 

Tony. . .

