#11 (permalink) |
On 02/22/2012 05:52 PM, Gene Wirchenko wrote:
> On Wed, 22 Feb 2012 16:03:41 -0500, Jeff Higgins
> <jeff@invalid.invalid> wrote:
>
>> On 02/22/2012 04:00 PM, Jeff Higgins wrote:
>>> On 02/22/2012 02:15 PM, Roedy Green wrote:
>>>>
>>>> Back before the Internet, I was pushing for what I call "Marthaing"
>>>> drives. We might get them any year now.
>>> Who is Martha? Back before the Internet I was advocating for squeezable
>>> catsup bottles. We have'em, but I haven't got a dime for'em.
>
>> <http://uncyclopedia.wikia.com/wiki/Ketchup_v._Catsup>
>
> I noticed that the tone is not as academic as Wikipedia.
>
Yep.
"Tomato ketchup is a pseudoplastic — or "shear thinning" substance — which can make it difficult to pour from a glass bottle."
|
|
|
#12 (permalink) |
On 2/22/12 3:24 PM, Jeff Higgins wrote:
> On 02/22/2012 05:52 PM, Gene Wirchenko wrote:
>> On Wed, 22 Feb 2012 16:03:41 -0500, Jeff Higgins
>> <jeff@invalid.invalid> wrote:
>>
>>> On 02/22/2012 04:00 PM, Jeff Higgins wrote:
>>>> On 02/22/2012 02:15 PM, Roedy Green wrote:
>>>>>
>>>>> Back before the Internet, I was pushing for what I call "Marthaing"
>>>>> drives. We might get them any year now.
>>>> Who is Martha? Back before the Internet I was advocating for squeezable
>>>> catsup bottles. We have'em, but I haven't got a dime for'em.
>>
>>> <http://uncyclopedia.wikia.com/wiki/Ketchup_v._Catsup>
>>
>> I noticed that the tone is not as academic as Wikipedia.
>>
> Yep.
> "Tomato ketchup is a pseudoplastic — or "shear thinning" substance —
> which can make it difficult to pour from a glass bottle."
>
Edible non-Newtonian fluids FTW
|
|
|
#13 (permalink) |
On Wed, 22 Feb 2012 14:27:17 -0800, Lew wrote:
> Modern hard drives, pretty much all of them, have a buffer and
> microprocessor as part of the hardware. We're not going to get any
> "Marthaing" as you describe it (wherever the heck /that/ term came from)
> because what they're already doing is already so effective.
>
> What they mostly do is collect read and write requests and combine them
> in elevator-seek order, along with full-track readahead. This optimizes
> disk access for single sweeps of the drive heads.
>
Agreed, and a mainframe OS I was using in the early '70s (ICL's George 3) was doing it back then, and very effective it is too for speeding up disk access. Back in the day it pushed the speed of the 2800 rpm, 60 MB washing-machine-sized disk drives up from around 8 accesses/sec to something like 20-30 per sec.

However, it's ineffective unless there are many active processes simultaneously requesting disk I/O. If all the requests come from one single-threaded process then it can't optimize head movement, because there's never more than one pending request at a time. I know this is reductio ad absurdum, but it does make the point that a small active process population is unlikely to be optimised as well as a large one.

This is relevant today for almost all single-user workstations, regardless of whether they are running Windows, Linux or OS X. Since the majority of applications run on these machines are single-threaded, about the only time you have more than one process accessing the disk is when the user is hammering away at a task, be it word processing, spreadsheet, browser or IDE, and the mail reader, sitting in the background, finds some mail waiting.

> The on-drive buffer
> also holds enough data for most reads and writes, overtaking any
> advantage that any (perforce extremely slow) physical re-ordering of the
> tracks could accomplish.
>
Yep, the on-drive buffer will almost always be capable of holding several physical tracks and, in addition, on a *NIX system anyway, all RAM not occupied by running processes and their data will contain disk buffers.

--
martin@   | Martin Gregorie
gregorie. | Es***, UK
org       |
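For anyone following along, here is a rough Java sketch of the elevator ordering described in this post. It is only an illustration: the class name, the cylinder numbers and the starting head position are made up, and real drive firmware or OS schedulers are far more involved. It compares the total head travel for a queue served in arrival order against the same queue served as one upward sweep followed by one downward sweep.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ElevatorOrder {
    /** Total head travel, in cylinders, when requests are served in the given order. */
    public static long travel(int head, List<Integer> order) {
        long total = 0;
        for (int cyl : order) {
            total += Math.abs(cyl - head);
            head = cyl;
        }
        return total;
    }

    /** Serve everything at or above the head in one upward sweep, then the rest on the way down. */
    public static List<Integer> elevator(int head, List<Integer> pending) {
        List<Integer> up = new ArrayList<>();
        List<Integer> down = new ArrayList<>();
        for (int cyl : pending) {
            if (cyl >= head) {
                up.add(cyl);
            } else {
                down.add(cyl);
            }
        }
        Collections.sort(up);                        // ascending for the upward sweep
        Collections.sort(down, Collections.reverseOrder()); // descending for the return sweep
        up.addAll(down);
        return up;
    }

    public static void main(String[] args) {
        int head = 50;                               // hypothetical starting cylinder
        List<Integer> queue = Arrays.asList(95, 10, 60, 5, 80, 20); // hypothetical pending requests
        System.out.println("FIFO travel:     " + travel(head, queue));
        System.out.println("Elevator travel: " + travel(head, elevator(head, queue)));
    }
}
```

With these made-up numbers the arrival order costs 370 cylinders of travel and the elevator order 135. With a single pending request the two orders are identical, which is the single-threaded case raised above.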
|
|
|
#14 (permalink) |
On 2/22/2012 8:31 PM, Martin Gregorie wrote:
> On Wed, 22 Feb 2012 14:27:17 -0800, Lew wrote:
>> Modern hard drives, pretty much all of them, have a buffer and
>> microprocessor as part of the hardware. We're not going to get any
>> "Marthaing" as you describe it (wherever the heck /that/ term came from)
>> because what they're already doing is already so effective.
>>
>> What they mostly do is collect read and write requests and combine them
>> in elevator-seek order, along with full-track readahead. This optimizes
>> disk access for single sweeps of the drive heads.
>>
> Agreed, and a mainframe OS I was using in the early '70s (ICL's George 3)
> was doing it back then and very effective it is too for speeding up disk
> access. Back in the day it pushed the speed of the 2800 rpm, 60 MB
> washing-machine sized disk drives up from around 8 accesses/sec to
> something like 20-30 per sec.
>
> However, it's ineffective unless there are many active processes
> simultaneously requesting disk i/o. If all the requests come from one
> single threaded process then it can't optimize head movement because
> there's never more than one pending request at a time. I know this is
> reductio ad absurdum, but it does make the point that a small active
> process population is unlikely to be optimised as well as a large one.
> This is relevant today for almost all single-user workstations
> regardless of whether they are running Windows, Linux or OS X. Since the
> majority of applications run on these machines are single threaded, about
> the only time you have more than one process accessing the disk is when
> the user is hammering away at a task, be it wordprocessing, spread-sheet,
> browser or IDE and the mail reader, sitting in the background, finds some
> mail waiting.

Most OSes support async I/O.

Arne
|
|
|
#15 (permalink) |
On Wed, 22 Feb 2012 18:24:06 -0500, Jeff Higgins
<jeff@invalid.invalid> wrote:

>On 02/22/2012 05:52 PM, Gene Wirchenko wrote:
>> On Wed, 22 Feb 2012 16:03:41 -0500, Jeff Higgins
>> <jeff@invalid.invalid> wrote:
>>
>>> On 02/22/2012 04:00 PM, Jeff Higgins wrote:
>>>> On 02/22/2012 02:15 PM, Roedy Green wrote:
>>>>>
>>>>> Back before the Internet, I was pushing for what I call "Marthaing"
>>>>> drives. We might get them any year now.
>>>> Who is Martha? Back before the Internet I was advocating for squeezable
>>>> catsup bottles. We have'em, but I haven't got a dime for'em.
>>
>>> <http://uncyclopedia.wikia.com/wiki/Ketchup_v._Catsup>
>>
>> I noticed that the tone is not as academic as Wikipedia.
>>
>Yep.
>"Tomato ketchup is a pseudoplastic — or "shear thinning" substance —
>which can make it difficult to pour from a glass bottle."

"Would you like fries with that?"

Sincerely,

Gene Wirchenko
|
|
|
#16 (permalink) |
On Wed, 22 Feb 2012 21:40:19 -0500, Arne Vajhøj wrote:
>
> Most OSes support async I/O.
>
Yes, I know, but it's not relevant to a single-threaded process, since its logic generally requires it to wait for a read or write to complete before it continues[1]. Hence my comment that this prevents head movement being optimized unless a lot of processes are active, because there's only one outstanding IOP per process.

[1] unless you're deliberately doing async I/O using poll() or select() (in C) or nio (in Java), in which case the process is often best regarded as a half-way house between single and multi-threaded logic.

--
martin@   | Martin Gregorie
gregorie. | Es***, UK
org       |
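To make the footnote concrete, here is a minimal Java sketch of one thread issuing several reads before waiting on any of them, using `AsynchronousFileChannel` from NIO.2 (Java 7). The class name, block size and temporary file are assumptions for illustration only. Whether the requests actually reach the drive's queue together depends on the JDK and OS: some JDK implementations service these reads from an internal thread pool rather than with native async I/O.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

public class OutstandingReads {
    /** Issues one read per block without waiting, then collects results. Returns total bytes read. */
    public static long readAllBlocks(Path file, int blockSize)
            throws IOException, InterruptedException, ExecutionException {
        try (AsynchronousFileChannel ch =
                 AsynchronousFileChannel.open(file, StandardOpenOption.READ)) {
            long size = ch.size();
            List<Future<Integer>> pending = new ArrayList<>();
            for (long pos = 0; pos < size; pos += blockSize) {
                // read() returns immediately; the request is now outstanding.
                pending.add(ch.read(ByteBuffer.allocate(blockSize), pos));
            }
            long total = 0;
            for (Future<Integer> f : pending) {
                total += f.get();   // the calling thread blocks only here, when collecting
            }
            return total;
        }
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("outstanding", ".bin"); // throwaway test file
        try {
            Files.write(tmp, new byte[64 * 1024]);
            System.out.println("bytes read: " + readAllBlocks(tmp, 4096));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

The point of the sketch is only that the caller has many requests in flight at once, which is the precondition for any reordering further down the stack.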
|
|
|
#17 (permalink) |
On 2/23/2012 3:16 PM, Martin Gregorie wrote:
> On Wed, 22 Feb 2012 21:40:19 -0500, Arne Vajhøj wrote:
>
>>
>> Most OSes support async I/O.
>>
> Yes, I know, but it's not relevant to a single-threaded process since its
> logic generally requires it to wait for a read or write to complete
> before it continues[1]. Hence my comment that this prevents head movement
> being optimized unless a lot of processes are active because there's only
> one outstanding IOP per process.
>
> [1] unless you're deliberately doing async i/o using poll() or
> select() (in C) or nio (in Java), in which case the process is often
> best regarded as a half-way house between single and multi-threaded
> logic.
>

There are some exceptions to this. For example, if you are reading a file sequentially, the OS may prefetch blocks you have not yet requested, and have multiple reads outstanding as a result.

Depending on the OS and how the I/O is being handled, a write may appear to be complete from the program's point of view once the data has been copied to a kernel buffer. The OS may be writing out modified blocks, including swap space blocks, at any time.

Patricia
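The write-side behaviour can be seen from Java with a small sketch like the one below. The class and method names are hypothetical. `FileChannel.write` typically returns once the OS has accepted the data, and `FileChannel.force(true)` is the call that asks the OS to push data and metadata to the storage device. Either way, a re-read through the file system sees the new contents, because it is served from the same OS buffers.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class BufferedWrite {
    /** Writes text, optionally forces it to the device, and returns what a re-read sees. */
    public static String writeThenReread(Path file, String text, boolean force)
            throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
            while (buf.hasRemaining()) {
                ch.write(buf);      // usually completes once the data is in OS buffers
            }
            if (force) {
                ch.force(true);     // request that data and metadata reach the device
            }
        }
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("buffered", ".txt"); // throwaway test file
        try {
            System.out.println(writeThenReread(tmp, "record 1", false));
            System.out.println(writeThenReread(tmp, "record 2", true));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

The program cannot observe the difference between the two calls by re-reading; the difference only matters for what survives a crash or power loss, and for when the physical write is scheduled.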
|
|
|
#18 (permalink) |
Martin Gregorie wrote:
> Unlike some, I take a good deal of interest in what my machines are up
> to, so I was quoting what I see using top on my Linux systems. During
> normal operation there is very little activity on my laptop except from
> the programs I'm actively using unless, as you say, logwatch/smartd/
> rkhunter/updatedb get run by atd, but on a reasonably quick machine they
> don't run for long.
>
> Of course, the house server is a different case, since it has several
> 24/7 services on it, but again its only heavy, continuous disk activity
> is overnight when it runs backups/logwatch/smartd/updatedb. Apart from
> that requests that wake up Postfix/Spamassassin/Apache/or ftpd/sshd are
> pretty sporadic and the disk LED flashes are best described as
> intermittent.

Sounds like disk optimizations would help that system.

> The longest continuously busy time on either machine is during backups
> and even there there's precious little contention from rsync or tar+gzip
> since the only stuff being written to the disk it's reading from are
> backup logs. Same applies to software update sessions. To the best of my
> knowledge (and watching top) none of yum, rpm, tar, gzip or rsync are
> multi-threaded: rsync is probably using poll() based async i/o but from
> top and observed behaviour none of the others seem to do that. In fact
> the only long-running programs on my systems that I know to be multi-
> threaded are Apache, Postgres, SA and Postfix.

Now /that/ is objective evidence.

In your particular case you have no need of optimization of your disk processes. You don't mention it, but by omission I will grant you that virtual memory on your system does not seriously contend for disk either.

But a typical consumer scenario is to listen to a stream while surfing the web on Windows with several chat windows open, causing multiple disk I/O ops on a constant basis of themselves and also putting pressure on virtual memory. Even such a single-user system can benefit from elevator seeking and on-disk buffers.

Consider also that burstiness of demand does not argue against the need for optimization, really. During bursts the optimization helps, and a user might complain if their disks got weird once an hour.

Regardless, if you don't need optimization why worry? It's like the Pope comparing brands of condoms.

Again, we don't excoriate the value of optimizations by citing examples where optimization isn't needed. We evaluate optimizations by how useful they are when they are needed.

--
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedi.../c/cf/Friz.jpg
|
|
|
#19 (permalink) |
On Thu, 23 Feb 2012 21:45:33 -0800, Patricia Shanahan wrote:
> On 2/23/2012 3:16 PM, Martin Gregorie wrote:
>> On Wed, 22 Feb 2012 21:40:19 -0500, Arne Vajhøj wrote:
>>
>>
>>> Most OSes support async I/O.
>>>
>> Yes, I know, but it's not relevant to a single-threaded process since
>> its logic generally requires it to wait for a read or write to complete
>> before it continues[1]. Hence my comment that this prevents head
>> movement being optimized unless a lot of processes are active because
>> there's only one outstanding IOP per process.
>>
>> [1] unless you're deliberately doing async i/o using poll() or
>> select() (in C) or nio (in Java), in which case the process is
>> often best regarded as a half-way house between single and
>> multi-threaded logic.
>>
>>
>>
> There are some exceptions to this. For example, if you are reading a
> file sequentially, the OS may prefetch blocks you have not yet
> requested, and have multiple reads outstanding as a result.
>
Fair point, and I've seen blinding speed from reads where the disk drivers used track reads, but it doesn't affect my point that there's still only one I/O request in the queue per active single-threaded process. Head movement optimisation is simply sidestepped in this case.

> Depending on the OS and how the IO is being handled, a write may appear
> to be complete from the program's point of view once the data has been
> copied to a kernel buffer. The OS may be writing out modified blocks,
> including swap space blocks, at any time.
>
Again agreed: it's fair to regard a write as complete from the program's POV as soon as it can reread the block/record, something that many indexed sequential access schemes need to do to re-establish a 'current record' pointer.

--
martin@   | Martin Gregorie
gregorie. | Es***, UK
org       |
|
|
|
#20 (permalink) |
On Fri, 24 Feb 2012 10:50:39 -0800, Lew wrote:
> Martin Gregorie wrote:
>> Unlike some, I take a good deal of interest in what my machines are up
>> to, so I was quoting what I see using top on my Linux systems. During
>> normal operation there is very little activity on my laptop except from
>> the programs I'm actively using unless, as you say, logwatch/smartd/
>> rkhunter/updatedb get run by atd, but on a reasonably quick machine
>> they don't run for long.
>>
>> Of course, the house server is a different case, since it has several
>> 24/7 services on it, but again its only heavy, continuous disk activity
>> is overnight when it runs backups/logwatch/smartd/updatedb. Apart from
>> that requests that wake up Postfix/Spamassassin/Apache/or ftpd/sshd are
>> pretty sporadic and the disk LED flashes are best described as
>> intermittent.
>
> Sounds like disk optimizations would help that system.
>
Probably not - they are all cron jobs and hence get run sequentially.

>> The longest continuously busy time on either machine is during backups
>> and even there there's precious little contention from rsync or tar+gzip
>> since the only stuff being written to the disk it's reading from are
>> backup logs. Same applies to software update sessions. To the best of
>> my knowledge (and watching top) none of yum, rpm, tar, gzip or rsync
>> are multi-threaded: rsync is probably using poll() based async i/o but
>> from top and observed behaviour none of the others seem to do that. In
>> fact the only long-running programs on my systems that I know to be
>> multi- threaded are Apache, Postgres, SA and Postfix.
>
> In your particular case you have no need of optimization of your disk
> processes. You don't mention it but by omission I will grant you that
> virtual memory on your system does not seriously contend for disk
> either.
>
Well spotted. My type of load almost never swaps. That was the case with the old 512 MB RAM box and is doubly true with its replacement (4 GB RAM), but that still doesn't stop me setting swap space at twice RAM.

In fact the only program I have that does use gobs of RAM is a JavaMail + Postgres app, and I'm not sure if it's a problem due to JavaMail's queueing or if I've got overly long-lived Object instances. Tracking this down is on my to-do list. All I know at present is that the same program using the same JVM uses gobs more RAM on the new machine (which is 6 times faster as well as having 8x more RAM), so it might simply be a case of persuading the GC to run more often.

> But a typical consumer scenario is to listen to a stream while
> surfing the web on Windows with several chat windows open, causing
> multiple disk IO ops on a constant basis of themselves and also putting
> pressure on virtual memory. Even such a single-user system can benefit
> from elevator seeking and on-disk buffers.
>
I'm not saying head movement optimisation is a bad thing, just that it can be difficult to get enough queued requests for it to work without a large population of active processes that all do a lot of disk accesses.

You may well be right about the typical consumer setup. I lack any experience of that: all I understand is the pattern that my own use generates. However, I would point out that streamed music or video may never touch the disk (though of course a torrent will). The amount of disk I/O due to chat/IM/Twitter/web browsers may be less than we'd expect because (a) it's very bursty and (b) disk I/O time is vastly outweighed by human reading and typing time.

> Consider also that burstiness of demand does not argue against the need
> for optimization, really. During bursts the optimization helps, and a
> user might complain if their disks got weird once an hour.
>
Sure, but the user's activity tends to be interaction with one program at a time, which may well be single-threaded, for a few minutes before switching to another. This tends to produce widely separated bursts of I/O from one or two processes.

> Regardless, if you don't need optimization why worry? It's like the Pope
> comparing brands of condoms.
>
Like it!

> Again, we don't excoriate the value of optimizations by citing examples
> where optimization isn't needed. We evaluate optimizations by how useful
> they are when they are needed.
>
I wasn't intending to do that, having seen just how well head scheduling works. I merely intended to point out that there are corner cases where such algorithms don't help - but are not a hindrance either.

--
martin@   | Martin Gregorie
gregorie. | Es***, UK
org       |
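The claim that head scheduling needs enough queued requests to pay off can be illustrated with a toy Java simulation. Everything here is an assumption made for the sketch: requests are uniformly random cylinders, they arrive in fixed-size batches equal to the queue depth, and each batch is served in one sweep with the direction alternating between batches. Real schedulers accept requests continuously and real workloads are not uniform.

```java
import java.util.Arrays;
import java.util.Random;

public class QueueDepthSim {
    /** Average head travel per request when `depth` requests at a time are sorted and swept. */
    public static double averageTravel(int depth, int requests, int cylinders, long seed) {
        Random rnd = new Random(seed);   // fixed seed so runs are repeatable
        int head = 0;
        long total = 0;
        int served = 0;
        boolean sweepUp = true;
        while (served < requests) {
            int n = Math.min(depth, requests - served);
            int[] batch = new int[n];
            for (int i = 0; i < n; i++) {
                batch[i] = rnd.nextInt(cylinders);
            }
            Arrays.sort(batch);
            for (int i = 0; i < n; i++) {
                // Ascending on an upward sweep, descending on the way back.
                int cyl = sweepUp ? batch[i] : batch[n - 1 - i];
                total += Math.abs(cyl - head);
                head = cyl;
            }
            sweepUp = !sweepUp;
            served += n;
        }
        return (double) total / requests;
    }

    public static void main(String[] args) {
        for (int depth : new int[] {1, 2, 4, 8, 16, 32}) {
            System.out.printf("depth %2d: %.1f cylinders/request%n",
                    depth, averageTravel(depth, 10_000, 1_000, 42L));
        }
    }
}
```

At depth 1 sorting has nothing to work with, so travel is just the distance between consecutive random cylinders; as depth grows, each sweep is shared among more requests and the per-request travel drops. That matches both sides of the discussion: the optimization is harmless with one outstanding request and increasingly useful as the queue fills.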
|