Saturday, February 04, 2006

Understanding memory usage on Linux

This entry is for those people who have ever wondered, "Why the hell is a simple KDE text editor taking up 25 megabytes of memory?" Many people are led to believe that many Linux applications, especially KDE or Gnome programs, are "bloated" based solely upon what tools like ps report. While this may or may not be true, depending on the program, it is not generally true -- many programs are much more memory efficient than they seem.

What ps reports
The ps tool can output various pieces of information about a process, such as its process id, current running state, and resource utilization. Two of the possible outputs are VSZ and RSS, which stand for "virtual memory size" and "resident set size". Geeks around the world commonly use them to see how much memory processes are taking up.

For example, here is the output of ps aux for KEdit on my computer:

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
dbunker 3468 0.0 2.7 25400 14452 ? S 20:19 0:00 kdeinit: kedit

According to ps, KEdit has a virtual size of about 25 megabytes and a resident size of about 14 megabytes (both numbers above are reported in kilobytes). Most people seem to pick one number or the other, more or less at random, as representing the real memory usage of a process. I'm not going to explain the difference between VSZ and RSS right now but, suffice it to say, this is the wrong approach; neither number is an accurate picture of what the memory cost of running KEdit is.
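
To make the units concrete, here is a minimal Python sketch that pulls VSZ and RSS out of the ps aux line shown above and converts them from kilobytes to megabytes. The function name parse_ps_line is made up for illustration, and the sketch assumes the usual 11-column ps aux layout.

```python
def parse_ps_line(line):
    """Split one `ps aux` row and return its PID, VSZ, RSS, and command.

    Assumes the standard 11-column `ps aux` layout; VSZ and RSS are in KB.
    """
    # COMMAND may contain spaces, so stop splitting after the tenth field.
    fields = line.split(None, 10)
    return {
        "pid": int(fields[1]),
        "vsz_kb": int(fields[4]),
        "rss_kb": int(fields[5]),
        "command": fields[10],
    }


row = parse_ps_line(
    "dbunker 3468 0.0 2.7 25400 14452 ? S 20:19 0:00 kdeinit: kedit"
)
vsz_mb = row["vsz_kb"] / 1024
rss_mb = row["rss_kb"] / 1024
print(f"VSZ {vsz_mb:.1f} MB, RSS {rss_mb:.1f} MB")
```

Neither figure printed here is the "real" cost of the process; the sketch only shows where the 25 and 14 megabyte numbers come from.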

Why ps is "wrong"
Depending on how you look at it, ps is not reporting the real memory usage of processes. What it is really doing is showing how much real memory each process would take up if it were the only process running. Of course, a typical Linux machine has several dozen processes running at any given time, which means that the VSZ and RSS numbers reported by ps are almost definitely "wrong". In order to understand why, it is necessary to learn how Linux handles shared libraries in programs.

Most major programs on Linux use shared libraries to facilitate certain functionality. For example, a KDE text editing program will use several KDE shared libraries (to allow for interaction with other KDE components), several X libraries (to allow it to display images and to copy and paste), and several general system libraries (to allow it to perform basic operations). Many of these shared libraries, especially commonly used ones like libc, are used by many of the programs running on a Linux system. Due to this sharing, Linux is able to use a great trick: it will load a single copy of the shared libraries into memory and use that one copy for every program that references it.

For better or worse, many tools don't care very much about this very common trick; they simply report how much memory a process uses, regardless of whether that memory is shared with other processes as well. Two programs could therefore use a large shared library and yet have its size count towards both of their memory usage totals; the library is being double-counted, which can be very misleading if you don't know what is going on.
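
The double-counting is easy to see with a little arithmetic. The following Python sketch uses invented sizes and process names (they are not measurements from any real system) to compare what a naive per-process report adds up to against what is actually held in memory.

```python
# Hypothetical sizes in KB; the names and numbers are illustrative only.
shared_lib_kb = 3000                     # one library mapped by both processes
private_kb = {"editor": 500, "browser": 800}

# A naive per-process report charges the whole library to every process.
reported = {name: priv + shared_lib_kb for name, priv in private_kb.items()}
naive_total = sum(reported.values())

# Only one copy of the library is really in memory.
actual_total = sum(private_kb.values()) + shared_lib_kb

print("reported per process:", reported)
print("sum of reported sizes:", naive_total, "KB")
print("actually in memory:   ", actual_total, "KB")
print("double-counted:       ", naive_total - actual_total, "KB")
```

With two processes the overcount equals one copy of the library; with N processes sharing it, the overcount grows to N - 1 copies.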

Unfortunately, a perfect representation of process memory usage isn't easy to obtain. Not only do you need to understand how the system really works, but you need to decide how you want to deal with some hard questions. Should a shared library that is only needed for one process be counted in that process's memory usage? If a shared library is used by multiple processes, should its memory usage be evenly distributed among the different processes, or just ignored? There isn't a hard and fast rule here; you might have different answers depending on the situation you're facing. It's easy to see why ps doesn't try harder to report "correct" memory usage totals, given the ambiguity.
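
The choices described above can be written down as three accounting policies. This is only a sketch: the function name attribute, the policy names, and the example numbers are assumptions made for illustration, not something ps or any other tool implements in this form.

```python
def attribute(private_kb, shared_kb, users, policy):
    """Return the KB charged to one process under a given accounting policy."""
    if policy == "full":      # charge the whole library to every user
        return private_kb + shared_kb
    if policy == "split":     # divide the library evenly among its users
        return private_kb + shared_kb / users
    if policy == "ignore":    # count only the process's private memory
        return private_kb
    raise ValueError(f"unknown policy: {policy}")


# Hypothetical process: 2432 KB private, one 3108 KB library shared by 4 users.
for policy in ("full", "split", "ignore"):
    charged = attribute(private_kb=2432, shared_kb=3108, users=4, policy=policy)
    print(f"{policy:>6}: {charged} KB")
```

None of the three answers is wrong; each one answers a different question about the same process.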

Seeing a process's memory map
Enough talk; let's see what the situation is with that "huge" KEdit process. To see what KEdit's memory looks like, we'll use the pmap program (with the -d flag):

Address Kbytes Mode Offset Device Mapping
08048000 40 r-x-- 0000000000000000 0fe:00000 kdeinit
08052000 4 rw--- 0000000000009000 0fe:00000 kdeinit
08053000 1164 rw--- 0000000008053000 000:00000 [ anon ]
40000000 84 r-x-- 0000000000000000 0fe:00000 ld-2.3.5.so
40015000 8 rw--- 0000000000014000 0fe:00000 ld-2.3.5.so
40017000 4 rw--- 0000000040017000 000:00000 [ anon ]
40018000 4 r-x-- 0000000000000000 0fe:00000 kedit.so
40019000 4 rw--- 0000000000000000 0fe:00000 kedit.so
40027000 252 r-x-- 0000000000000000 0fe:00000 libkparts.so.2.1.0
40066000 20 rw--- 000000000003e000 0fe:00000 libkparts.so.2.1.0
4006b000 3108 r-x-- 0000000000000000 0fe:00000 libkio.so.4.2.0
40374000 116 rw--- 0000000000309000 0fe:00000 libkio.so.4.2.0
40391000 8 rw--- 0000000040391000 000:00000 [ anon ]
40393000 2644 r-x-- 0000000000000000 0fe:00000 libkdeui.so.4.2.0
40628000 164 rw--- 0000000000295000 0fe:00000 libkdeui.so.4.2.0
40651000 4 rw--- 0000000040651000 000:00000 [ anon ]
40652000 100 r-x-- 0000000000000000 0fe:00000 libkdesu.so.4.2.0
4066b000 4 rw--- 0000000000019000 0fe:00000 libkdesu.so.4.2.0
4066c000 68 r-x-- 0000000000000000 0fe:00000 libkwalletclient.so.1.0.0
4067d000 4 rw--- 0000000000011000 0fe:00000 libkwalletclient.so.1.0.0
4067e000 4 rw--- 000000004067e000 000:00000 [ anon ]
4067f000 2148 r-x-- 0000000000000000 0fe:00000 libkdecore.so.4.2.0
40898000 64 rw--- 0000000000219000 0fe:00000 libkdecore.so.4.2.0
408a8000 8 rw--- 00000000408a8000 000:00000 [ anon ]
... (trimmed) ...
mapped: 25404K writeable/private: 2432K shared: 0K

I cut out a lot of the output; the rest is similar to what is shown. Even without the complete output, we can see some very interesting things. One important thing to note about the output is that each shared library is listed twice; once for its code segment and once for its data segment. The code segments have a mode of "r-x--", while the data is set to "rw---". The Kbytes, Mode, and Mapping columns are the only ones we will care about, as the rest are unimportant to the discussion.

If you go through the output, you will find that the lines with the largest Kbytes number are usually the code segments of the included shared libraries (the ones that start with "lib" are the shared libraries). What is great about that is that they are the ones that can be shared between processes. If you factor out all of the parts that are shared between processes, you end up with the "writeable/private" total, which is shown at the bottom of the output. This is what can be considered the incremental cost of this process, factoring out the shared libraries. Therefore, the cost to run this instance of KEdit (assuming that all of the shared libraries were already loaded) is around 2 megabytes. That is quite a different story from the 14 or 25 megabytes that ps reported.
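
As a rough illustration of how that total comes about, the Python sketch below sums the Kbytes column over a few rows copied from the listing above. It assumes that a mapping counts as writeable/private when its Mode contains "w" and no "s" (shared) flag; the function name summarize is made up, and because only five rows are included the result is far smaller than the real 2432K.

```python
# Five rows copied from the pmap -d listing above (the rest is omitted).
PMAP_EXCERPT = """\
08048000 40 r-x-- 0000000000000000 0fe:00000 kdeinit
08052000 4 rw--- 0000000000009000 0fe:00000 kdeinit
08053000 1164 rw--- 0000000008053000 000:00000 [ anon ]
4006b000 3108 r-x-- 0000000000000000 0fe:00000 libkio.so.4.2.0
40374000 116 rw--- 0000000000309000 0fe:00000 libkio.so.4.2.0
"""


def summarize(pmap_text):
    """Return (mapped_kb, writable_private_kb) for pmap -d style rows."""
    mapped = writable_private = 0
    for line in pmap_text.splitlines():
        parts = line.split(None, 5)
        # Skip headers, the summary line, and anything else that is not a row.
        if len(parts) < 6 or not parts[1].isdigit():
            continue
        kbytes, mode = int(parts[1]), parts[2]
        mapped += kbytes
        # Assumption: writable and not flagged shared means private to us.
        if "w" in mode and "s" not in mode:
            writable_private += kbytes
    return mapped, writable_private


mapped_kb, private_kb = summarize(PMAP_EXCERPT)
print(f"mapped: {mapped_kb}K  writeable/private: {private_kb}K")
```

Even in this small excerpt, the single read-only code segment of libkio accounts for most of the mapped total, while the writable rows add up to a much smaller number.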

What does it all mean?
The moral of this story is that process memory usage on Linux is a complex matter; you can't just run ps and know what is going on. This is especially true when you deal with programs that create a lot of identical child processes, like Apache. ps might report that each Apache process uses 10 megabytes of memory, when the reality might be that the marginal cost of each Apache process is 1 megabyte of memory. This information becomes critical when tuning Apache's MaxClients setting, which determines how many simultaneous requests your server can handle (although see one of my past postings for another way of increasing Apache's performance).
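
To see how much the difference matters, here is a back-of-the-envelope Python sketch. The 512 MB memory budget is hypothetical, as is the assumption that the 9 MB gap between the reported and marginal figures is shared memory paid for once; a real MaxClients value should also leave room for the rest of the system.

```python
def max_clients(ram_budget_mb, shared_mb, per_process_mb):
    """How many worker processes fit if shared pages are paid for only once."""
    return int((ram_budget_mb - shared_mb) // per_process_mb)


ram_budget_mb = 512   # hypothetical memory set aside for Apache

# Trusting ps: every process is assumed to cost the full 10 MB.
naive = max_clients(ram_budget_mb, shared_mb=0, per_process_mb=10)

# Marginal view: about 9 MB is shared once, and each process adds about 1 MB.
realistic = max_clients(ram_budget_mb, shared_mb=9, per_process_mb=1)

print("estimate from ps numbers:     ", naive)
print("estimate from marginal cost:  ", realistic)
```

The two estimates differ by roughly a factor of ten, which is why reading ps at face value can lead to a very conservative setting.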

It also shows that it pays to stick with one desktop's software as much as possible. If you run KDE for your desktop, but mostly use Gnome applications, then you are paying a large price for a lot of redundant (but different) shared libraries. By sticking to just KDE or just Gnome apps as much as possible, you reduce your overall memory usage due to the reduced marginal memory cost of running new KDE or Gnome applications, which allows Linux to use more memory for other interesting things (like the file cache, which speeds up file accesses immensely).

112 Comments:

At 9:10 PM, Blogger nikhil said...

Nice explanation of Linux processes and memories, It will help to dispel many of KDE's 'bloated' myths

 
At 9:25 PM, Anonymous Matt Darby said...

Very nice indeed. I've never heard of the 'pmap' program. Looks like I have some research to do. Thanks for the write up!

 
At 12:11 AM, Anonymous Anonymous said...

This completely ignores the fact that each of those shared libraries takes up a number of virtual to physical page mappings. These, unlike memory, are a very precious resource. Just because people tend to ignore this, doesn't mean that shared libraries don't have a high price. Try running something like KDE on non x86 hardware that doesn't have huge translation tables. On mips hardware, I've seen apps spend over 30% of their time updating tlb entries.

 
At 12:32 AM, Anonymous Anonymous said...

Non x86 hardware? Ok, cool, you go play with your oldish Powerbook, or whatevs system. They're coding for a very concise user base...the KDE team that is. They are called the "everyday user" and that consists of 99% x86 and 1% everything else.

If this, and if that's aside...isn't x86 all that is relevant for desktop use? If you are producing weather calcs. you don't need a desktop at all.

 
At 2:15 AM, Anonymous Anonymous said...

non x86 hardware , you know ... like the xbox 360 and almost every other console out there ...

 
At 6:12 AM, Anonymous Anonymous said...

One of the most widely hacked machines, the original X-box, runs on x86 hardware...

 
At 6:12 AM, Anonymous Anonymous said...

Even though non-x86 hardware makes up a minority it's still important. A huge amount of embedded Linux systems use ARM or MIPS CPUs. If no one considered the importance of non-x86 hardware embedded Linux definitely wouldn't be where it is today.

 
At 6:16 AM, Anonymous Anonymous said...

And how many xbox 360 owners do you know that run KDE?

 
At 6:22 AM, Anonymous Anonymous said...

A huge amount of embedded Linux systems use ARM or MIPS CPUs.

Who would use KDE in an embedded system?

 
At 6:25 AM, Blogger nixcraft said...

Excellent article; I am going to recommend to all our group members :)

Keep it up good work.

 
At 6:26 AM, Blogger Rodrigo said...

really good post..
and to think i have always
used ps -blah to view
resources consumption..

Thanks
Rodri

 
At 6:32 AM, Anonymous Anonymous said...

A huge amount of embedded Linux systems use ARM or MIPS CPUs

What embedded system will be running a full desktop environment like KDE or GNOME? It'd put the hardware and power consumption requirements of your device through the roof -KDE and GNOME are designed for desktop/laptop systems. If you want a GUI on an embedded system you'd use something like QTopia.

 
At 6:35 AM, Anonymous Anonymous said...

The ability to use a single physical copy of a shared library isn't a great Linux trick - it's a common feature of just about every OS that uses shared libraries. Btw, the same principle applies to executables as well. For instance, if you have several xterms running, you will only have one copy of the xterm executable program text in memory, which will be shared between all instances (obviously, each instance will have its own stack and heap).

 
At 6:37 AM, Anonymous Anonymous said...

Nice explanation, but Linux still suffers from Code Bloat. It used to run nicely on oldish systems, in fact that's how it caught on. Now you really need state of the art X86 hardware. I don't think this is a good idea - there is already a system optimized for those 99% of "other" users.

 
At 6:47 AM, Anonymous Anonymous said...

Excellent article!!!!
Explains a lot why all those programs I wrote seem to take up more memory than I allocated.
'ps' is a bit crap then? can anyone recommend any other programs that read 'actual' memory usage?
Why is it in the kernel if it's so inaccurate.
Oh hum :p

 
At 7:00 AM, Anonymous Emsi Guru said...

You guys seem to forget that every program uses text and data segments. In the very simple example text is the program code, which is shared, in a well designed OS like Linux, among all instances of a given process. The data like the stack or locally assigned memory (malloced) is different for every process. It's very important to not forget about the data, as a badly written editor (pick any GUI editor) might allocate a huge amount of memory just to open a simple logfile which just happens to be 500MB and this is just plain wrong!

 
At 7:06 AM, Anonymous Anonymous said...

Price of fragmentation is also something to consider. A well-behaved app can still end up eating many, many MB of memory that it isn't actually using.

For example, say the app needs to make a number of large allocations for some temporary working space for some operation. It also needs to allocate some bookkeeping information (undo history, for example) and other bits of info to store/utilize the computed data.

When allocations are made, the heap grows. However, the heap can only shrink by truncation. That is, the program can only release memory back to the OS by shrinking the heap, not by cutting it into pieces. So if the app makes many allocations, and frees only some (or even most) of them, its heap may end up looking like:

X: free memory
U: used memory
UUUXXXXXXXXXXXXU

That's four pages of used memory and 12 pages of unused memory that can't be freed.

Granted, if the app needs to make any more allocations, it can use all of that unused heap. But if it isn't making any more allocations anytime soon, it'll appear to be gobbling up memory. (Which it is.)

With Linux, the problem isn't quite as bad since the unused pages will be swapped out to disk, so you won't run out of memory. However, bad performance will now arise as the VM is swapping out completely unused chunks of memory, and when the app needs to reallocate memory it'll swap back in a bunch of garbage/unused data.

This phenomenon is one of the many reasons why good modern garbage collectors are far more efficient than manual memory management with malloc/free. A compacting collector will completely negate the above problem.

 
At 7:12 AM, Anonymous Anonymous said...

VSZ is indeed very important for those of us running our servers with overcommit turned off. Without overcommit, actual physical memory usage is not nearly as important as virtual memory usage.

 
At 7:14 AM, Anonymous Anonymous said...

What embedded system will be running a full desktop environment like KDE or GNOME?

True, but this article does not limit itself to just KDE and Gnome. I suspect you missed the part about Apache. At the very base, it is good to know that several different server applications use shared libraries, thus making their ps foot print larger than reality. And within those apps, child processes or threads use the same libraries, so on and so forth.

This seems pretty useful information to have handy when deciding if a little embedded unit will need 4MB or 8MB of RAM.

 
At 7:18 AM, Anonymous Anonymous said...

Why bother with pmap? 'ps' gives basically the same answer, if you're not addicted to the BSD format. Look for the definition of the 'SZ' column in the man page. It's overly technical, but it boils down to the writable/private data that pmap gave, and you can see the whole list at once. For example

ps -lyu<user>

 
At 7:34 AM, Anonymous Anonymous said...

Good explanation and helpful. I'll use some of the tips from now on. Still have problems with Debian Sarge Konqueror leaking memory. Normally have about a dozen Konqueror windows open, with anywhere from a dozen to several dozen tabs open each. As time progresses (a month and up), and as more tabs are closed then reopened to other new web pages, both virtual and resident set sizes increase. With 512 MB of system memory and other processes running, the system runs well by keeping swap usage below 1 GB (below 900 MB is better), and by starting a new konqueror process and copying the open tabs from a konqueror process whose virtual memory has crept above 100 MB or more, then killing the process once all open tabs have been copied over to the new konqueror process. Since the new Konqueror process hasn't had many tabs opened and then closed, but only has all the tabs from the older Konqueror process opened, the virtual memory usage of the newer konqueror process is far less than 100 MB. With this new konqueror process running, as I open new tabs to visit other web sites, then close them and open other new ones, then virtual memory as indicated in top will slowly continue to rise. Monitoring top, as additional konqueror processes get over 100 MB, it becomes time to start a new konqueror process and copy over the open tabs to free up memory from the old konqueror process.

By doing this, it allows keeping many tabs open (news sites, etc.) to allow time for analysis and later saving even if the original page on the web site itself changes. And it allows uptimes measured in hundreds of days between rebooting (current uptime is 116 days, add about another 200 days if you exclude a reboot to pull a failed drive from a raid array.)

As an example, I have a konqueror process currently using 212 MB of virtual memory, and two more that just exceeded 100 MB each. By copying the open tabs of the 212 MB konqueror process to a new konqueror process, I can get the new konqueror process to below 60-80 MB or so, and once I kill the 212 MB konqueror process, I can see the entire 212 MB of virtual memory deduct itself from disk cache, temporarily propagate to free memory, then disappear or propagate a bit to cached and buffers as the memory usage readjusts itself to a new equilibrium. So whether virtual memory is accurate or not in ps/top, I see most or all of it deduct itself from disk cache when the leaky konqueror process is finally killed.

 
At 7:36 AM, Anonymous Anonymous said...

Nice explanation, but Linux still suffers from Code Bloat. It used to run nicely on oldish systems, in fact that's how it caught on. Now you really need state of the art X86 hardware. I don't think this is a good idea - there is already a system optimized for those 99% of "other" users.

Hm, sure. Is that why it's running better than windows on 128MB of RAM on a 650MHz P3? Think not.

KDE runs fine, if a little laggy with antialiasing on (but I put that down to the 8MB ATi Rage card, not the software). Quite frankly, this myth that Linux is suddenly not an old box OS is nonsense. Slackware still supports machines with 4MB of RAM. Eat that, Windblows.

And as for the single copy of shared libs in memory: bet you Windows doesn't do it. Mainly because if you run two copies of IE, both suck up about 30MB of RAM. Either their code is so bloated that even with shared libs it takes that much RAM per IE instance, or they're using separate copies of the libs.

 
At 7:39 AM, Anonymous Anonymous said...

Uhm, isn't this the general UNIX-like memory management? What's the fuzz about Linux?

This is so, totally, old news.

Nils

 
At 7:42 AM, Anonymous Anonymous said...

Shared libraries only share backing store if the code can be run without modification - i.e., it was compiled as position-independent code. But if the code has to be modified as it's loaded, it will require its own (probably anonymous memory) swap pages.

And as noted earlier, it will still need virtual pages either way.

(PS - if you're getting hammered by TLB misses, try a larger page size, if possible)

 
At 7:42 AM, Anonymous Anonymous said...

Nice explanation, but Linux still suffers from Code Bloat. It used to run nicely on oldish systems, in fact that's how it caught on. Now you really need state of the art X86 hardware.


Really? Funny - I spent SuperBowl Sunday sitting on a couch with friends, and with a 366MHz Celeron laptop with 256MB of RAM, editing some photos I had shot in the woods earlier that day and reading up on some JSP stuff via Firefox.

IceWM and Rox-Filer ran fine, and I even ran Konqueror on top of that to browse my images folder with thumbnails.

 
At 7:43 AM, Anonymous Emery Berger said...

The comment about fragmentation of the heap is a bit off-base. First, the Linux allocator manages large objects with mmap, which can free memory chunks no matter where they are in the heap. Second, other memory allocators (BSD's, Hoard, Vam) use mmap exclusively, so they can always free up empty pages. Third, garbage collection invariably requires more space, and always swaps far more than malloc. Why? Because it periodically has to touch EVERY PAGE - including those swapped to disk - while it looks for garbage. For more info about GC and swapping, read our "Garbage Collection without Paging" or "Automatic Heap Sizing" papers; for more about page-oriented allocators, read one of our papers about Hoard or Vam (all linked from my web page).

 
At 7:46 AM, Anonymous Anonymous said...

"such as it's process id"

There's no apostrophe when you use "its" as a possessive. When you think you need the apostrophe just say to yourself "it is" or "it has" as you read the sentence and see if the sentence makes sense. If it doesn't, then remove the apostrophe.

 
At 7:46 AM, Anonymous Anonymous said...

nice txt, when is somebody explaining in that way that is easy to understand, you see it has good knowledge.

 
At 7:47 AM, Anonymous Anonymous said...

I'm not an expert, but from what I heard about Singularity (the new MS research OS) they're trying to overcome the actual design that allows many processes to share memory spaces, with a great drop in performance (that shouldn't be so bad with 4GB of RAM or god-knows-how-many-of-it we will have soon) but an improvement in system security and stability.
So, do you all think this is the future (with big thanks to those microchip guys) or is it a dead end?

 
At 8:07 AM, Blogger Marcos Dumay de Medeiros said...

I don't understand where the problem is. The same code would not take exactly the same number of TLB entries if the libraries weren't shared? Why are shared libraries expensive?

Ok, it would be better if we could reuse the TLB entries too, but it seems that we can't. Also, TLB isn't that expensive (at least on 32bit processors); if an architecture has very hard constraints, choose another one.

And we don't want to run apache with several processes on embedded systems either.

 
At 8:08 AM, Blogger Jacob Mathai said...

Nice explanation.

 
At 8:19 AM, Anonymous Anonymous said...

Boy, some of you are just plain retarded. "Linux" is whatever people want it to be. KDE is designed to run on more modern desktop computer systems - and it's going to take advantage of that fact. Why limit yourself to what a damned 386 can do, when you have machines that are leaps and bounds more powerful?

Some people just don't understand that when normal CompUSA desktops now come with 512MB RAM on the short side these days, we should make use of this memory and get more features, use it for more speed, etc. It boggles my mind. You show me one part of KDE or Gnome that is excessively "bloated" and I'll get on board. Until then, STFU.

So, if you don't like the fact that new Linux-compatible software isn't catering to your 386DX-33 in your closet, tough. Those days are over. But, at the same time, they're not, and that's what is so great about things. You can still build a perfectly usable (albeit very minimal) system on a 386. Just don't expect KDE.

 
At 8:38 AM, Anonymous eyolf said...

Very informative. But the two apps which most often are charged of bloat: firefox and openoffice, don't fare so well in this respect: Openoffice: mapped: 200540K writeable/private: 80480K shared: 34628K
and Firefox: mapped: 128024K writeable/private: 95096K shared: 576K
I guess that's the price for cross-platform compatibility...

 
At 8:40 AM, Anonymous Anonymous said...

99% of KDE users are *NOT* x86. You completely obliterated the x86_64 segment, and that more than makes up for 1% on its own.

 
At 8:44 AM, Anonymous Anonymous said...

You show me one part of KDE or Gnome that is excessively "bloated" and I'll get on board. Until then, STFU

Well anonymous, KDE _used_ to run on a pentium with 32megs of RAM a few years back. It was a bit slow but it flew in 64 megs and that was my primary desktop. Now I can't even fsking install or start it in that amount of memory. Still not convinced you stupid ****? Bloat, bloat, bloat.

 
At 8:48 AM, Anonymous Anonymous said...

How about an article on how to disable commenting when the referrer is slashdot.org?

:-D

 
At 8:51 AM, Anonymous Stephen Kraus said...

Just for all you people strutting about with your x86 statements, realize this:
A LOT of corporations and major universities still run on Alpha and PowerPC class servers, and are quite happy with their choices. x86 is the most popular consumer CPU choice (mainly because of windoze) but I love my Alphas and G4/G5 systems. Just so you know, the Tricore PowerPC CPU in the Xbox 360 is also one of the most powerful CPUs to date. Not everything that isnt x86 compatible is slow and sluggish...

 
At 8:52 AM, Anonymous Anonymous said...

Eat that, Windblows.

And as for the single copy of shared libs in memory: bet you Windows doesn't do it.


a) Windows has done this for a very long time. Shared libraries are nothing new and, as was already said, pretty much any OS in even reasonably common usage today uses them
2) Run a non-embedded system (one including KDE and the like) on your 4M machine, then come back to us.
D) Your 'witty from 1995' use of 'Windblows' detracts from anything else you may say. Seeing that instantly drops your credibility to 'fanboi'. If you want to speak with us grownups, use grownup language (avoid M$, Winblows, and all those other fanboi 'witticisms').

 
At 8:58 AM, Anonymous John Frakes said...

What about using lsof for determining sizes? It will list libraries separate and you get a calculated size of the components using memory/VFS space.

 
At 9:03 AM, Anonymous Anonymous said...

Great article. Unfortunately, my current running copy of firefox-bin shows writeable/private: 216212K. There's my main beef, why the fsck does firefox need 200MB of ram to run, with 10 tabs open?

 
At 9:06 AM, Anonymous Anonymous said...

ok.

 
At 9:25 AM, Blogger tweekgeek said...

Sweet post. Never would have looked further into memory management had it not been for this. Appreciate it.

 
At 9:29 AM, Anonymous John Berthels said...

If you factor out all of the parts that are shared between processes, you end up with the "writeable/private" total, which is shown at the bottom of the output. This is what can be considered the incremental cost of this process, factoring out the shared libraries.

I wrote a tool called Exmap which attempts to address this. i.e. you don't have to factor out the shared usage.

It uses a loadable kernel module to work out which pages are actually shared between processes. This should accurately account for demand-paging, copy-on-write and all that good stuff.

If 3 processes all have a page mapped, that page accounts (PAGE_SIZE/3) to each process. It gives figures on various things, including RAM used, (RAM+swap) used, writable, etc.

It also allows you to break things down by proc/file/ELF section/ELF symbol.

It's a bit of a 'raw' tool, but should be usable by anyone who understands the issues you describe in your post :-)

 
At 9:43 AM, Anonymous Anonymous said...

Firefox needs lots of memory because it needs to cache in memory a rendered copy of every tab (or at least enough information to re-render it). Pictures are big, especially once they're decompressed; that 20K JPEG might be 2M once converted back to a 24-bit bitmap.

 
At 10:02 AM, Anonymous Anonymous said...

Shocking how few people in the comments, apparently programmers, knew about this.

As far as bloated code goes, that is just an evolutionary side-effect of the need for more and more features in the base libraries. Those new features could be put into new libraries; but then we would get complaints about too many libraries.

 
At 10:03 AM, Anonymous Anonymous said...

If this was an article about MS Windows on Slashdot it would read:
Windows lies about Memory Usage!!

 
At 10:25 AM, Anonymous Anonymous said...

About the comments:
99% of users do what? Wait a moment, I don't care. It's a slippery slope to say that 99% of users don't use some particular configuration. They used to say something similar about computer users in general - 99% don't use Linux.

About the article itself:
I'm basically now just even less likely to try to figure out what those (practically random) numbers in ps are. It seems what is really needed is a simplified ps (and I don't mean GUI of the same incomprehensible stuff).

 
At 10:38 AM, Anonymous Anonymous said...

So... if I write a program that will inject shellcode into the printf function (contained in libc), will all programs that use printf be affected?

 
At 11:10 AM, Anonymous Anonymous said...

an increase in features != bloat. Because modern DEs do more, they need more. If you want the functional equivalent to your 386 with kde1 then linux can still happily do it, you just won't be using KDE or GNOME

 
At 11:26 AM, Anonymous Anonymous said...

A nice article. I didn't know the pmap tool either (like reported by matt darby).

Thank you!

- Anonymous coward

 
At 11:27 AM, Anonymous Anonymous said...

First, many linux users run on non x86 operating systems. Thats the whole point to linux or even some of the bsds. You get a great system thats portable. I have two linux installs at home, one on my iBook G4 and one on my x86 desktop (dual xeon). Sadly linux is FASTER on the ibook. That even has gnome running on it. As for kde, i don't think its that bloated. It does a lot of caching just like firefox does if it has the room (ram). KDE is still smaller than gnome if you disable its eye candy. (and gnomes) Both are supposed to compete with windows as graphical environments. The price is cpu and memory. They are doing a windows 3.x here.. running a graphical environment on top of a command line system. There is overhead to doing that.

As for cpus... I have 1 sparc, 5 ppc chips, and 4 x86 chips (3 intel, 1 amd) in my home right now. My wife and I are both cs people and love diversity in platforms. Each system has its benefits and uses for certain tasks.

Linux is big on POWER because of IBM and i've met at lot of people who run linux on sparcs. (personally i prefer the speed of netbsd on the sparc) Enterprise linux is about 64 bit systems.. AMD64, Sparc and POWER.

 
At 11:30 AM, Anonymous Anonymous said...

Shocking how few people in the comments, apparently programmers, knew about this.

I am not shocked. Many so-called 'programmers' do not have a clue how a computer operates - much less nuances of various operating systems. Many learn a particular paradigm/programming framework and leave it at that.

I have always loved computers and programming and find it hard to understand anyone who enters this field just for the money without an interest in going beyond being 'just a programmer'. Being valuable to your company (and thus not becoming outsourced) goes hand-in-hand with constantly improving your understanding of computer science in general - and the specifics of your area of expertise in detail.

It is like the folks we see on 'American Idol' that swear up-and-down how great they sing - yet even a tin-ear can hear how out of tune they are when they do. Sometimes it is hard to be self critical - to really hear how badly you sing - but is needed in any high performance endeavor, be it music or developing and integrating computer software and hardware systems.

 
At 11:33 AM, Blogger Diabolic Preacher said...

thanks for writing about the pmap command. :)

 
At 12:01 PM, Anonymous Anonymous said...

Firefox needs lots of memory because it needs to cache in memory a rendered copy of every tab (or at least enough information to re-render it). Pictures are big, especially once they're decompressed; that 20K JPEG might be 2M once converted back to a 24-bit bitmap.

Another problem with Firefox on startup is that, using GNOME libraries, it reads an almost ridiculous amount of gnome xml config files, over 700 IIRC.

I can't always agree with the "features != bloat" argument. Who needs a flashing icon at the bottom right edge of the mouse pointer in KDE when starting apps, to give one example?

Well, anyway, here's a comparison of our most beloved editors (name private/mapped/number of libs)

joe 728K/2624K/24
vim 836K/3804K/37
mcedit 1292K/5884K/52
emacs 3912K/11472K/82
kate 1392K/24676K/298

And despite libs being shared (note that they also need -fpic / -fPIC during compilation, otherwise it's for nothing!), if you consider just one process running by itself, kate loads the most libraries (it also starts a handful of KDE services if they are not already started). Though not all of these might be immediately mapped into memory, once you use all of its 'features' it uses the most.

 
At 12:13 PM, Anonymous Anonymous said...

See also

"http://sunsite.bilkent.edu.tr/pub/linux/gnome/GDP/papers/whitepapers/MemoryUsage/MemoryUsage.ps"

for an explanation of memory usage by Miguel de Icaza of GNOME fame.

In short, the best way to look at memory is rss - share (as reported by top or /proc/$pid/statm; see man 5 proc).


Eli Criffield
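
A minimal sketch of the "rss - share" calculation in Python, assuming the statm field order documented in proc(5) (size, resident, shared, ..., all counted in pages). The function name and the sample numbers in the usage note are made up for illustration.

```python
import os


def rss_minus_shared_kb(statm_line: str, page_size: int) -> int:
    """Return (resident - shared) in kB from one line of /proc/<pid>/statm."""
    fields = statm_line.split()
    resident_pages = int(fields[1])  # second field: resident set size, in pages
    shared_pages = int(fields[2])    # third field: resident shared pages
    return (resident_pages - shared_pages) * page_size // 1024


if __name__ == "__main__":
    path = "/proc/self/statm"  # only present on Linux
    if os.path.exists(path):
        with open(path) as f:
            line = f.read()
        print(rss_minus_shared_kb(line, os.sysconf("SC_PAGE_SIZE")), "kB")
```

For example, a statm line of "6350 3613 2100 10 0 1500 0" with 4096-byte pages works out to (3613 - 2100) * 4 = 6052 kB.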

 
At 12:14 PM, Anonymous Anonymous said...

Anonymouse said:

Firefox needs lots of memory because it needs to cache in memory a rendered copy of every tab (or at least enough information to re-render it). Pictures are big, especially once they're decompressed: that 20K JPEG might be 2M once converted back to a 24-bit bitmap.

That's just dumb.

 
At 12:20 PM, Anonymous Anonymous said...

"They are doing a windows 3.x here.. running a graphical environment on top of a command line system. There is overhead to doing that."

Have you ever taken a look at this "overhead" you speak of? The overhead of the shell that is opened for your session is minimal, and very little else is running simply "because" of the GUI running on the command line. Apple was wrong to force people into a certain GUI, and M$ did it to follow Apple in the "usability" category.

BTW, you do realize that the "3.1 method" has been going on for over 20 years with CDE and its predecessors, don't you? It's not like Microsoft and Apple invented the GUI... they simply fooled people into thinking that it requires an OS upgrade to get a new GUI.

 
At 12:41 PM, Anonymous Anonymous said...

"Nice explanation, but Linux still suffers from Code Bloat. It used to run nicely on oldish systems, in fact that's how it caught on. Now you really need state of the art X86 hardware. I don't think this is a good idea - there is already a system optimized for those 99% of "other" users."

Maybe you should do a little searching on http://distrowatch.com/ for a kernel that will run smoother with your hardware... Software will continue to evolve with hardware (there’s no stopping that).

My advice would be to save your Fast Food salary and get a 1Ghz or better computer off of http://www.retrobox.com/rbwww/home/search_results_pc_computers.asp?bin_id=world for $200.00 or less.

You seriously need to stop and open your eyes… There are so many versions of Linux tailored to do so many jobs along with a plethora of documentation. If you can’t find a distribution out there to do what you need / want, why not take some initiative, do some research and give a little back to the community?

Complaints like yours at this point in the game show nothing less than lazy and willful ignorance. So join the battle and help solve the problem.

 
At 12:41 PM, Anonymous Anonymous said...

It may be dumb, but people complained about it being too slow flipping between pages. They can only get the renderer to work so fast.

 
At 12:54 PM, Anonymous Anonymous said...

"marginal" does not mean "actual" and "needless to say" does not mean "regardless". Learn to communicate before starting a blog.

 
At 1:12 PM, Blogger meowsqueak said...

Amusing - "marginal" can be an economic term which means "the difference between this and the next item", so its usage is perfectly valid.

Your other comment is just an irrelevant diversion.

 
At 1:15 PM, Anonymous Anonymous said...

TANSTAAFL.

Want the whizz-bang pretty effects that the most modern versions of GNOME and KDE provide? Get with some hardware that wasn't considered obsolete pre-Y2K.

Want to use hardware that may or may not be considered modern-day state-of-the-art? Find an older version of KDE, GNOME..., or fire up a browser and seek out one of the many other quite capable desktop managers out there.

In other words, folks that want something for nothing usually end up with neither.
//TB

 
At 2:00 PM, Anonymous Anonymous said...

Thanks anonymous for the mention of the SZ parameter. I've never noticed that before. Funny, now I am able to determine that firefox is tiny compared to amarok ;)

As for the original article, sure it's not new knowledge nor is it linux specific, but it is nice when people explain some of these more difficult to understand bits in a way that's accessible to people new to linux/unix systems.

 
At 2:00 PM, Anonymous Anonymous said...

such as it's process id --> such as its process id

 
At 2:21 PM, Anonymous Anonymous said...

Thanks for the food for thought. I'll think twice after I use ps to view the memory being crunched by my processes.

 
At 2:26 PM, Anonymous Anonymous said...

Linux isn't code bloated.
Linux is ONLY the kernel. As for KDE being bloated, I'd disagree: KDE packs many features a Desktop Environment should have.
I expect my DE to use my hardware. I wouldn't purchase a 3 GHz machine to run Windows 3.1; you'd want to run the latest stuff on it.
It's not KDE's fault you can't keep up to date. If you want to use old hardware, do so, but don't expect to run the latest version of KDE or GNOME.
Xfce is a suitable DE for older hardware, and you always have the choice of using a WM like Openbox.
To say a GNU/Linux-based desktop is bloated is crazy and clearly shows you know nothing about Linux.
Do you expect to run Windows XP on a 386?
NO! You don't....
So why expect the latest version of KDE or GNOME to work....
If you cannot find a distro to match your hardware then you ain't looking in the right places.....
Have a look at Damn Small Linux.... an up-to-date distro which is tiny. Now go find me an OS that can do the same...
Don't say a BSD, Solaris or Linux... find me a copy of Windows or Mac OS that will, as they are the only other OSes around.

 
At 2:48 PM, Anonymous Anonymous said...

Yes. It's bloated.

now fluxbox. mmm.. skinny.

 
At 3:23 PM, Blogger Anonymous said...

Ice Ice window manager... let the flame wars begin!

 
At 3:25 PM, Anonymous Anonymous said...

I was hoping to see some explanation of what RAM is used by the running processes and what is used for file system cache. From what I've seen, Linux (probably doing "the right thing") will use all available RAM for file system cache, and then it is hard to tell if you need to add RAM or not.

 
At 5:25 PM, Anonymous Anonymous said...

just use FreeBSD... :P

 
At 6:03 PM, Anonymous Anonymous said...

Ratpoison is even less bloated. Should work fine on that old 486 32MB machine you got there.
Also check out OpenBSD while you're at it. Very small footprint and very nice. Great documentation too, the best in fact.

 
At 6:22 PM, Anonymous Anonymous said...

"I was hoping to see some explanation of what RAM is used by the running processes and what is used for file system cache."

If you run the "System Monitor" kicker applet in KDE, it will give you a small bar graph of those two things (plus "buffers") as a percent of your total RAM. I find it very handy to monitor my in-use memory (and cpu and swap) at a glance.

 
At 10:28 PM, Blogger theworld said...

A very useful article put in a very simple way ! You mentioned a very good point, "use applications that use same libraries".

 
At 1:17 PM, Anonymous Mantar said...

I was hoping to see some explanation of what RAM is used by the running processes and what is used for file system cache.

Another way to do that is with the command "free" -- you'll want the second line, which sticks the buffers and cache in the available column.

 
At 6:59 PM, Anonymous Anonymous said...

In addition to "free", you can also just "cat /proc/meminfo". Of course, that level of detail is not for the faint of heart!
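
For anyone who wants the "used, not counting buffers and cache" figure straight from /proc/meminfo, here is a rough Python sketch. It assumes the MemTotal, MemFree, Buffers and Cached fields and subtracts the last three from the first, which is the same idea as the buffers/cache line of free discussed above. The function names are hypothetical and the sample numbers are invented, not taken from a real machine.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field name: integer value}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def used_excluding_cache_kb(info: dict) -> int:
    """Memory in use once free memory, buffers and page cache are subtracted."""
    return info["MemTotal"] - info["MemFree"] - info["Buffers"] - info["Cached"]


# Invented sample values, for illustration only.
SAMPLE = """\
MemTotal:       515000 kB
MemFree:         15000 kB
Buffers:         40000 kB
Cached:         260000 kB
"""

print(used_excluding_cache_kb(parse_meminfo(SAMPLE)), "kB")  # 200000 kB
```

On a real system you would pass in the contents of /proc/meminfo instead of SAMPLE. In this made-up case only 15000 kB is "free", yet roughly 200000 kB is actually committed to programs; the rest is cache the kernel can give back.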

And if even that isn't enough, take a look at /proc/slabinfo, where the amount of memory for each subsystem inside the kernel is recorded! It'll tell you how many inodes are allocated (and in use) for each filesystem type (but not for each specific file system mount point).

As an instructor who teaches Linux Internals, I think the article is a good overview and starting point for understanding Linux memory allocation in applications.

The comments concerning the Translation Lookaside Buffer (TLB) are a bit disingenuous. The TLB caches translations of 32-bit logical addresses into 32-bit (or 36-bit) physical addresses on x86 hardware, so if a vm_area in one process refers to the same pages as a vm_area in another process, the TLB entries can be shared; however, Linux doesn't currently try to track this and just clears the entire TLB on process context switches (well, at least on the 2.6.9 kernel that I last looked at to explore this topic). As for the use of -fPIC: all standard libraries that I know of are compiled using the "position-independent code" options.

One individual mentioned management of page tables; but page tables are always a constant size (based on the number of frames of memory available minus those frames used to hold the page tables themselves). At least, a constant size when the page is a given size (typically 4K on x86, although there is a compile-time option for 8K page sizes). I'm not familiar with the impact of HugeTLB pages, as I haven't studied them in detail and I wouldn't want to knowingly spread misinformation. (And again, most of this info goes back to 2.6.9, but some is more recent.)

I'm glad someone mentioned that Linux (and other modern Un*ces) use an mmaped scheme for heap space; most people don't know that. :) What's really weird is the sequence of address space ranges that are allocated to contain the heap! ;) The libc malloc() implementation jumps all over the place within the process's address space and maps /dev/zero in order to allocate heap space. And of course, allocating address space in a process doesn't require any physical memory to necessarily exist, so a "writable/private" of 20MB might have as little as 100KB of actually allocated virtual-to-physical memory pages. I'm not talking about paged out memory, but "ever allocated" memory.

Anyway, if you want more information, pick up a copy of Robert Love's book on Linux Kernel Development, or download the Gorman book, Understanding the Linux Virtual Memory Manager (the PDF is available elsewhere; google for it).

 
At 8:34 PM, Anonymous Anonymous said...

Someone posted: It's a slippery slope to say that 99% of users don't use some particular configuration. They used to say something similar about computer users in general: 99% don't use Linux. Well, that's a lie now, isn't it? Several years ago Linux passed the Mac in desktop use, at about 5%. So saying 99% don't use Linux is a lie by at least 4%, likely more. As for servers, I can argue more than 5% use Linux, by a factor of about 5x. For supercomputer use, I can wander over to top500.org and see the numbers broken down by OS: Linux runs on 371 of the 500 fastest machines in the world, including all of the top 5. Windows has 1 machine (ranked in the 310th spot). It's a machine sponsored by Microsoft so that they can say they have a machine on the list. Where did you get 99% from? Certainly not from those who know about computers and performance, not by a damned long shot!

 
At 1:16 PM, Anonymous Charles said...

Seriously, does anyone here know if there is an open source project that aims at tracking how much memory programs really use?

pmap seems a bit too weird to be user friendly for a non-programmer.

later

 
At 4:19 AM, Anonymous Anonymous said...

Very NICE! =)

Thanks for this explanation.

 
At 9:29 AM, Anonymous Anonymous said...

Nice description. I have the same issue with memory usage reported by Windows Task Manager.

 
At 3:28 PM, Anonymous anonymous100% said...

I came here trying to figure out why "ps aux" does not add to 100% for %MEM. I get a grand total of 4.2% which isn't remotely close to the 90% reported by free (used/total).
I get about 21% if I add the VSZ column.

Am I missing processes? How can I get a list that adds to 100%?

 
At 10:13 PM, Anonymous Anonymous said...

Excellent article, very useful.. thanks for posting. And such useful comments as well :-)

 
At 10:22 AM, Anonymous Anonymous said...

Does anyone know why pmap reports different values for the "total kB" field in different invocations of the same executable? I see this behavior on Linux but not on Solaris.

 
At 5:12 AM, Blogger Jon Dowland said...

Interesting article. I take issue however with your advice about sticking to apps from one desktop environment. For the vast majority of people, the increased memory usage for having the "redundant" libraries resident is not an issue. Far more important is the diversity of choosing the best app for the job, disregarding which camp it came from, and not getting entrenched in the anti-choice "one desktop to rule them all" mindset.

 
At 9:01 AM, Blogger Pádraig Brady said...

I've a script to accurately list programs' ram usage here:
http://www.pixelbeat.org/scripts/ps_mem.py

You can't assume all mem associated with a shared lib is shared. Also, pmap reports just virtual mem, large parts of which are likely to stay on disk forever for shared libs.

My script uses the more accurate /proc/.../smaps available in newer kernels to determine the mem used for a process. Compare and contrast:

$ pmap $$ | grep libc
00a13000 1168K r-x-- /lib/libc-2.3.5.so
00b37000 8K r-x-- /lib/libc-2.3.5.so
00b39000 8K rwx-- /lib/libc-2.3.5.so

$ grep libc -A6 /proc/$$/smaps
00a13000-00b37000 r-xp 00000000 08:06 513762 /lib/libc-2.3.5.so
Size: 1168 kB
Rss: 492 kB
Shared_Clean: 492 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 0 kB
00b37000-00b39000 r-xp 00124000 08:06 513762 /lib/libc-2.3.5.so
Size: 8 kB
Rss: 8 kB
Shared_Clean: 4 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 4 kB
00b39000-00b3b000 rwxp 00126000 08:06 513762 /lib/libc-2.3.5.so
Size: 8 kB
Rss: 8 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 8 kB
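
To make the comparison concrete, here is a short Python sketch that adds up the per-mapping fields from the smaps excerpt above. It is not the ps_mem.py script linked in this comment, just a hypothetical helper that sums every "Name: N kB" line; the sample text is copied from the excerpt.

```python
import re
from collections import Counter

# Matches lines such as "Private_Dirty: 8 kB"; mapping header lines do not match.
FIELD = re.compile(r"^(\w+):\s+(\d+) kB$")


def sum_smaps_fields(text: str) -> Counter:
    """Sum each kB-valued smaps field across all mappings in the text."""
    totals = Counter()
    for line in text.splitlines():
        match = FIELD.match(line.strip())
        if match:
            totals[match.group(1)] += int(match.group(2))
    return totals


SAMPLE = """\
00a13000-00b37000 r-xp 00000000 08:06 513762 /lib/libc-2.3.5.so
Size: 1168 kB
Rss: 492 kB
Shared_Clean: 492 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 0 kB
00b37000-00b39000 r-xp 00124000 08:06 513762 /lib/libc-2.3.5.so
Size: 8 kB
Rss: 8 kB
Shared_Clean: 4 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 4 kB
00b39000-00b3b000 rwxp 00126000 08:06 513762 /lib/libc-2.3.5.so
Size: 8 kB
Rss: 8 kB
Shared_Clean: 0 kB
Shared_Dirty: 0 kB
Private_Clean: 0 kB
Private_Dirty: 8 kB
"""

totals = sum_smaps_fields(SAMPLE)
private = totals["Private_Clean"] + totals["Private_Dirty"]
print(f"mapped {totals['Size']} kB, resident {totals['Rss']} kB, private {private} kB")
# mapped 1184 kB, resident 508 kB, private 12 kB
```

So of the 1184 kB that pmap shows for libc in this shell, only 508 kB is resident and just 12 kB is private to the process.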

 
At 5:32 AM, Anonymous Anonymous said...

Nice article. Better than the repeated ps man pages floating all over the web. Keep up the good work.

--SriRam

 
At 6:03 PM, Anonymous Anonymous said...

Hi!

Nice article Devin.
An anonymous above referenced this link:

http://sunsite.bilkent.edu.tr/pub/linux/gnome/GDP/papers/whitepapers/MemoryUsage/MemoryUsage.ps

I read it also, and I really recommend it to extend this article.

Have a nice day :)

 
At 7:58 AM, Anonymous Anonymous said...

Going back to the pmap output, you said:

One important thing to note about the output is that each shared library is listed twice; once for its code segment and once for its data segment. The code segments have a mode of "r-x--", while the data is set to "rw---".

I'm getting three lines for each library:

0000002a96a7d000 3008 r-x-- 0000000000000000 068:00002 libnnz10.so
0000002a96d6d000 1024 ----- 00000000002f0000 068:00002 libnnz10.so
0000002a96e6d000 704 rw--- 00000000002f0000 068:00002 libnnz10.so


The one not mentioned is the one with no permissions - what does that represent?

Also, while I'm posting, What is this line?

0000007fbffe4000 112 rwx-- 0000007fbffe4000 000:00000 [ stack ]

..should that be considered part of the executable, or is it the kernel stack and therefore shared?

Thanks.

 
At 2:29 AM, Blogger vasavi said...

Good description given on Linux Processes memory Usage.

So what would be the better way to check the memory usage of a process?

~Vasavi

 
At 1:24 PM, Anonymous annerose said...

These comments have been invaluable to me as is this whole site. I thank you for your comment.

 
At 5:32 AM, Anonymous beow said...

I use this command:

$ free | awk '/^\-\/\+/ {print $3 }'

before and after starting a program to get an idea of the memory usage of that program. It uses the "free" command, which displays used and free memory. The "awk" filter displays only the total used memory after subtracting cache/buffer memory. Would this give a good measure of the program's memory consumption?

 
At 5:43 AM, Anonymous Sai Narasimha Reddy said...

Thanq very much.

Nice explanation

 
At 10:41 AM, Anonymous Anonymous said...

Why not just distribute the size of a shared library between its users?
Example:
Program A:
4 -- Own
16 -- libA.so
66 -- libB.so

Program B:
6 -- Own
66 -- libB.so
12 -- libC.so

Conclusion:
Program A: 4+16+33 = 53
Program B: 6+33+12 = 51

~~~~ _Vi
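
The idea above can be sketched in a few lines of Python. This is only an illustration under the assumption that the two programs listed are the only users of these libraries; the function name and data layout are made up. Each library's size is divided by the number of programs that map it, and the result is added to the program's own memory.

```python
from collections import Counter


def proportional_cost(programs: dict) -> dict:
    """Charge each program its own memory plus an equal share of every library it uses."""
    users = Counter(lib for prog in programs.values() for lib in prog["libs"])
    return {
        name: prog["own"] + sum(size / users[lib] for lib, size in prog["libs"].items())
        for name, prog in programs.items()
    }


# The numbers from the example above (units as given there).
PROGRAMS = {
    "Program A": {"own": 4, "libs": {"libA.so": 16, "libB.so": 66}},
    "Program B": {"own": 6, "libs": {"libB.so": 66, "libC.so": 12}},
}

print(proportional_cost(PROGRAMS))  # {'Program A': 53.0, 'Program B': 51.0}
```

libA.so and libC.so each have a single user among the programs listed, so they are charged in full, while libB.so is split in half. A handy property of this scheme is that the per-program totals add up to the memory actually in use (here 53 + 51 = 4 + 6 + 16 + 66 + 12).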

 
At 7:39 AM, Anonymous Matrix said...

What do you mean?

 
At 10:23 PM, Blogger Sanjeev said...

Good article and reference.

 
At 3:22 PM, Anonymous Anonymous said...

I'm running a Java Program where I set the maximum heap size to be 2M. But then I saw the "writeable/ private" number is around 200 MB. This is weird?

So just to confirm: the "writeable/ private" is really the true cost of running a process in linux machine?

Thanks in advance for the help.

 
At 10:10 PM, Anonymous wayno said...

Linux is able to use a great trick: it will load a single copy of the shared libraries into memory and use that one copy for every program that references it.

Stolen from the RSX-11 play book.

Wayno

 
At 12:03 PM, Blogger uiqbal said...

I have written a test program which gets the total virtual memory and then keeps on allocating 100 MB blocks. The problem is that, according to the link given below, a user process can have up to 3 GB of VM, but I have only been able to allocate up to 2.73 GB, and my VM size is also close to that in /proc/pid/statm and /proc/pid/status.
What could be the problem?
Where is the remaining 0.3 GB?


The link for 3GB.
http://kerneltrap.org/node/2450/7217

 
At 11:47 PM, Blogger idea stack said...

It is a nice article to know about the memory usage of linux. Thanks for your valuable information.

 
At 11:49 PM, Blogger idea stack said...

I have really got amazed of this information about the memory usage on Linux. It is useful for everyone.

 
At 5:54 AM, Blogger jenny said...

There is a great explanation for everyone to understand the memory usage on Linux. I hope this will really make a huge impact on everyone.Thanks for sharing this information.

 
At 2:18 AM, Blogger sumant said...

Linux comes with different commands to check memory usage. The free command displays the total amount of free and used physical and swap memory in the system, as well as the buffers used by the kernel.

The vmstat command reports information about processes, memory, paging, block IO, traps, and CPU activity.

Then you can use the top command, which provides a dynamic real-time view of a running system. It can display system summary information as well as a list of tasks currently being managed by the Linux kernel.




 
At 4:07 PM, Blogger Aleksandr Levchuk said...

Let me guess: the OOM killer is fooled by this too.

Please correct me...

 
