Friday, 31 August 2012

MEMORY LEAKAGE IN JAVA


Topics covered in this document:
• What is memory leakage?
• How does it occur?
• Finding the cause of a memory leak

What is Memory Leakage?
A memory leak, in computer science (or leakage, in this context), occurs when a computer program consumes memory but is unable to release it back to the operating system. In object-oriented programming, a memory leak happens when an object is stored in memory but cannot be accessed by the running code. A memory leak has symptoms similar to a number of other problems and generally can only be diagnosed by a programmer with access to the program source code; however, many people refer to any unwanted increase in memory usage as a memory leak, though this is not strictly accurate from a technical perspective.
Because they can exhaust available system memory as an application runs, memory leaks are often the cause of or a contributing factor to software aging.
A memory leak is the gradual loss of available computer memory when a program (an application or part of the operating system) repeatedly fails to return memory that it has obtained for temporary use. As a result, the available memory for that application or that part of the operating system becomes exhausted and the program can no longer function. For a program that is frequently opened or called or that runs continuously, even a very small memory leak can eventually cause the program or the system to terminate. A memory leak is the result of a program bug.
Some operating systems provide memory leak detection so that a problem can be detected before an application or the operating system crashes. Some program development tools also provide automatic "housekeeping" for the developer. It is always the best programming practice to return memory and any temporary file to the operating system after the program no longer needs it.
When a system does not correctly manage its memory allocations, it is said to leak memory. A memory leak is a bug. Symptoms can include reduced performance and failure.

How does Memory Leakage occur in Java?
Most programmers know that one of the beauties of using a programming language such as Java is that they no longer have to worry about allocating and freeing memory. You simply create objects, and Java takes care of removing them when they are no longer needed by the application through a mechanism known as garbage collection. This process means that Java has solved one of the nasty problems that plague other programming languages -- the dreaded memory leak. Or has it?
But before digging into the causes of memory leakage in Java, let's see how the garbage collector works.
Working of the Garbage Collector
Let's first understand what the garbage collector actually is.
A few important points about garbage collection in Java:
1) Objects are created on the heap in Java irrespective of their scope, e.g. whether they are referenced by a local or a member variable. It is worth noting that class variables (static members) are created in the method area of the Java memory space, and that both the heap and the method area are shared between threads.
2) Garbage collection is a mechanism provided by the Java Virtual Machine to reclaim heap space from objects that are eligible for garbage collection.
3) Garbage collection relieves the Java programmer of manual memory management, which is an essential part of C++ programming, and leaves more time to focus on business logic.
4) Garbage collection in Java is carried out by a daemon thread called the Garbage Collector.
5) Before removing an object from memory, the garbage collection thread invokes the finalize() method of that object, giving it an opportunity to perform any cleanup required. The JVM does not guarantee when finalize() runs, or that it runs at all before the program exits.
6) You as a Java programmer cannot force garbage collection in Java; it is triggered only when the JVM decides it needs one, based on the Java heap size.
7) Methods such as System.gc() and Runtime.gc() send a request for garbage collection to the JVM, but it is not guaranteed that garbage collection will happen.
8) If there is no memory left in the heap for creating a new object, the Java Virtual Machine throws java.lang.OutOfMemoryError: Java heap space.
9) J2SE 5 (Java 2 Standard Edition) adds a new feature called ergonomics. The goal of ergonomics is to provide good performance from the JVM with a minimum of command line tuning.

When does an Object become Eligible for Garbage Collection?

An object becomes eligible for garbage collection (GC) if it is not reachable from any live thread or any static reference; in other words, when no reference that the running program can still follow points to it. Cyclic dependencies do not count as references: if object A has a reference to object B, object B has a reference to object A, and neither has any other live reference, then both A and B are eligible for garbage collection.
Generally an object becomes eligible for garbage collection in Java in the following cases:
1) All references to that object are explicitly set to null, e.g. object = null.
2) The object is created inside a block and its reference goes out of scope once control exits that block.
3) The parent object is set to null: if an object holds a reference to another object and you set the container object's reference to null, the child (contained) object automatically becomes eligible for garbage collection, provided nothing else references it.
4) If the only remaining references to an object are weak ones, for example as a key in a WeakHashMap, it is eligible for garbage collection.
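
The reachability rule for cyclic references can be observed with a small experiment. The following is a minimal sketch, not a definitive test: the class and method names are made up for illustration, and because System.gc() is only a request, the second method may legitimately report that the cycle has not been collected yet.

```java
import java.lang.ref.WeakReference;

class CycleReachabilityDemo {
    /** A node that can point at another node, so two nodes can form a cycle. */
    static final class Node {
        Node other;
    }

    /** B stays alive as long as it can still be reached through A. */
    static boolean cycleMemberStaysWhileReachable() {
        Node a = new Node();
        Node b = new Node();
        a.other = b;
        b.other = a;
        WeakReference<Node> weakB = new WeakReference<Node>(b);
        b = null;      // the local reference is gone, but A still points at B
        System.gc();   // only a request; the JVM may ignore it
        return weakB.get() == a.other;
    }

    /** Once neither A nor B is reachable from outside, the whole cycle becomes eligible. */
    static boolean cycleCollectedAfterRelease(int maxAttempts) {
        Node a = new Node();
        Node b = new Node();
        a.other = b;
        b.other = a;
        WeakReference<Node> weakA = new WeakReference<Node>(a);
        a = null;
        b = null;      // A and B now reference only each other
        for (int i = 0; i < maxAttempts && weakA.get() != null; i++) {
            System.gc();
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return weakA.get() == null;
    }

    public static void main(String[] args) {
        System.out.println("B alive while reachable via A: " + cycleMemberStaysWhileReachable());
        System.out.println("cycle collected after release: " + cycleCollectedAfterRelease(10));
    }
}
```

The WeakReference is used here only as a probe: it lets the program ask whether an object still exists without keeping it alive.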

[Note: One distinction to be clear on is the difference between a WeakReference and a SoftReference.
Basically, an object held only by a WeakReference will be garbage collected by the JVM eagerly, once it has no hard references to it. A softly referenced object, on the other hand, will tend to be left alone by the garbage collector until it really needs to reclaim the memory.
A cache where the values are held inside WeakReferences would be pretty useless (in a WeakHashMap, it is the keys that are weakly referenced). SoftReferences are useful for wrapping the values when you want to implement a cache that can grow and shrink with the available memory.]
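
A cache of the kind described in the note could look roughly like this. It is a minimal sketch under stated assumptions: SoftCache is a hypothetical name, the class is not thread-safe, and it makes no attempt to bound the number of keys.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

class SoftCache<K, V> {
    // Keys are held strongly; only the values are wrapped in SoftReferences.
    private final Map<K, SoftReference<V>> entries = new HashMap<K, SoftReference<V>>();

    void put(K key, V value) {
        entries.put(key, new SoftReference<V>(value));
    }

    /** Returns the cached value, or null if it was never cached or the GC has reclaimed it. */
    V get(K key) {
        SoftReference<V> ref = entries.get(key);
        if (ref == null) {
            return null;
        }
        V value = ref.get();
        if (value == null) {
            entries.remove(key); // the value was cleared under memory pressure; drop the stale entry
        }
        return value;
    }

    int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        SoftCache<String, byte[]> cache = new SoftCache<String, byte[]>();
        cache.put("report", new byte[1024 * 1024]);
        byte[] cached = cache.get("report");
        System.out.println(cached == null ? "reclaimed, reload it" : "served from cache");
    }
}
```

Callers have to treat a null result as a cache miss and reload the value, since the garbage collector is free to clear soft references whenever memory runs low.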

Types of Garbage Collector in Java

The Java runtime (J2SE 5) provides several garbage collectors, which you can choose from based on your application's performance requirements. Java 5 adds three garbage collectors in addition to the serial garbage collector. Each is a generational garbage collector that has been implemented to increase the throughput of the application or to reduce garbage collection pause times.

1) Throughput Garbage Collector: This garbage collector uses a parallel version of the young generation collector. It is used if the -XX:+UseParallelGC option is passed to the JVM on the command line. The tenured generation collector is the same as in the serial collector.

2) Concurrent low pause Collector: This collector is used if -Xincgc or -XX:+UseConcMarkSweepGC is passed on the command line. It is also referred to as the Concurrent Mark Sweep garbage collector. The concurrent collector is used to collect the tenured generation and does most of the collection concurrently with the execution of the application. The application is paused for short periods during the collection. A parallel version of the young generation copying collector is used with the concurrent collector. The Concurrent Mark Sweep garbage collector is the most widely used garbage collector in Java; when a collection is triggered, it first marks the objects that need to be collected.

3) The Incremental (sometimes called train) low pause collector: This collector is used only if -XX:+UseTrainGC is passed on the command line. It has not changed since Java 1.4.2 and is currently not under active development. It will not be supported in future releases, so using it is not recommended; please see the 1.4.2 GC Tuning document for information on this collector.
An important point to note is that -XX:+UseParallelGC should not be used with -XX:+UseConcMarkSweepGC. The argument parsing in the J2SE platform starting with version 1.4.2 should only allow legal combinations of command line options for garbage collectors, but earlier releases may not detect all illegal combinations, and the results of illegal combinations are unpredictable.
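
To check which collectors a given set of command line options actually selected, the running JVM can be asked through the standard java.lang.management API. This is a small sketch; the class name ActiveCollectors is illustrative, and the collector names it prints differ between JVM vendors and versions.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class ActiveCollectors {
    /** Names of the collectors the running JVM reports, e.g. one for the young and one for the old generation. */
    static List<String> names() {
        List<String> names = new ArrayList<String>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and time are cumulative since JVM start; -1 means the value is undefined.
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
    }
}
```

Running it once with and once without an option such as -XX:+UseParallelGC shows whether the option took effect on that JVM.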

Full GC and Concurrent Garbage Collection in Java

The concurrent garbage collector in Java uses a single garbage collector thread that runs concurrently with the application threads, with the goal of completing the collection of the tenured generation before it becomes full. In normal operation, the concurrent garbage collector is able to do most of its work with the application threads still running, so only brief pauses are seen by the application threads. As a fallback, if the concurrent garbage collector is unable to finish before the tenured generation fills up, the application is paused and the collection is completed with all the application threads stopped. Such collections with the application stopped are referred to as full garbage collections or full GC, and are a sign that some adjustments need to be made to the concurrent collection parameters. Always try to avoid or minimize full GC because it affects the performance of a Java application. When you work in the finance domain on electronic trading platforms and high-volume, low-latency systems, the performance of the Java application becomes extremely critical, and you will definitely want to avoid a full GC during the trading period.

Summary on Garbage collection in Java

1) The Java heap is divided into three generations for the sake of garbage collection: the young generation, the tenured (old) generation, and the Perm area.
2) New objects are created in the young generation and subsequently moved to the old generation.
3) The String pool is created in the Perm area of the heap. Garbage collection can occur in the Perm space, but this varies from JVM to JVM.
4) Minor garbage collections move objects from the Eden space to the Survivor 1 and Survivor 2 spaces, and objects that survive long enough are promoted from the young to the tenured generation. Major collections clean up the tenured generation.
5) Whenever a major garbage collection occurs, application threads stop for that period, which reduces the application's performance and throughput.
6) A few performance improvements were made to garbage collection in Java 6; we usually use JRE 1.6.20 for running our application.
7) The JVM command line options -Xms and -Xmx are used to set the starting and maximum size of the Java heap. Based on my experience, the ideal ratio of -Xms to -Xmx is either 1:1 or 1:1.5; for example, you can set both -Xms and -Xmx to 1 GB, or -Xms to 1.2 GB and -Xmx to 1.8 GB.
8) There is no manual way of doing garbage collection in Java.
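
The generations and heap limits summarized above can be inspected from inside a running program. The sketch below uses the standard java.lang.management API; HeapLayout is an illustrative name, and the pool names printed depend on the collector and JVM version (for example, a Perm pool appears only on older HotSpot JVMs, since Java 8 replaced it with Metaspace).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

class HeapLayout {
    /** Number of memory pools the JVM classifies as heap (typically Eden, Survivor and old generation). */
    static int heapPoolCount() {
        int count = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            long usedKb = usage == null ? -1 : usage.getUsed() / 1024;
            System.out.println(pool.getType() + "  " + pool.getName() + "  used=" + usedKb + " KB");
        }
        Runtime runtime = Runtime.getRuntime();
        long mb = 1024 * 1024;
        // totalMemory() starts near -Xms and can grow up to maxMemory(), which reflects -Xmx.
        System.out.println("current heap size: " + runtime.totalMemory() / mb + " MB");
        System.out.println("maximum heap size: " + runtime.maxMemory() / mb + " MB");
    }
}
```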

Determining whether an application has memory leaks
Here is a small HOWTO regarding the tools that help find memory leaks. 

Note: Use the latest JDK 6, because it has the latest tools, with 
lots of bugfixes and improvements. All the later examples assume that 
JDK6's bin directory is in the PATH. 

Step 1. Start the application. 

Start the application as you usually do: 

java -jar java_app.jar 

Alternatively, you could start java with hprof agent. Java will run 
slower, but the huge benefit of this approach is that the stack traces 
for created objects will be available which improves memory leak 
analysis greatly: 

java \
-Dcom.sun.management.jmxremote \
-Dcom.sun.management.jmxremote.port=9000 \
-Dcom.sun.management.jmxremote.authenticate=false \
-Dcom.sun.management.jmxremote.ssl=false \
-agentlib:hprof=heap=dump,file=/tmp/hprof.bin,format=b,depth=10 \
-jar java_app.jar
 

When the application is up, perform various actions that you think might lead 
to memory leaks. 

You might use jconsole from JDK 6 to see the memory consumption graph 
to have a clue whether memory leak is present or not: 

jconsole 

It will present a dialog with a list of java apps to connect to. Find 
the one with java_app.jar and connect. 

For example, if you open some documents in your app, the memory 
graph could rapidly go up. If closing the docs and invocation of 
full garbage collection did not bring the memory back to normal level, 
there is probably a leak somewhere. 

JConsole lets you invoke a full GC and provides a button just for that. 

Step 2. Find the application pid. 

Find out the application's process id via: 

jps 

It will print something like: 

15976 java_app.jar 
7586 startup.jar 
22476 Jps 
12248 Main 
5437 Bootstrap 

In our case the pid is 15976. 

Step 3. Dump the heap into file. 

Dump heap into the file: 

jmap -dump:format=b,file=/tmp/java_app-heap.bin 15976 

We just told jmap to dump the heap into the /tmp/java_app-heap.bin file, 
in binary form (which is optimized to work with large heaps). The 
third parameter is the pid we found in Step 2. 

Alternatively, if you started java with hprof agent, you could just 
use Ctrl-\ on Solaris/Linux or Ctrl-Break on Windows to dump heap into 
the file, specified in hprof agent arguments. 
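
A heap dump can also be triggered from inside the application, which is handy when attaching jmap is not an option. The following is a sketch that assumes a HotSpot-based JVM running Java 8 or later; HeapDumper is a hypothetical helper name, while HotSpotDiagnosticMXBean.dumpHeap is the JDK's own management call. Recent JDKs require the file name to end in .hprof, and the call fails if the file already exists.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;

class HeapDumper {
    /**
     * Writes a binary heap dump to the given path.
     * With liveObjectsOnly set, only objects that are still reachable are dumped.
     */
    static File dumpHeap(String path, boolean liveObjectsOnly) {
        HotSpotDiagnosticMXBean diagnostics =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            diagnostics.dumpHeap(path, liveObjectsOnly);
        } catch (IOException e) {
            throw new UncheckedIOException("heap dump to " + path + " failed", e);
        }
        return new File(path);
    }

    public static void main(String[] args) {
        // The default file name is only an example; pass a different path as the first argument.
        String path = args.length > 0 ? args[0] : "java_app-heap.hprof";
        File dump = dumpHeap(path, true);
        System.out.println("Wrote " + dump.length() + " bytes to " + dump.getAbsolutePath());
    }
}
```

The resulting file can be opened with the same heap analysis tools as a dump produced by jmap.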

Step 4. Visualize the heap. 

Use jhat tool to visualize the heap: 

jhat -J-Xmx326m /tmp/java_app-heap.bin 

Jhat will parse the heap dump and start a web server at port 7000 

Connect to Jhat server. 

Point your browser to: 

http://localhost:7000 

And start investigating. :) 

Jhat allows you to see what objects are present in the heap, which 
objects hold references to them, etc. 

Here are some tips: 

* Investigate _instances_, not _classes_ 

* Use the following URL to see the instances: 
http://localhost:7000/showInstanceCounts/ 
• The other solution we have is to use a memory profiler that tracks allocations. Take a look at JProfiler: its "heap walker" feature is great, and it integrates with all of the major Java IDEs. It's not free, but it isn't that expensive either ($499 for a single license); you will burn $500 worth of time pretty quickly struggling to find a leak with less sophisticated tools.

• Look for objects and variables which are no longer needed, and release the references to them. Make a habit of closing open file handles explicitly (for example in a finally block) rather than relying on the finalize method, which is not guaranteed to run promptly. In multi-threaded code, also look for deadlock scenarios.
• There's a new tool called Plumbr that attempts to do the whole job for you. You just need to attach it to your JVM and when it finds a leak, it lists the leaking objects, where they are created (the exact line in source code) and where the leaked objects are held.

While it is designed to help you prevent Java memory leaks entirely, you can also use it to pinpoint an existing leak - provided that you know how to reproduce it.
Another way to determine if my program is leaking memory
Not every OutOfMemoryError alert indicates that a program is suffering from a memory leak. Some programs simply need more memory to run. In other words, some OutOfMemoryError alerts are caused by the load, not by the passage of time, and as a result they indicate the need for more memory in a program rather than a memory leak.

To distinguish between a memory leak and an application that simply needs more memory, we need to look at the "peak load" concept. When a program has just started, no users have used it yet, and as a result it typically needs much less memory than when thousands of users are interacting with it. Thus, measuring memory usage immediately after a program starts is not the best way to gauge how much memory it needs! To measure how much memory an application needs, memory size measurements should be taken at the time of peak load, when it is most heavily used.

The graph below shows the memory usage in a healthy Java application that does not suffer from memory leaks, with the peak load occurring around 10 AM and application usage drastically decreasing at 5 PM. Naturally, the peak load on business applications often correlates with normal business hours.
[Figure memory_leaks_figure7: memory usage over one day for a healthy application, peaking at around 900MB near 10 AM]

The application illustrated by the chart above reaches its peak load around 10 AM and needs around 900MB of memory to run. This is normal behavior for an application suffering from no memory leaks; the difference in memory requirements throughout the day is caused solely by the user load.

Now, let's suppose that we have a memory leak in the application. The primary characteristic of memory leaks is that memory requirements increase as a function of time, not as a function of the load. Let's see how the application would look after running for a few days with a memory leak and the same peak user loads reached around 10 AM every day:

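[Figure memory_leaks_figure8: memory usage over several days for an application with a memory leak of about 100MB per day]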

Because peak loads on the system are similar every morning but memory usage is growing over a period of a few days, this picture indicates a strong possibility of memory leaks. If the program eventually started suffering from OutOfMemory exceptions, it would be a very strong indication that there's a problem with memory leaks. The picture above shows a memory leak of about 100MB per day.

Note that the key to this example is that the only thing changing is the amount of time the system is up—the system peak load doesn't change over time. This is not the case for all businesses. For example, the peak load for a tax preparation service is seasonal, as there are likely more users on the system in April than July.

There is one special case that should be noted here: a program that needs to be restarted periodically in order to prevent it from crashing with an OutOfMemoryError alert. Imagine that on the previous graph the max memory size was 1100MB. If the program started with about 900MB of memory used, it would take about 48 hours to crash because it leaks about 100MB of memory per day. Similarly, if the max memory size was set to 1000MB, the program would crash every 24 hours. However, if the program was regularly restarted more often than this interval, it would appear that all is fine.

Regularly scheduled restarts may appear to help, but also might make "upward sloping memory use" (as shown in the previous graph) more difficult to notice because the graph is cut short before the pattern emerges. In a case like this, you'll need to look more carefully at the memory usage, or try to increase the available memory so that it's easier to see the pattern.
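
The restart arithmetic above is easy to reproduce. The sketch below is illustrative only: LeakBudget and its method names are made up, and the daily peak values of 900, 1000 and 1100 MB are hypothetical readings chosen to match the 100MB-per-day leak in the example.

```java
class LeakBudget {
    /** Average growth between the first and the last daily peak, in MB per day. */
    static double leakMbPerDay(double[] dailyPeaksMb) {
        if (dailyPeaksMb.length < 2) {
            throw new IllegalArgumentException("need at least two daily peaks");
        }
        double growth = dailyPeaksMb[dailyPeaksMb.length - 1] - dailyPeaksMb[0];
        return growth / (dailyPeaksMb.length - 1);
    }

    /** Hours until the heap limit is reached, given the usage at start and a steady leak rate. */
    static double hoursUntilOutOfMemory(double maxHeapMb, double usedAtStartMb, double leakMbPerDay) {
        if (leakMbPerDay <= 0) {
            return Double.POSITIVE_INFINITY; // no leak, so the limit is never reached this way
        }
        double headroomMb = Math.max(0.0, maxHeapMb - usedAtStartMb);
        return headroomMb / leakMbPerDay * 24.0;
    }

    public static void main(String[] args) {
        double leak = leakMbPerDay(new double[] {900, 1000, 1100}); // hypothetical peak readings
        System.out.println("leak per day (MB): " + leak);
        System.out.println("hours to crash with 1100MB max: " + hoursUntilOutOfMemory(1100, 900, leak));
        System.out.println("hours to crash with 1000MB max: " + hoursUntilOutOfMemory(1000, 900, leak));
    }
}
```

If the scheduled restart interval is shorter than the computed number of hours, the leak stays hidden, which is exactly the special case described above.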
Monitoring the Java Process:
The following approach works for any Java process, including standalone clients as well as application servers like JBoss and servlet containers like Tomcat. It is based on starting the Java process with JMX monitoring enabled and attaching with the JMX monitoring tools. We'll use Tomcat in the following example.

To start Tomcat or the Java process with JMX monitoring enabled, use the following options when starting JVM:
    • -Dcom.sun.management.jmxremote — enables JMX monitoring
    • -Dcom.sun.management.jmxremote.port=<port> — controls the port for JMX monitoring

      Note that if you're on a production system, you'll most likely want to secure your JVM before running it with these parameters. For that, you can specify these additional options:
    • -Dcom.sun.management.jmxremote.ssl
    • -Dcom.sun.management.jmxremote.authenticate
Once started, you can use JConsole or VisualVM to attach to the process. Note that later JDK 6 versions include VisualVM.

This is an example of the JConsole monitoring Tomcat. As shown in the example below, click on the Memory tab to get memory information:

[Figure memory_leaks_figure9: JConsole Memory tab while monitoring Tomcat]

Again, we're interested in the long-term trends of heap memory usage, not trends from just a few minutes of running.
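
The heap figures that JConsole plots come from the JVM's memory MXBean, so the same numbers can be sampled in code and logged over a longer period. This is a minimal sketch; HeapTrendSampler is a hypothetical name, and a real setup would write the samples to a log or a monitoring system instead of printing them.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.util.Arrays;

class HeapTrendSampler {
    /** Takes the requested number of used-heap readings (in bytes), pausing between them. */
    static long[] sampleUsedHeap(int samples, long intervalMillis) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        long[] usedBytes = new long[samples];
        for (int i = 0; i < samples; i++) {
            usedBytes[i] = memory.getHeapMemoryUsage().getUsed();
            if (i < samples - 1) {
                try {
                    Thread.sleep(intervalMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return Arrays.copyOf(usedBytes, i + 1); // keep the readings taken so far
                }
            }
        }
        return usedBytes;
    }

    public static void main(String[] args) {
        for (long used : sampleUsedHeap(5, 1000)) {
            System.out.println("used heap: " + used / (1024 * 1024) + " MB");
        }
    }
}
```

Individual readings jump around as garbage collections run, so only the trend across many samples taken at comparable load says anything about a leak.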

Finally, note that monitoring tools like Hyperic can typically show historical trends over longer periods of time than VisualVM or JConsole. In addition, tools like Hyperic allow for more fine-grained control over operations that are permitted by users. As a result, we recommend using monitoring tools for production systems use, while all the other tools discussed in this section are more appropriate for developers or "first aid" in the absence of the real monitoring tools.

A memory leak example
Example 1:

[Figure memory_leaks_figure2: object A holding references that keep objects B, C, D, E and F reachable]

Now, suppose that the program holds reference to object A for a prolonged period of time. As a result, objects B, C, D, E, and F are all ineligible for garbage collection, and we have the following amount of memory leaking:
    • 100 bytes for object A
    • 500 bytes for objects B, C, D, E and F that are retained due to the retention of object A
So, holding reference to object A causes a memory leak of 600 bytes. The shallow heap of object A is 100 bytes (object A itself), and the retained heap of object A is 600 bytes.
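
The shallow and retained sizes in this example can be reproduced with a toy object graph. The sketch below makes several assumptions: the exact reference structure of the missing figure is unknown, so the graph built here is hypothetical, each of B to F is taken to be 100 bytes, and nothing outside the graph references B to F (only then does "everything reachable from A" equal the retained heap of A).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class RetainedHeapSketch {
    /** A stand-in for a heap object with a known shallow size and outgoing references. */
    static final class Obj {
        final String name;
        final long shallowBytes;
        final List<Obj> references = new ArrayList<Obj>();

        Obj(String name, long shallowBytes) {
            this.name = name;
            this.shallowBytes = shallowBytes;
        }
    }

    /** Sums the shallow sizes of the root and everything reachable from it, counting each object once. */
    static long reachableBytes(Obj root) {
        Set<Obj> seen = new HashSet<Obj>();
        Deque<Obj> pending = new ArrayDeque<Obj>();
        pending.push(root);
        long total = 0;
        while (!pending.isEmpty()) {
            Obj current = pending.pop();
            if (!seen.add(current)) {
                continue; // already counted
            }
            total += current.shallowBytes;
            for (Obj next : current.references) {
                pending.push(next);
            }
        }
        return total;
    }

    /** Hypothetical layout: A -> B, C; B -> D, E; C -> F; every object is 100 bytes. */
    static Obj exampleGraph() {
        Obj a = new Obj("A", 100);
        Obj b = new Obj("B", 100);
        Obj c = new Obj("C", 100);
        b.references.add(new Obj("D", 100));
        b.references.add(new Obj("E", 100));
        c.references.add(new Obj("F", 100));
        a.references.add(b);
        a.references.add(c);
        return a;
    }

    public static void main(String[] args) {
        Obj a = exampleGraph();
        System.out.println("shallow heap of A: " + a.shallowBytes + " bytes");
        System.out.println("retained heap of A: " + reachableBytes(a) + " bytes");
    }
}
```

Heap analysis tools compute the retained size on the real object graph in a similar spirit, except that they exclude objects which are also reachable through some other path.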
Example 2:
[Figure memory_leaks_figure6: source listing of the MemoryLeakDemo program]

When no more memory remains, an OutOfMemoryError will be thrown, producing output like this:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at MemoryLeakDemo.main(MemoryLeakDemo.java:14)
In the example above, we continue adding new elements to the list memoryLeakArea without ever removing them. In addition, we keep references to the memoryLeakArea, thereby preventing GC from collecting the list itself. So although there is GC available, it cannot help because we are still using memory. The more time passes the more memory we use, which in effect requires an infinite amount of memory for this program to continue running.
This is an example of unbounded memory leak—the longer the program runs, the more memory it takes. So even if the memory size is increased, the application will still run out of memory at a later date.
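
The original listing for this example was an image. The following is a reconstruction based only on the description above, not the original code: the class name MemoryLeakDemo and the list name memoryLeakArea come from the text, while the leak helper, the element type and the chunk size are assumptions, so the line number in the stack trace will not match.

```java
import java.util.ArrayList;
import java.util.List;

class MemoryLeakDemo {
    // A static field stays reachable for as long as the class is loaded,
    // so the garbage collector can never reclaim the list or anything in it.
    static final List<byte[]> memoryLeakArea = new ArrayList<byte[]>();

    /** Adds the given number of chunks to the list and never removes them. */
    static void leak(int chunks, int chunkBytes) {
        for (int i = 0; i < chunks; i++) {
            memoryLeakArea.add(new byte[chunkBytes]);
        }
    }

    public static void main(String[] args) {
        // Runs until the heap is exhausted and java.lang.OutOfMemoryError is thrown.
        while (true) {
            leak(1, 1024 * 1024);
        }
    }
}
```

Running main with a small heap, for example with -Xmx64m, makes the error appear within moments; raising the limit only postpones it.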
Applying a fix

There are a few tools for dealing with heap dumps that get the job done. Better tools often provide a few extra helpful features:

    • Present a better summary of heap statistics.
    • Sort objects by retained heap. In other words, some tools can tell you the memory usage of an object and all other objects that are referenced by it, as well as list the objects referenced by other objects. This makes it much faster to diagnose the cause of a memory leak.
    • Function on machines that have less memory than the size of the heap dump. For example, they'll allow you to analyze a 16GB heap dump from your server on a machine with only 1GB of physical memory.
• There are several free tools that are useful for analyzing heap dumps in Java. One that's widely used and is even included with the later versions of JDK 6 is VisualVM. VisualVM is a nice tool that gives you just enough to resolve memory leaks, and it shows heap dumps and relations between objects in graphical form.
• Feature-wise, one step above VisualVM is the Eclipse Memory Analyzer Tool (MAT), a free tool that includes a lot of additional options. Although it's still in incubation phase as of publication of this article, MAT is free and we've found it to be extremely useful.
• Commercial products like JProfiler, YourKit, and JProbe are also excellent tools for debugging memory leaks. These applications include a few options that go above and beyond VisualVM and MAT, but they're certainly not necessary to successfully debug memory leaks. Unless you already have a license for one of these commercial tools, we recommend trying MAT first.
Preventing memory leaks



Practical problems people face with memory leaks

Why do memory leaks sometimes take a long time to fix if the process is this simple? In our opinion, the main reasons are: 
    • There isn't a sufficient amount of easily available information about fixing Java memory leaks. We hope this article will help improve the situation.
    • It can be difficult to reliably reproduce an issue, and a lot of time is typically required to reproduce an issue before you can really start addressing it.
    • People lack the right tools for the job—in particular, memory profilers. With many free profilers available and reasonably low prices for commercial profilers, there's no reason you shouldn’t have good tools in your toolbox.
    • There are a few practical issues with the use of tools and techniques that might be perceived as road blocks the first time someone tries them. We'll address some of these issues below.
Fortunately, the practical issues most commonly encountered aren't very difficult to solve. The most common problems are:
    • You can't load the snapshot because you don't have enough memory in your development box. For example, this can easily happen if you have 64-bit servers with 8GB of memory allocated to the JVM. You might not have enough physical memory in your development box, or you might be using the 32-bit JVM on your development box, but the tool you're using insists on loading most of the snapshot into memory. If you like the idea of having a really powerful development box, use this situation to your advantage and ask your boss for a new machine. Alternatively, try using a tool with less of an appetite for memory, like Eclipse MAT.
    • You can't increase your memory on the server to get a bigger set of leaked objects, yet you need a bigger set of leaked objects to speed up the process of finding the memory leak. This often happens with 32-bit JVMs. One option in this situation is to apply the techniques described above for finding slow memory leaks, although this approach requires a significant amount of time. Alternatively, you can try to reproduce the problem on a 64-bit JVM, which will allow you to increase the memory. If the problem causing the memory leak is in your application, changing the JVM is extremely unlikely to "hide" the leak.
    • The memory leak is not in objects of just one class—you're leaking objects from thousands of classes, so it's difficult to find leaked objects. Or, your graph of object relations is so complex that you can find the initially-leaked objects but can't locate the cause. In either case, you've got quite a pickle on your hands. The only words of encouragement we can offer are that the situation will eventually improve, as the longer you track object references the better you'll get at it. One other piece of advice: you should probably call your significant other and let him or her know that you won't be home for dinner.
    • You're getting an OutOfMemoryError alert with a new, different format. For example, you might be looking at something like this:
      Exception in thread "pool-2-thread-1" java.lang.OutOfMemoryError: GC overhead limit exceeded.
      This message can happen with some GC settings that limit the overhead of the GC. It often indicates a memory leak, although there are some corner cases in which problems with GC performance can cause it. The best way to distinguish the cause of this message (is it a GC misconfiguration or a memory leak?) is to examine the pattern of errors. If they happen with regularity (in other words, they're a function of time the program is running), then it's a memory leak. If they happen only occasionally under heavy loads and clear later (in other words, they're not a function of running time), they might indicate a need to tweak the GC or increase available memory.
    • Third-party caching systems can be set up so that caches are allowed to expand and fill up the available memory the program doesn't need. However, if the program needs more memory, the cache automatically releases it. The point here is that in the absence of any OutOfMemoryError alerts, continually increasing memory usage does not necessarily indicate a memory leak. Both "used memory grows as a function of time" and "free memory eventually runs out" conditions are necessary in order to determine the presence of a memory leak.


