Wednesday, December 24, 2008

Funny Tech Video Part 3

Below is a really great video called Kill Dash Nine. "kill -9" is the Unix command that forcibly kills a process, much like ending a task in the Windows Task Manager. You can find the entire lyrics here, and you can download an MP3 version from the same page too. Some of the lines are really great, like:

You're running csh and my shell is bash,

You're the tertiary storage; I'm the L1 cache.

You're a dialup connection; I'm a gigabit LAN.



Sunday, December 21, 2008

An Insane idea to Solve Java's Memory Problem

This could be an insane idea to solve Java's memory problem.

Garbage collection overhead has long been one of the major performance problems in Java. Recent improvements in the garbage collector (parallel collection that uses more of the available processors or cores, and concurrent collection) have drastically weakened the belief that GC is the main cause of performance problems. Earlier, this problem could be worked around by running multiple instances of the application on the same box to use all of its CPU power and memory. But for some memory-intensive applications GC overhead is still a problem to be solved. As the JVM is being touted more and more as a virtual machine for many languages rather than only a Java runtime, memory problems are bound to occur.

I have been working recently on Terracotta, which is Network Attached Memory (NAM) for Java applications. With NAM, your application can access more objects than would fit in your local heap, since objects which are not in the local heap are transparently loaded into the local JVM when they are accessed, sort of like lazy loading in Hibernate. Now this idea can also be applied to the JVM itself: offload objects which are not accessed often to local disk and load them back when they are accessed.

Imagine running an application on a 16-core or 32-core machine (Intel has a prototype with 80 cores) and the data-processing ability of such a huge machine. On a 64-bit platform the JVM heap can grow beyond the 2 GB limit, but when a full GC happens on a JVM sized at more than 4 GB it is really painful for applications. I have never worked on such huge JVMs, so I really have no idea how JDK 5 and 6 perform in such situations. Surely there will be some optimizations in the JVM to operate at such scale.

A simple prototype of this is the various cache solutions which offer eviction to local disk when the number of objects crosses the defined cache size. But these implementations are targeted as "caching solutions" which only know how to "get" and "put" objects so as to make the best use of available space.

At first this idea may seem insane, but various optimizations can be done to make it practical. This is what Terracotta has done: it avoids object serialization and operates at field level. Consider a map of 10000 fat objects. When this map is offloaded to disk, all value objects are also written to disk. When some value object is later looked up, only that object needs to be loaded, while the rest stay on disk. This basically means you have to implement some sort of virtual memory manager for the virtual machine, which uses the application's access pattern and some stored intelligence to minimize the total load delay while still allowing the application to operate on a large data set.
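
To make the idea a bit more concrete, here is a minimal sketch of a map that keeps only the most recently used values on the heap and spills the rest to temporary files, loading them back on access. This is not how Terracotta or any JVM does it: the class name SpillingMap, the maxInMemory parameter and the use of plain Java serialization are all assumptions made for the example, and a serious implementation would work at field level as described above.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: an LRU map that spills cold values to temp files. */
class SpillingMap<K, V extends Serializable> {
    private final int maxInMemory;
    private final Map<K, Path> onDisk = new HashMap<>();
    // accessOrder=true keeps entries ordered from least to most recently used.
    private final LinkedHashMap<K, V> inMemory = new LinkedHashMap<>(16, 0.75f, true);

    SpillingMap(int maxInMemory) {
        this.maxInMemory = maxInMemory;
    }

    void put(K key, V value) {
        try {
            Path stale = onDisk.remove(key);
            if (stale != null) {
                Files.deleteIfExists(stale); // drop an older spilled copy
            }
            inMemory.put(key, value);
            evictIfNeeded();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    V get(K key) {
        V value = inMemory.get(key);
        if (value != null) {
            return value;
        }
        Path file = onDisk.remove(key);
        if (file == null) {
            return null; // unknown key
        }
        // Lazy load: read the value back from disk only when it is asked for.
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            @SuppressWarnings("unchecked")
            V loaded = (V) in.readObject();
            Files.deleteIfExists(file);
            inMemory.put(key, loaded);
            evictIfNeeded();
            return loaded;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    int inMemoryCount() { return inMemory.size(); }

    int onDiskCount() { return onDisk.size(); }

    /** Writes least recently used values to disk until the heap limit is met. */
    private void evictIfNeeded() throws IOException {
        while (inMemory.size() > maxInMemory) {
            Iterator<Map.Entry<K, V>> it = inMemory.entrySet().iterator();
            Map.Entry<K, V> eldest = it.next();
            Path file = Files.createTempFile("spill-", ".bin");
            file.toFile().deleteOnExit();
            try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
                out.writeObject(eldest.getValue());
            }
            onDisk.put(eldest.getKey(), file);
            it.remove();
        }
    }
}
```

With a limit of 2, putting three entries leaves two on the heap and one on disk; reading the spilled one brings it back and pushes out whichever entry is now the least recently used.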


Any comments are welcome .. I am sure there will be a lot of comments on this insane idea.

..
Tushar

Tuesday, December 16, 2008

Funny Tech Video Part 2

Hug a developer ....







Now this one is really funny, and it's for the developers






What makes an engineer .. little dilbert video







Every build you break ...


Funny Tech Videos

Here are some really funny tech videos.

First one is my favorite: Matrix runs on Windows. Listen carefully for some meaningful (real-world universal truth) lines like "progress bar is moving but remaining time is going up". It's time to upgrade to Ubuntu.





Second one is music remix based on XP error sounds.





Third one is really creative geeky rap video from some "Sniper Twins".





Fourth one is from the movie Office Space, a movie based on the life of a frustrated IT software engineer. It really describes the life of some of my friends: working on weekends, reporting to more than one boss. Somebody finally came up with that one, good. It's an old movie though (1999).



Saturday, December 13, 2008

Online TV : Reality for Indians

I recently moved to Noida for my new job and am missing watching TV, especially Zee Marathi SA-RE-GA-MA-PA. And then the unthinkable happened: a shootout in the heart of Mumbai, my beloved city. It was really hard to figure out what was going on in Mumbai while sitting in an air-conditioned office in Noida. I could watch those boring and ever-repeating news channel clips online, but these are only dramatic and not informative at all. Then I thought, what if I could watch TV online and get the whole feel of what's happening? And yes, with a little bit of googling I managed to find a lot of channels which are available online. I could watch my favorite (one and only) TV serial online. Not only that, I could watch live news channels. Go to: http://www.ibollytv.com/iTV.php. A lot of regional channels are covered there. The internet connection speed required for these feeds depends on the quality, but 512 Kbps (that's around 64 KBps download speed) is sufficient. The sad thing is that in India wireless internet is still hanging around the 144 Kbps boundary, so you cannot use this facility on the move. On the broadband front, though, things look OK now: a lot of ISPs now provide 2 Mbps speed so things

One thing I want to say is thanks to CNN-IBN; I really appreciate this. CNN-IBN (a leading news channel in India) has an official free live feed. That's what you call free press. In events like the Mumbai shootout this could really save the day for you and your family. Thanks once again: Rajdeep Sardesai.


..
Tushar

Friday, December 12, 2008

Links : Kill Your Database with Terracotta

I am going to start a new series with this post. Many times you find a great article on the internet that you would like to share with others. "Links" is one such series.


And the very first link here is an article written about Terracotta, an amazing technology for Java application clustering. It's here: Kill Your Database. Terracotta is a clustering technology with built-in support for HA. Clustering is all about shared data, and when you talk about HA you are actually caring for the shared data and its availability. Terracotta implements this by writing every change to shared data to disk in a transactional manner. So if you have data whose life is short or medium (often you need to store derived data from temporary data which is very large, and we use an RDBMS for all of this), you can use Terracotta to store it very fast: since you don't have to convert data to the relational model, it's all Java. Pretty powerful! The other use case is database offloading, which means you can use Terracotta to store all data temporarily and then write it to your database in a manner which does not affect end-user response time. By doing this you remove database access and operation time from the response time (which is usually a very significant part of it). The article mentioned above touches on all these concepts. I will add one more link, from the Terracotta org site, which describes this case (and all the other wonderful use cases for Terracotta). Here it is: Write Behind SOR. If you read my previous post, I discussed exactly the same idea there: write-behind, or asynchronous writes. At that time I knew only that Coherence supported this. Then I came to know that GigaSpaces also supports it (see the comments on that post). But Terracotta would be the most exciting among them all. Why? One reason is that existing applications can easily integrate this pattern with a very small change in their code base. The second reason is that Terracotta supports a lot of Java frameworks out of the box.

..
Tushar

Wednesday, November 5, 2008

Pal Pal Dil Ke Paas

OK, this is supposed to be a technical blog, but this post is not going to be technical. Let's take a break. I had to: I had a nasty accident on my bike a couple of weeks ago, so all my normal work is affected. I often listen to songs, new ones and old ones, especially Kishore Kumar's songs. My favorite ones are Chookar Mere Man Ko and Pal Pal Dil Ke Paas. Come on! Many people will agree with this. I was searching for the exact lyrics of the two songs, and what I found is that the middle para, in Hindi it's called "antara", is missing on nearly all lyrics sites. So here is the complete song below.




Pal Pal Dil Ke Paas Tum Rehti Ho
Jeevan Meethi Pyaas Yeh Kehti Ho
Pal Pal Dil Ke Paas Tum Rehti Ho

Har Shyam Aankhon Par
Tera Aanchal Lehraye
Har Raat Yaadon Ki
Baarat Le Aaye
Maein Saans Leta Hoon
Teri Khushboo Aati Hai
Ek Mehka Mehka Sa
Paigham Laati Hai
Meri Dil Ki Dhadkan Bhi
Tere Geet Gaati Hai
Pal Pal ...

kal tuzhko dekha tha
mene apne aangan me
jaise kahe rahi tu
muzhe bandh lo bandhan me
ye kaisa rishta he
ye kaise sapnye he
begane hokar bhi
kyu lagte apne he
me soch me rahta hu
dar dar ke kehta hu

Tum Sochogi Kyon Itna
Maein Tumse Pyaar Karoon
Tum Samjhogi Deewana
Maein Bhi Iqraar Karoon
Dewaanon Ki Yeh Baatein
Deewane Jaante Hain
Jalne Mei Kya Mazaa Hai
Parwane Jaante Hain
Tum Yunhi Jalate Rehna
Aa Aakar Khwabon Mein
Pal Pal ...



On Indian television, nearly all channels currently have reality singing competitions going on. I follow one of these, the Marathi Sa-Re-Ga-Ma-Pa singing program. It's called Little Champs, where children under 14 years old sing all kinds of Marathi songs. Yes!! I said all kinds: Lavni, Koli Geete, Bharud, Abhanga, Natya-Sangeet. I too tried a bit of singing and recorded my favorite songs on my laptop. Believe me, it's really difficult. Really, hats off to those 10-year-olds. Still many more to come.

Here is one of the performances with comments from Shankar Mahadevan:


Saturday, September 13, 2008

Beware of Google!!

Just now I was chatting with my college friend in Marathi, one of the many languages spoken in India. I had missed some of the last sentences, so I opened the chat again from the "Chats" section in Gmail. Google shows clips and sponsored ads above the main area where mail is displayed. I was amazed to see ads for a site targeted at the Marathi-speaking community, to be specific a matrimony site. How does Google know that I am chatting in mixed Marathi and English, and that a matrimony site ad is relevant to the item being displayed? Hats off to Google!

But at the same time I feel you should be careful about Gmail, and Google in general. They are building a database of people: your social network, your browsing habits and many other things. Now Google has come up with a browser, Chrome. It is a minimalist one, but behind it Google can store a lot of personal information on its servers. Google's mantra is "Don't Be Evil". But beyond a certain limit it's really intrusive. Google basically uses all this information for advertising; that's their bread and butter. But who knows what they will sell to the advertisers. So beware of Google!

Here is a post on Hacker News that talks about dependence on Google:
http://news.ycombinator.com/item?id=350968

Saturday, September 6, 2008

Building Performance Monitoring Solution with Nagios and NDOUtils Part 2 PerfNagios

Welcome to part 2 of this series. This took a long time but was worth the effort. In this part I am going to describe how to save performance data and integrate a sample monitoring script for performance monitoring. In the earlier post I wrote about how to install and configure NDOUtils for database support in Nagios. We will be using the same setup for storing data but will add some more tables to act as a performance repository. Remember, NDOUtils will delete data periodically!

Let's first discuss how you would save performance data.

PerfNagios is a SourceForge project I started for implementing this idea. Currently only the parsing code is stable; reporting and dashboard functionality will be added later. Stay connected to this blog for future updates to this project.

PerfNagios is basically a small web interface for displaying Nagios-related data easily. Eventually it will have good reporting capabilities. Currently it only shows performance data for the last 1000 readings. You can see screenshots below.




Following are the tables I used for storing performance data.


CREATE TABLE `nagios_metrics` (
`metric_id` int(11) NOT NULL auto_increment,
`instance_id` smallint(6) NOT NULL ,
`host_object_id` smallint(6) NOT NULL ,
`service_object_id` smallint(6) ,
`unit` varchar(60),
`label` varchar(255),
PRIMARY KEY (`metric_id`)
) ENGINE=InnoDB;


CREATE TABLE `nagios_metric_data` (
`metric_data_id` int(11) NOT NULL auto_increment,
`metric_id` int(11) NOT NULL,
`value` double not null,
`warn` double,
`critical` double,
`min` double,
`max` double,
`date` datetime not null,
PRIMARY KEY (`metric_data_id`, `metric_id`)
) ENGINE=InnoDB;


CREATE TABLE `nagios_perf_batches` (
`date` datetime not null,
`last_service_check_id` int(11) not null,
`last_host_check_id` int(11) not null,
`host_checks` int(11) not null,
`service_check` int(11) not null
) ENGINE=InnoDB;


Let's take an example: say we need to keep track of how the CPU is being used. For that we need to add a service called CPU to the monitored host. Below is a small script which outputs important CPU performance metrics: run queue length, user, system and wait-on-I/O. The output looks like:

CPU : OK 0 | rl=0;2;5 us=2;85;85 sys=0;10;20 wa=0;5;10 total=3;80;90
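
To store a line like this in the tables above, the part after the "|" has to be split into label, value, warning and critical thresholds. Here is a minimal Java sketch of that step. It is not the actual PerfNagios parsing code; the class and field names are made up for the example, and it only handles plain numeric values (no unit suffixes, no min/max fields).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: parse the perfdata part of a Nagios plugin output line. */
class PerfDataParser {

    /** One reading in the form label=value;warn;crit (warn and crit may be missing). */
    static final class Metric {
        final String label;
        final double value;
        final Double warn;
        final Double critical;

        Metric(String label, double value, Double warn, Double critical) {
            this.label = label;
            this.value = value;
            this.warn = warn;
            this.critical = critical;
        }
    }

    /** Returns the metrics found after the '|' separator, or an empty list. */
    static List<Metric> parse(String pluginOutput) {
        List<Metric> metrics = new ArrayList<>();
        int bar = pluginOutput.indexOf('|');
        if (bar < 0) {
            return metrics; // the plugin printed no performance data
        }
        for (String token : pluginOutput.substring(bar + 1).trim().split("\\s+")) {
            int eq = token.indexOf('=');
            if (eq <= 0) {
                continue; // not a label=value token
            }
            String[] parts = token.substring(eq + 1).split(";");
            if (parts[0].isEmpty()) {
                continue; // label without a value
            }
            metrics.add(new Metric(token.substring(0, eq), Double.parseDouble(parts[0]),
                    optional(parts, 1), optional(parts, 2)));
        }
        return metrics;
    }

    /** Reads an optional threshold field; null means it was not supplied. */
    private static Double optional(String[] parts, int index) {
        return parts.length > index && !parts[index].isEmpty()
                ? Double.valueOf(parts[index]) : null;
    }
}
```

Each Metric maps naturally onto one row of nagios_metric_data (value, warn, critical), with the label identifying the row in nagios_metrics.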


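  • check_cpu.sh

#!/bin/sh
# Run vmstat (3 samples, 3 seconds apart) and let cpu.awk build the status line.
value=`vmstat 3 3 | awk -f /opt/nagios/libexec/ubuntu-sys-plugins/cpu.awk`
# The fourth field of the status line is the Nagios return code (0, 1 or 2).
returnvalue=`echo "$value" | awk '{print $4}'`
echo "$value"
exit $returnvalue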


  • and cpu.awk


# cpu.awk: average the last two vmstat samples and print a Nagios status line.
# Expects "vmstat 3 3" output: two header lines followed by three samples.
{
    # Line 3 holds the since-boot averages, so only lines 4 and 5 are counted.
    if (NR >= 4) {
        rl  += $1;     # run queue length (r)
        us  += $13;    # user CPU %
        sys += $14;    # system CPU %
        wa  += $16;    # I/O wait %
        samples++;
    }
}
END {
    if (samples > 0) {
        rl /= samples;
        us /= samples;
        sys /= samples;
        wa /= samples;
    }
    total = us + sys + wa;

    if (total <= 75) {
        msg = "CPU : OK";
        returnvalue = 0;
    } else if (total <= 85) {
        msg = "CPU : WARNING";
        returnvalue = 1;
    } else {
        msg = "CPU : CRITICAL";
        returnvalue = 2;
    }
    msg = sprintf("%s %d | rl=%d;2;5 us=%d;85;85 sys=%d;10;20 wa=%d;5;10 total=%d;80;90", msg, returnvalue, rl, us, sys, wa, total);
    print msg;
}


  • To include this monitoring script you need to add a service for localhost. In Nagios 3.x, configuration is based on templates. First we need to add a check command in the file /opt/nagios/etc/objects/commands.cfg


# 'check_cpux' command definition - tushar
define command{
command_name check_cpux
command_line /opt/nagios/libexec/ubuntu-sys-plugins/check_cpu.sh
}

  • Once this is done you need to add a service definition in /opt/nagios/etc/objects/localhost.cfg.

define service{
use local-service ; Name of service template to use
host_name localhost
service_description CPU
check_command check_cpux
notifications_enabled 0
}






Restart Nagios with /etc/init.d/nagios restart and you are done! Your system is now able to monitor CPU information.

Below is the graph drawn in PerfNagios for the same script.





JAMon Data for Java Applications

Recently I used JAMon for gathering important metrics in an application which was not a J2EE application. For a plain JDK-based application, how do you get the information in the absence of the JAMon web interface? I asked Steve Souza about it, and he immediately returned with the answer: MonitorFactory.getData() and MonitorFactory.getHeader(). The JSP page jsmons.jsp is written on top of this data. For a JDK-based application I thought we could expose the same information via JMX. Below is a small wrapper to do that.

  • Register the JAMon factory as an MBean and implement some key functions as attributes and operations


JAMonMBean mbean = new JAMon();
ObjectName name = new ObjectName("jamon.perf:type=JAMonBean");
MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
mbs.registerMBean(mbean, name);

  • JAMon Client


ObjectName name = new ObjectName("jamon.perf:type=JAMonBean");
MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
// Invoke an operation that takes a single String argument.
Object[] params = { key };
String[] signature = { "java.lang.String" };
Object obj = mbs.invoke(name, operationName, params, signature);


  • Start background thread to store information in file


Object[][] data= client.getData();
// record format date : label : Hits : Avg : Total : StdDev
String timestamp = "" + System.currentTimeMillis();
if(data!=null)
{
for(int i=0;i<data.length;i++)
{
StringBuilder sb = new StringBuilder();
sb.append(timestamp).append(" : ");
sb.append(data[i][0]).append(" : ");
sb.append(data[i][1]).append(" : ");
sb.append(data[i][2]).append(" : ");
sb.append(data[i][3]).append(" : ");
sb.append(data[i][4]);
pw.println(sb.toString());
pw.flush();
//System.out.println(sb.toString());
}
}


Equipped with the above, I have also written a small utility which periodically outputs information for all counters in the format

timestamp : label : avg. response time : hits : std. deviation

The code used to demonstrate the JAMon bean has been uploaded at http://tushar.khairnar.googlepages.com/perf-sample.zip.

Please see Sample Program for the same.

Sunday, August 24, 2008

Building Performance Monitoring Solution with Nagios and NDOUtils

In my previous post I talked about Application Performance Management (APM) tools and discussed possible requirements for them. I also talked about Nagios being a wonderful and proven monitoring solution. In this post I will show how to build a monitoring solution with Nagios comparable to any commercial monitoring system.

We will be using following components along with Nagios.

  • NDOUtils - Storing runtime information to database

  • NRPE - Remote Plugin Agent

  • NSCA - Remote Plugin with Passive Checks

  • Nagvis - Visualization Addon

  • Nagios Business Process Addons - Custom Bird's Eye View of Business Application

  • NSClient++ - For monitoring Windows Hosts

The tough part is that these are all discrete components, each requiring good knowledge. This series will focus on all of these systems and build a good monitoring solution. This part focuses on NDOUtils and performance data. Once complete, I will release the package with an installation script so that it is easy to install.

I assume you have Nagios installed and working. If not, visit the quick-start guides for Ubuntu, Fedora and OpenSUSE. They work perfectly. Go through the Nagios console and explore what Nagios provides out of the box.

Let's first see how to set up Nagios and NDOUtils. NDOUtils is an implementation of a Nagios Event Broker (NEB) module. A NEB module is a shared object which is loaded by Nagios and can register to listen for events. NDOUtils passes these events to a file or socket. The second part of NDOUtils is a C daemon which saves the information to the database.

Download the NDOUtils tarball from the Nagios site. When you try to compile it you may face a problem with the MySQL library. On Ubuntu, first install the MySQL development library. Of course you have to have the MySQL server installed first.

sudo apt-get install libmysqlclient15-dev


Start the mysql client and log in as root. Create the database and the nagios user.

create database nagios;
use nagios;
grant all on nagios.* to 'nagios'@'' identified by 'nagios';
grant all on nagios.* to 'nagios'@'localhost' identified by 'nagios';

Go to the db directory and create the necessary tables.

./installdb -u nagios -p nagios -h localhost -d nagios


Now go to the ndoutils directory and compile the source

./configure --with-mysql-lib=/usr/lib/mysql
make


Copy the binaries to Nagios. This assumes that you have installed Nagios in /opt/nagios (the default is /usr/local/nagios)

cp ndo2db-3x /opt/nagios/bin
cp ndomod-3x.o /opt/nagios/bin

Also copy the config files ndomod.cfg and ndo2db.cfg from the config directory to /opt/nagios/etc
Modify the following lines.

ndomod.cfg
output_type=tcpsocket
output=127.0.0.1
buffer_file=/opt/nagios/var/ndomod.tmp

ndo2db.cfg
socket_type=tcp
db_name=nagios
db_user=nagios
db_pass=nagios

Now open nagios.cfg file and modify following keys

event_broker_options=-1
broker_module=/opt/nagios/bin/ndomod-3x.o config_file=/opt/nagios/etc/ndomod.cfg

Now start ndo2db daemon before restarting nagios

/opt/nagios/bin/ndo2db-3x -c /opt/nagios/etc/ndo2db.cfg

Restart Nagios

/opt/nagios/bin/nagios -d /opt/nagios/etc/nagios.cfg

Now start the mysql client and check the nagios_hosts and nagios_services tables to see that data is getting saved.

You can put the following line in /etc/rc.local so that ndo2db is started when the machine boots, and all events fired by Nagios (which is started from the /etc/init.d/nagios script) are picked up and processed.

/opt/nagios/bin/ndo2db-3x -c /opt/nagios/etc/ndo2db.cfg


Once this is done, all your data is saved to the database. Now we can build our performance parser on top of this. I will come up with a shell or Perl script for the whole installation once it is complete.

..
Tushar

Tuesday, August 19, 2008

Java Performance : Monitoring and Measurements - APM

Recently I came across this wonderful series of articles (Run-time performance and availability monitoring for Java systems) about implementing run-time performance monitoring for an application's ecosystem; sometimes this is called APM (Application Performance Management). This series is a really good piece of information, and everyone who needs to implement some kind of performance ecosystem should read it. All the screenshots resemble Wily Introscope, though, so basically it describes everything that Introscope does.
The article says Helios is the reference implementation of the ideas discussed.

It covers

  • Monitoring fundamentals - why you need it and what you need: periodic reports based on templates and custom reports, historical storage and analysis, live visualization and simultaneous plotting for correlation, alerting based on email, BlackBerry, SMS or JMS with GUI, dashboards

  • Recent advancements - like Agentless monitoring, Synthetic Transaction

  • Some very high-level design details - Performance Data Source, Collector, Tracers

  • Tracing patterns : Polling, Listening, Interception , Instrumentation


And about how you go about performance monitoring in general?

It focuses on JMX as the primary way of doing things, but if you look at commercial APM tools like CA Wily's Introscope, these are not JMX based. I feel that if your application is Java centric then it helps to have a Java-based APM such as Wily Introscope.

I would like to add the following too.

Synthetic transactions : I feel synthetic transactions are really helpful but at the same time difficult to implement. Tools like Grinder can help you implement these.

Application Specific Metrics : Generally you have infrastructure metrics like OS metrics, storage metrics and network metrics, and application infrastructure metrics like app server metrics (queue length, response time) and database metrics (avg. query time). But sometimes you may need to gather application or business specific metrics. An example would be, say, policies issued per day. The framework should be flexible enough to allow such metrics to be posted to the APM, where they can then be correlated with technical metrics.

Diagnosis : Also, an APM should be able to switch gears when a problem occurs. When a problem occurs, the APM should start collecting data at a more granular level so as to get a more refined picture of the system. Introscope has a diagnosis tool called Transaction Monitor which traces an entire transaction as one context. The article though talks about "heavy instrumentation" as I guess there

Heuristics : An APM should also provide some level of analysis (rule based) like what Glassbox does. See demo http://demo.glassbox.com/glassbox

The dilemma with APM systems is that commercial products like IBM Tivoli, HP OpenView and Mercury BAC come with many features but at a hefty cost. So open source comes to the rescue here: there are a lot of open-source projects which implement this idea. I am a big fan of Nagios. Others like GroundWork Monitor implement extra functionality around Nagios. Nagios 3.0 has made a lot of progress and is now "installable" for a normal user with this guide: http://nagios.sourceforge.net/docs/3_0/quickstart.html
Also, if your application is Java centric then it makes good sense to have a JMX-based system as described in this article. Sometimes you don't need a full-blown system. In such situations tools like the JAMon API can help you. Do visit the JAMon site, it will surely help you; it is even recommended to keep it running in the production environment.

If you are interested in Java profiling and how it's done, read Build your own profiling tool and look at Jensor (jensor.sourceforge.net). Jensor is a Java profiler built by TCS's Performance Engineering Group (where I started my career). It follows the approach of the first article mentioned, but has a good analysis GUI which helps you dig in. The commercial profilers JProbe and JProfiler are really good.

In the past I had experimented with the same idea using Nagios. There the focus was to implement a performance monitoring system for the entire lab. We had written Nagios plugins for WebSphere (PMI based) and WebLogic (WebLogic shell) for collecting performance metrics. But Nagios is only a scheduling and execution engine which does not really care about the data (with version 3.0 it now has a lot of extension points where you can save events and performance data in a MySQL or Postgres database). So we had used a small open-source system, PerfParse, for storing performance data and pulling out reports. In our lab we had implemented this for about 80 servers on a single-CPU monitoring machine running RHEL 4.0. It worked very well for me.

The best one is the one which solves your problem!!

..
Tushar

Friday, August 15, 2008

Meet the People Who Have Trillions Riding on Linux this Fall

Hi,

Recently I came across this link: Meet the People Who Have Trillions Riding on Linux this Fall. This shows how deeply Linux has penetrated. Linux is handling trillions in money. Wow!

Wednesday, August 13, 2008

Java Performance - Multi Core Processors

Hi, I was reading the book Java Concurrency in Practice by Brian Goetz and others.
I found this interesting quote:


"For the past 30 years, computer performance has been driven by Moore's Law; from now on, it will be driven by Amdahl's Law. Writing code that effectively exploits multiple processors can be very challenging."


Amdahl's law describes how much a program can theoretically be sped up by additional computing resources, based on the proportion of parallelizable and serial components.
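
To put numbers on this, here is a small Java sketch of the usual form of Amdahl's law, speedup = 1 / ((1 - p) + p / n), where p is the fraction of the program that can run in parallel and n is the number of processors. The class and method names are made up for the example; the formula itself is the standard one.

```java
/** Hypothetical sketch: Amdahl's law as a small calculator. */
class Amdahl {

    /**
     * Theoretical speedup for a program whose parallelizable share is
     * parallelFraction (0.0 to 1.0) when run on the given number of processors.
     */
    static double speedup(double parallelFraction, int processors) {
        double serialFraction = 1.0 - parallelFraction;
        return 1.0 / (serialFraction + parallelFraction / processors);
    }

    public static void main(String[] args) {
        // A program that is 90% parallelizable, on a growing number of cores.
        for (int cores : new int[] {1, 2, 4, 8, 16, 32}) {
            System.out.printf("%2d cores -> %.2fx%n", cores, speedup(0.9, cores));
        }
    }
}
```

With 90% of the work parallelizable, 16 cores give only a 6.4x speedup, and no number of cores can ever give more than 10x, because the serial 10% always remains.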

util.concurrent from Doug Lea brought the Java concurrency API right into the JDK. From the start, Java was the first language (at least among mainstream programming languages) to support multithreading. With Java 5, any developer can write safe and scalable Java programs.


All the latest processors are now multi-core, which clearly shows the shift towards parallel systems where your program has to effectively use all the hardware threads the underlying CPU provides. Unless programs are completely multithreaded, they simply won't use the power available in hugely multicore systems. A lot of attention is being given to transforming code into multithreaded code. Read this post for details: "Intel: We Can Transform Single Thread to Multithread"

I also got to read this wonderful post which discusses Java performance, mainly GC, on multicore processors: Multicore may be bad for Java.

Indeed, Java is multicore ready, I believe .. Happy multithreading.

..
Tushar

Sunday, August 10, 2008

My New Laptop - Acer 2920


Hi, a few weeks ago I bought a new laptop for myself: an Acer 2920. I had blogged earlier about my journey with Linux. Linux support was a mandatory criterion for my laptop, so I experimented a lot with different laptops and read reviews of them on the internet. My brother's laptop, a Sony VAIO CR12GB/H, had responded well to the Linux call ... it was so good that my brother is now a Linux convert and preferably uses Ubuntu (see the snapshot below).


The other criterion was Windows XP. I feel Vista is a big flop, especially for countries like India where the hardware is at least 6-7 months behind the latest technology. In the very first post on this blog I mentioned my disappointment in the following words

"And last thing i want to say about VISTA is that its fooling people around. It takes seconds to load dekstop and pretend ready for action but reality is different. It take much more time to get all networking started and then i can start my browser. With linux it takes time to login but I know when its now ready - ready for anything file explorer, browser and mp3. "



But the market here in India is not that good. You don't get a Windows XP laptop with good hardware these days. So I had to go into cost mode and look at laptops which are targeted as utility laptops. I had settled for Lenovo, and then I saw this beauty in the Vijay Sales shop in Thane. The sales guy was really friendly and allowed me to boot the laptop with an Ubuntu live CD. It passed the test. The Acer 2920 is the smallest laptop around with no compromise on hardware or any necessary features. It comes with a driver CD for both Vista and XP, which is not common in the latest laptops, which restrict you to Vista only.






So if you are looking for a good utility laptop, Acer is a good brand and certainly the best one for a Linux laptop. I recommend Sony also: Sony is also good value for money. Sony laptops have good sound quality (loud and clear) and a clear, crisp, bright display. The VAIO CR series is certainly well suited for most people.

..
Tushar

Saturday, August 9, 2008

Java Performance : Caching Clustering ... and "FlushCache"

Hi, I am back after a long time, this time again with Java. Recently I have been playing with a lot of Java solutions related to scaling and clustering. In a lot of places, around Java high-performance systems, you see buzzwords like multi-threading, clustering, caching, map-reduce, partitioning and grid solutions. Let's discuss some of them.

I have been using Hibernate for the last 1.5 years in my personal experiments and at my workplace too. In one project we had a cache requirement for master data which was read-only. There we implemented caching by hand as follows.

1. Wrappers around the DAOs which first check the cache and then go to the database.
2. The cache was implemented with the WebSphere-provided object pooling API.
3. WebSphere provides an object pooling mechanism which works even in a clustered set-up or network deployment, with cache replication strategies.
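
The first step, a wrapper that checks the cache before going to the database, can be sketched in a few lines of Java. This is only an illustration of the pattern: it uses a plain ConcurrentHashMap instead of the WebSphere API we actually used, and the class name and the loader function standing in for the DAO call are assumptions made for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical sketch: read-through cache in front of a DAO lookup. */
class ReadThroughCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // stands in for the DAO / database call

    ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, calling the loader only on a cache miss. */
    V get(K key) {
        // computeIfAbsent does not store null, so missing rows are looked up again.
        return cache.computeIfAbsent(key, loader);
    }

    int size() {
        return cache.size();
    }
}
```

This works well for read-only master data like ours, because nothing ever needs to be invalidated; for data that changes you would also need an eviction or refresh strategy.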

Our requirement was well met and caching brought lot of improvement in response time as expected.

After caching, horizontal scaling or clustering comes to my mind when I think about performance.
Since then, I have discovered a lot of caching/clustering projects and APIs, some of which I feel are very much worth taking note of.

1. Tangosol - recently acquired by Oracle; along with Oracle and TimesTen (Oracle's in-memory database, another acquisition :-) ) it forms a formidable data tier. Tangosol is a very proven product which provides a lot of features like distributed cache and data partitioning. Cameron Purdy on TheServerSide claims it reaches up to 0.5 million cache transactions per second.

2. Terracotta :- Terracotta is a very revolutionary Java product, in fact listed among the top 10 Java things of the year. Strictly speaking, Terracotta is not a caching solution but a clustering solution. It's basically JVM-level clustering. They have provided cache solutions for a lot of common requirements: Hibernate, HTTP session, Spring beans etc. They provide good extensions for a lot of open-source projects. I had tested Tomcat clustering with Terracotta, similar to the example on the Terracotta site, and it gives a good performance boost. Among all the clustering solutions Terracotta has the simplest programming model: NO PROGRAMMING MODEL. Right: objects are basically clustered at the JVM level, so there is only a configuration change and no code change. Practically you do need to change Java code, but that's minimal. All Java semantics work well in a Terracotta cluster. Terracotta is very useful in patterns like Master/Worker, or a bunch of processes looking for some coordination or data sharing. The Master/Worker pattern, I guess, was introduced much before Google's Map/Reduce and is very much the same idea.


3. Memcached : this one is written in C but has client libraries in almost all major programming languages. The idea of memcached is actually a little bit different. You keep a bunch of memcached processes running, and objects are stored on these processes by hash key. Memcached works best when its processes run on web servers which are CPU-intensive but have a lot of memory available.


4. A new bunch of grid frameworks: GigaSpaces, and Hadoop (a Java map-reduce implementation).
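To make the memcached idea in item 3 concrete: the client, not the server, decides which memcached process holds a given key, usually by hashing the key. Here is a minimal sketch of that decision, assuming plain modulo hashing (real client libraries often use consistent hashing instead). The class name and the server addresses are made up for illustration.

```java
import java.util.List;

// Hypothetical helper: picks the memcached process responsible for a key.
class ServerPicker {
    // Assumption: simple modulo hashing over a fixed server list.
    static String pick(String key, List<String> servers) {
        // floorMod keeps the bucket non-negative even for negative hash codes.
        int bucket = Math.floorMod(key.hashCode(), servers.size());
        return servers.get(bucket);
    }
}
```

The same key always maps to the same process as long as the server list does not change, which is why every web server can find an object without any central directory.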

But now I have come across a very different requirement, where cache modifications (writes) are significant in number and JDBC batching with asynchronous operation is the key to performance. This is sometimes called write-behind. When I looked around, only Tangosol claimed to have such a facility, so I decided to implement it by hand. Here is the idea.

1. A background thread monitors a queue.
2. All cache write operations append the modified objects to this queue.
3. The queue is periodically flushed to the database, and clients are notified of successful operations.
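The three steps above can be sketched roughly as follows. This is only an illustration of the idea, not the implementation I ended up with: the class name WriteBehindQueue is made up, and the batchWriter callback stands in for the real JDBC batch insert/update.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Hypothetical write-behind buffer: cache writes are queued and flushed in batches.
class WriteBehindQueue<T> {
    private final BlockingQueue<T> pending = new LinkedBlockingQueue<>();
    private final int batchSize;
    private final Consumer<List<T>> batchWriter; // assumption: wraps the JDBC batch call

    WriteBehindQueue(int batchSize, Consumer<List<T>> batchWriter) {
        this.batchSize = batchSize;
        this.batchWriter = batchWriter;
    }

    // Step 2: every cache write appends the modified object.
    void enqueue(T modified) {
        pending.add(modified);
    }

    // Step 3: drain up to batchSize objects and hand them to the writer in one call.
    // Returns how many objects were flushed.
    int flushOnce() {
        List<T> batch = new ArrayList<>(batchSize);
        pending.drainTo(batch, batchSize);
        if (!batch.isEmpty()) {
            batchWriter.accept(batch);
        }
        return batch.size();
    }
}
```

Step 1, the background thread, could then be a single-threaded ScheduledExecutorService that calls flushOnce() at a fixed delay. Notifying clients would hang off the batchWriter callback once the batch has committed.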


The catch here is how to handle object graph updates:
1. Delta calculation.
2. Inserting new objects may require updates in one specific order, where the result of the first insert is required for the next one (foreign keys).
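The ordering problem in point 2 is essentially a dependency sort: a row can only be inserted after the rows it references. A minimal sketch, assuming the foreign-key dependencies are available as a map from table name to the tables it references (the class name and table names are hypothetical):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: orders tables so that referenced tables come first.
class InsertOrder {
    // dependsOn maps a table to the tables it references through foreign keys.
    static List<String> order(Map<String, List<String>> dependsOn) {
        List<String> result = new ArrayList<>();
        Set<String> done = new HashSet<>();
        Set<String> inProgress = new HashSet<>();
        for (String table : dependsOn.keySet()) {
            visit(table, dependsOn, done, inProgress, result);
        }
        return result;
    }

    // Depth-first visit: parents are emitted before the table that references them.
    private static void visit(String table, Map<String, List<String>> dependsOn,
                              Set<String> done, Set<String> inProgress, List<String> result) {
        if (done.contains(table)) {
            return;
        }
        if (!inProgress.add(table)) {
            throw new IllegalStateException("Cyclic foreign keys at " + table);
        }
        for (String parent : dependsOn.getOrDefault(table, Collections.emptyList())) {
            visit(parent, dependsOn, done, inProgress, result);
        }
        inProgress.remove(table);
        done.add(table);
        result.add(table);
    }
}
```

Cyclic foreign keys cannot be ordered this way, so the sketch simply fails fast on them.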

Initially I went with Hibernate, using StatelessSession as the SQL generation engine and hoping that Hibernate would calculate the delta properly. But the lack of an L1 cache means no life cycle, hence no delta monitoring. So I had two choices.

1. Use merge: fires extra selects.
2. Generate SQL by hand.

With the help of cglib proxies I managed to get the list of dirty columns, but then how do you handle object relationships?
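To give a feel for the dirty-column tracking, here is a minimal sketch. Note that it uses the JDK's java.lang.reflect.Proxy rather than cglib, so it only works against an interface; the Person entity and the DirtyTracker class are hypothetical, and the property name is simply derived from the setter name.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical entity interface; JDK proxies can only intercept interface methods.
interface Person {
    String getEmail();
    void setEmail(String email);
}

class PersonImpl implements Person {
    private String email;
    public String getEmail() { return email; }
    public void setEmail(String email) { this.email = email; }
}

// Records which properties were written through the proxy.
class DirtyTracker implements InvocationHandler {
    private final Object target;
    private final Set<String> dirty = new LinkedHashSet<>();

    DirtyTracker(Object target) { this.target = target; }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        String name = method.getName();
        // Treat any single-argument setXxx call as a write to property "xxx".
        if (name.startsWith("set") && name.length() > 3 && args != null && args.length == 1) {
            dirty.add(Character.toLowerCase(name.charAt(3)) + name.substring(4));
        }
        return method.invoke(target, args);
    }

    Set<String> dirtyProperties() { return dirty; }

    // Wraps the tracker's target in a proxy that implements the given interface.
    static <T> T proxy(Class<T> type, DirtyTracker tracker) {
        Object p = Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] {type}, tracker);
        return type.cast(p);
    }
}
```

This covers plain columns only. It says nothing about relationships between objects, which is exactly the open question.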

I soon realized that a whole ORM needs to be implemented. Here is the list of requirements for a basic ORM-cum-cache with write-behind:

1. Easy configuration: declarative ORM mapping, sufficient for most scenarios.
2. Different execution strategies: timer driven, thread-pool driven, batch size reached, batch and timer combined, and an entity-level strategy for sync.
3. In-memory multi-indexing based on columns or a custom implementation, so cached objects can be queried with different criteria.
4. Notification: listeners.
5. Target rate with % of write operations: 500 requests/second.
6. Recovery test: checkpoints, automated and manual.
7. Clustering with the help of Terracotta.

This is an exhaustive list, and the first item by itself is a very big one. My take is to restrict the scope and focus on caching instead of fancy ORM stuff like Hibernate's. It should support only the object graph; all life-cycle handling is left to the user.

I will add code snippets in coming posts; for now only the idea has been finalized.

Sunday, May 25, 2008

My Linux Journey

My Linux journey started in my engineering days, when I got 3 CDs of Red Hat 9.0. I did not have an internet connection, and my only sources were friends and computer magazines. I was very much impressed with it when I first installed it, but then I did not have anything to do with it: I had Photoshop, Flash, Dreamweaver and FrontPage on Windows, and Age of Empires too!! I installed a piece of software called System Commander, a boot loader and system utility that was a sort of protector for my system. At that time I did not have the guts to play with GRUB, and there were no live CDs for recovery. At one time I had 6-7 different OSes on my system: Windows 98/NT (I don't remember exactly), Windows XP, Red Hat 9, Damn Small Linux, Minix, and some Linux distro that used to fit on a floppy, which I had installed on my hard disk thanks to System Commander. I am proud of the fact that I have worked on all versions of Windows - 3.1, 95, 98, ME, NT4, XP, 2000, 2003 and now Vista. The funny thing is that all of them have crashed on my system with the blue screen of death. The most delicate was ME; NT and 2003 were rock solid, with XP being the most practical. I used to test new versions and flavors of Linux just for fun.

By the way, Just for Fun is the autobiography of Linus Torvalds - a very funny book. If you are a geek/programmer you will definitely like it. I did.

But then, after a lot of crashes and partition formatting/reformatting, I settled on Windows XP, which worked without any hassles. Occasionally I used to boot into Linux again, just for fun.

After I got my first salary I started upgrading my PC. I bought a new 80 GB hard disk so that I could have more space to store data for Windows. I reserved 20 GB for Linux and tried a new (for me; till that time I was familiar only with Red Hat Linux) and hot distro of that time - UBUNTU. I liked it so much that I replaced my Fedora with Ubuntu, moving all my data to it. But I still thought XP was better than Ubuntu.

For the last 6 months my brother's laptop has been a sort of laboratory for me. My old (in fact very old) PIII 800 MHz box cannot take it any more. I was very excited by the idea of virtualization. Until 6 months ago I was not confident enough about the Linux desktop, so I was using Windows XP, and with virtualization I could run Ubuntu right within Windows - the best of both worlds. I tried Ubuntu 7.04, 7.10 and 8.04, Kubuntu 7.10, gOS 2 beta, Oracle Linux, and Fedora.

I have been successful in my attempts to convert my brother into a Linux follower, thanks to the buggy and resource-hungry Windows Vista. Who wants that eye candy when you get similar (though not quite as good) graphics effects on Linux without any performance loss?

After the launch of the Ubuntu 8.04 beta I thought I would give desktop Linux a try. I used Ubuntu for over a week, without booting Linux on my old box (PIII 800 MHz, 256 MB RAM). Whenever I required an application I went to Add/Remove, did a simple search, and could find and use an equivalent Linux application without any crash (earlier I had also used Linux, but most of the time something used to crash). The only reason I had to boot into Windows was to check my office applications: the site works only on Internet Explorer. I never tried Wine; now I need it.

The most surprising thing is that Ubuntu 8.04 worked out of the box on the Sony VAIO CR12GH/B laptop. Ubuntu 7.10 had some difficulty recognising the Intel X3100 graphics accelerator, hence the screen resolution was limited to 800x600 (I tweaked xorg.conf and it worked at 1200x800 on Ubuntu 7.10 as well).

Earlier, whenever I worked on Linux I knew something would go wrong, and I accepted it. Now I don't have that feeling - Ubuntu ROCKS.

Till that moment I used to think Linux was just not ready for the desktop for a normal user (I am half normal and half geek - passionate about Linux, but grown up in the comforts of Windows). There are plenty of reasons for that; hardware is one major reason, I feel. But in the last 2 years I have seen great growth in Linux distros, mainly Fedora, SUSE and Ubuntu.

If you have good standard hardware in your box, Linux is READY for you, as it certainly is for me, and it will beat Windows Vista in user experience. The major reason for this is buggy and bulky VISTA. I get a rating of only 3.0 out of 5.0 with very strong hardware for a laptop. What the heck, what exactly do you want, VISTA? A 5 GHz processor? A 2 GB graphics card????

Here is my list:

Rhythmbox
Windows Media Player/Winamp - Windows Media Player takes around 50% CPU when playing high bit rate (more than 150 kbps) MP3s. Winamp and Rhythmbox take less CPU.

Totem/VLC
Movie player: I love VLC. It's the best player available. Windows Media Player really sucks here; I just cannot play any movie on my P3 box with version 10 onwards. VLC is much better on both Windows and Linux. What's better than having your favourite application available on both systems? Again the Windows application sucks here.

BitTorrent
µTorrent is very good, but so are Azureus and Transmission. No problems here either.

Browser
Firefox 3.0 rocks and is blazingly fast on the Sony VAIO laptop. I cannot get the same performance from the Vista/IE or Vista/Firefox combination. Most developers say Firefox 3.0 is the FASTEST browser to date. I did not see any improvement on my P3 box (Windows XP). But Firefox felt more responsive on Linux on both systems, the XP one (PIII) as well as the Vista one (Core 2 Duo T7100). Why is that? I really don't know, nor do I believe in benchmark numbers (I have done performance testing/benchmarking, and I know how misleading they can sometimes be). As a performance engineer I used to give importance to how end users experience it: did they notice any improvement, did it feel good? With Firefox 3.0 on Linux, I did. On another note, IE7 is not that bad either. But I am a Firefox convert, and have been for a long time.

Programming
EditPlus beats any lightweight editor on Linux. I have not tried anything other than gedit, which is a little bit heavy (on the P3). But I did find an editor called Geany, and I liked it very much. I use Eclipse for most of my Java programming and was a bit disappointed with it on Linux. One thing I care about in an IDE is how effectively it uses the screen to pack in useful stuff. On Linux, SWT occupies much more space than on Windows.

Office
Nothing to say: OpenOffice is really behind MS Office. But I have a different opinion here. The dominance of MS Office is just because of numbers. What would you do if you had to write a document and mail it to somebody? I found one alternative with OpenOffice: convert the document to PDF and mail it. I just hate rebooting into Windows to edit that doc.

For other applications you can refer to these interesting links:

And the last thing I want to say about VISTA is that it is fooling people. It takes seconds to load the desktop and pretends to be ready for action, but the reality is different: it takes much more time to get all the networking started, and only then can I start my browser. With Linux it takes time to log in, but I know when it is ready - ready for anything: file explorer, browser and MP3s. All in all, if you don't have a reason to stick with Windows (say, your IDE works only on Windows), go and install Ubuntu (or any Linux) and give it a try. Vista is just not ready for countries like India, where the latest hardware reaches the mainstream at least 6-8 months later. Concepts like open source, Linux, Ubuntu, free document formats (OpenOffice) and OLPC (One Laptop per Child) are very much suitable for developing countries. This will make sure that users get what they need and are not forced to buy first Windows, then Microsoft Office, then anti-virus, then internet protection software to protect Windows. It's a chain that's not needed and can certainly be avoided.

More or less, computing is for fun, as Linus Torvalds says in his thoughts on life... so enjoy it. I certainly enjoyed it with Linux.

Oracle on Ubuntu Hardy Heron

I had a hard time installing Oracle on my brother's Sony VAIO laptop. Not because of Linux but because of Windows: I was out of space on Ubuntu and wanted to shrink the Windows partition. I had a strange experience with it. Windows would not let me shrink it by more than 1 GB. Why? Maybe some technical difficulty, huh! Then I booted the Ubuntu 8.04 live CD, opened GParted, and I could resize my NTFS partition. Wow! Why wouldn't Windows allow it, then? Does it know that I have Ubuntu installed on this laptop and that the user of this laptop doesn't prefer Windows? The only reason I went to Windows to resize was that Vista is very particular about its startup files and my experience with NTFS on Linux is bad; I lost a lot of data once.
Within 2 hours GParted had moved my 50 GB partition, freeing 16 GB for Linux and 2 GB for swap.

The very first thing I noticed is that Oracle does not list Ubuntu as an officially supported Linux distribution, but I managed to find an installation guide for Debian. It is recommended to go with RHEL/SUSE Enterprise if you are using it for commercial purposes.

Here are some links for the pre-setup tasks:
  • http://linux.togaware.com/survivor/Oracle_10g.html
  • http://www.akshaymehta.com/2006/12/10/installing-oracle-10g-r2-on-ubuntu-edgy/
  • http://rossov.com/2006/02/08/oracle-10g-ubuntu-linux-vmware-part-ii/
  • http://www.oracle.com/technology/pub/articles/smiley_10gdb_install.html

Oracle's own Installation Guide is also helpful.

Follow all the steps correctly. This actually helps in case you come across a problem, since you know what is installed where.
So here we go: start a terminal as the oracle user.

su oracle
./runInstaller -ignoreSysPrereqs

-ignoreSysPrereqs is the flag that allows the Oracle installer to proceed without the system prerequisite checks. The alternative is to create the file /etc/release and pretend to be RHEL.

Installation went more or less smoothly, but I faced some problems which many people face. I had faced similar problems on RHEL too! A little bit of googling will solve most of them.

  • Linking phase fails for 10gR2 on Ubuntu 6.06: undefined ref to 'nnfyboot'
Solution: create the following symlink and relink
ln -s $ORACLE_HOME/lib/libclient10.a $ORACLE_HOME/lib/libagtsh.a
$ORACLE_HOME/bin/genagtsh $ORACLE_HOME/lib/libagtsh.so 1.0

This problem is discussed here, and the fix worked on Ubuntu 8.04 too:

  • Right version of libstdc++: the libstdc++ library on my Ubuntu 8.04 was version 6.x, so I created the following symlink

libstdc++.so.6 -> libstdc++.so.1 ( Present)
libstdc++.so.5 -> libstdc++.so.1 ( Oracle tries to find it, so i added)


  • "ORA-12547: TNS:lost contact" when creating a database in Oracle 10g (libaio error). This one is very annoying. After installation Oracle starts the configuration assistant, DBCA. After googling around a bit I found out that libaio was required. In Ubuntu the package is called libaio1.


So do apt-get install libaio1


This problem is discussed in this thread for Fedora.



Most of the problems were in the linking phase, hence I ran $ORACLE_HOME/bin/relink all again to make sure that all changes had taken effect.

Now oracle is working fine on ubuntu 8.04 beta.

How to integrate DWR and Struts

DWR is an AJAX framework for Java. Using DWR you can directly call any method of a Java class asynchronously, i.e. without reloading the entire page. With the following steps you can even integrate DWR with Struts so as to have access to important form beans as well as any other Java classes. The only restriction is that the methods to be called have to be form-bean methods. This is easily done by refactoring code from the Action class into the form class. That way you can call the code from the Action class as well as via AJAX.

Add dwr.jar to the web project.

Modify web.xml: add the following code to web.xml. Make sure the DWR servlet gets loaded after the ActionServlet.


<servlet>
  <servlet-name>dwr-invoker</servlet-name>
  <servlet-class>org.directwebremoting.servlet.DwrServlet</servlet-class>
  <init-param>
    <param-name>debug</param-name>
    <param-value>true</param-value>
  </init-param>
  <init-param>
    <param-name>activeReverseAjaxEnabled</param-name>
    <param-value>true</param-value>
  </init-param>
  <init-param>
    <param-name>initApplicationScopeCreatorsAtStartup</param-name>
    <param-value>true</param-value>
  </init-param>
  <init-param>
    <param-name>maxWaitAfterWrite</param-name>
    <param-value>100</param-value>
  </init-param>
  <!--
  <init-param>
    <param-name>org.directwebremoting.extend.ServerLoadMonitor</param-name>
    <param-value>org.directwebremoting.impl.PollingServerLoadMonitor</param-value>
  </init-param>
  -->
  <load-on-startup>2</load-on-startup>
</servlet>

<servlet-mapping>
  <servlet-name>dwr-invoker</servlet-name>
  <url-pattern>/dwr/*</url-pattern>
</servlet-mapping>



Create the dwr.xml file. This file tells DWR which classes are to be exposed for asynchronous calls. The following example file exposes the PersonForm form bean under the JavaScript interface name forms using the Struts creator. It also exposes the Date class as the remote class JDate.


<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE dwr PUBLIC "-//GetAhead Limited//DTD Direct Web Remoting 2.0//EN" "http://getahead.org/dwr//dwr20.dtd">
<dwr>
  <allow>
    <create creator="new" javascript="JDate">
      <param name="class" value="java.util.Date" />
    </create>
    <create creator="struts" javascript="forms">
      <param name="formBean" value="personForm" />
    </create>
    <convert converter="bean" match="$Proxy*" />
  </allow>
</dwr>


Include the following lines in the JSP to add the DWR JavaScript files. “forms.js” refers to our “forms” interface exposed in dwr.xml. These files do not exist physically; they are generated dynamically and served by the DWR servlet.

<script type='text/javascript' src='dwr/engine.js'> </script>
<script type='text/javascript' src='dwr/util.js'> </script>
<script type='text/javascript' src='dwr/interface/forms.js'> </script>


Now you can call any method of the PersonForm class. PersonForm has a method called generateAntiSpamMailto, which you can call directly from JavaScript. The following example calls “generateAntiSpamMailto” on the server to get the anti-spam text, puts it into the div and makes the div visible. This is done by the callback function passed as the third argument. The first two arguments are regular arguments, the same as the Java method's arguments.


function process() {
  // Read the current field values from the page.
  var address = dwr.util.getValue("address");
  var name = dwr.util.getValue("name");
  alert('Address is ' + address);
  alert('Name is ' + name);

  // Remote call: the last argument is the callback that receives the result.
  forms.generateAntiSpamMailto(name, address, function(contents) {
    alert('content is ' + contents);
    dwr.util.setValue("outputFull", contents, { escapeHtml:false });
    dwr.util.byId("output").style.display = "block";
  });
}

and HTML part as


<input id="submit" type="button" value="Submit Query" onclick="process()"/>
<html:submit>Submit Query</html:submit>
</html:form>

<div id="output" style="display:none;">
<h2>Generated Links</h2>
<textarea id="outputFull" rows="9" cols="70">
</textarea>
</div>
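For reference, the server side of the call above might look something like this. It is only a sketch: the body of generateAntiSpamMailto shown here (encoding every character as a numeric HTML entity) is an assumption made for illustration, and a real Struts form bean would extend org.apache.struts.action.ActionForm, which is left out so the snippet stands on its own.

```java
// Hypothetical form bean; in a Struts app it would extend ActionForm.
class PersonForm {
    // Called from JavaScript as forms.generateAntiSpamMailto(name, address, callback).
    public String generateAntiSpamMailto(String name, String address) {
        return "<a href=\"" + encode("mailto:" + address) + "\">" + encode(name) + "</a>";
    }

    // Assumption: obfuscate text by writing each character as a numeric HTML entity.
    private static String encode(String text) {
        StringBuilder out = new StringBuilder();
        for (char c : text.toCharArray()) {
            out.append("&#").append((int) c).append(';');
        }
        return out.toString();
    }
}
```

Because the method lives on the form bean, the Action class can call it directly, and DWR can expose it to the browser through the struts creator.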


You can even use DWR to validate a field on the spot when the user finishes typing. The validation is done on the server but called asynchronously via AJAX. The following example uses the Apache Commons Validator framework to validate an email address. This requires certain jars to be added to lib (BSF, BSH and the Commons jars included in the attached zip file).

DWR xml

<create creator="script" javascript="EmailValidator" scope="application">
<param name="language" value="beanshell"/>
<param name="script">
import org.apache.commons.validator.EmailValidator;
return EmailValidator.getInstance();
</param>
</create>

and HTML /JavaScript part

<script>
function verifyAddress() {
  var address = dwr.util.getValue("address");
  EmailValidator.isValid(address, function(valid) {
    dwr.util.setValue("addressError", valid ? "" : "Please enter a valid email address");
  });
}
</script>

<html:text property="name" size="16" onkeypress="dwr.util.onReturn(event, process)" onblur="verifyAddress()"/>