Using LogParser to quantize Autonomy search logs

I explained in a previous post how you can use the Microsoft Log Parser to dice up Autonomy IDOL search logs.  If you've exhausted the typical checks for performance problems in your IDOL installation, it can help to narrow down when the problems occur and look for clustered periods of slow performance.  That's a great opportunity to use the Log Parser again.

The first step is to level the playing field on timing information; the content GRL and the DAH GRL show duration information in milliseconds mixed with seconds.  I'm sure there is a clever way to correct that in-stream, but I took the brute-force approach: first create a file from the rows measured in seconds, then another from the rows measured in milliseconds, and finally merge the two into a single file.  First, the rows of one second or more:

logparser -i:xml -o:csv "select *, mul(to_real(extract_prefix([autn:duration], 0, ' s')), 1000) as milliseconds into over_1_second.csv from http://server:port/?action=grl&format=xml&tail=10000 where [autn:duration] like '% s'"

Then the rows under 1 second:

logparser -i:xml -o:csv "select *, to_real(extract_prefix([autn:duration], 0, ' ms')) as milliseconds into under_1_second.csv from http://server:port/?action=grl&format=xml&tail=10000 where [autn:duration] like '% ms'"
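For reference, the unit conversion those two queries perform can be sketched in a few lines of Python (a hypothetical helper for checking values by hand, not part of the Log Parser toolchain):

```python
def to_milliseconds(duration: str) -> float:
    """Normalize an IDOL duration string ("1.2 s" or "345 ms") to milliseconds."""
    value, unit = duration.strip().rsplit(" ", 1)
    if unit == "ms":
        return float(value)
    if unit == "s":
        return float(value) * 1000.0
    raise ValueError(f"unexpected duration format: {duration!r}")

print(to_milliseconds("1.2 s"))   # 1200.0
print(to_milliseconds("345 ms"))  # 345.0
```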

Then merge the two files into a single file:

logparser -i:csv -o:csv "select milliseconds, [autn:time], [autn:thread], [autn:status], [autn:action], [autn:request], [autn:client] into merged.csv from *.csv"

These three steps form the basis of most log analysis I do, so I've put them into a script.  The result, merged.csv, is a flattened file that contains the data we need.  If you are going to script this, don't forget to escape the percents, i.e. like '%% s'.
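The percent-escaping gotcha only bites in batch files; if you drive Log Parser from Python instead, the query travels as a single argument and needs no escaping. A rough sketch, where the server, port, and file names are the same placeholders used in the examples above:

```python
import subprocess

# Hypothetical host and port, matching the placeholder URL used above.
GRL_URL = "http://server:port/?action=grl&format=xml&tail=10000"

def grl_to_csv(out_file: str, select: str, where: str) -> list[str]:
    """Build the argv list for one Log Parser extraction step."""
    query = f"select {select} into {out_file} from {GRL_URL} where {where}"
    return ["logparser", "-i:xml", "-o:csv", query]

cmd = grl_to_csv(
    "over_1_second.csv",
    "*, mul(to_real(extract_prefix([autn:duration], 0, ' s')), 1000) as milliseconds",
    "[autn:duration] like '% s'",
)
# subprocess.run(cmd, check=True)  # run where Log Parser is installed
```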

Next, we run a quantize operation on the logs.  I've found that a half-hour period makes for a good range to view the average query performance:

logparser -i:csv -o:csv "select quantize(to_timestamp([autn:time], 'dd MMM yy hh:mm:ss'), 1800) as period, avg(milliseconds) from merged.csv group by period order by period"

The 1800 constant is in seconds, i.e. half an hour.  The result is a list; here's a short snippet:

Period               Duration (ms)
2008-01-02 22:00:00  624.828913
2008-01-02 22:30:00  415.648974
2008-01-02 23:00:00  2410.331818

This report shows clearly that around 11pm, performance drops sharply.  You might also want to add a count(*) column to the query to show the system activity alongside the averages.
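If you want to sanity-check the report outside Log Parser, the quantize-and-average logic is easy to sketch in Python (an illustrative stand-in for the QUANTIZE, AVG, and COUNT functions, not the tool itself):

```python
from collections import defaultdict
from datetime import datetime, timedelta

def quantize(ts: datetime, bucket_seconds: int = 1800) -> datetime:
    """Floor a timestamp to its half-hour bucket, like Log Parser's QUANTIZE."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    offset = (ts - midnight).total_seconds()
    return midnight + timedelta(seconds=offset - offset % bucket_seconds)

def report(rows):
    """rows: (timestamp, milliseconds) pairs -> {period: (avg_ms, count)}."""
    buckets = defaultdict(list)
    for ts, ms in rows:
        buckets[quantize(ts)].append(ms)
    return {p: (sum(v) / len(v), len(v)) for p, v in sorted(buckets.items())}

print(quantize(datetime(2008, 1, 2, 22, 45, 10)))  # 2008-01-02 22:30:00
```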

4 Comments to “Using LogParser to quantize Autonomy search logs”

  1. Jon, Nice writeup on a solution I was hoping someone had tackled before. Thanks!

  2. Hey Brian, good to hear from you! And thanks for the feedback–let me know if you run into anything interesting.

  3. I don't usually post on many blogs, but I just had to say thank you… keep up the amazing work. OK, regrettably it's time to get to school.

  4. Your post prompted me to work on these queries for my own purposes.

    I came up with this, which is a combination of your first two queries. I export it to a SQL database and create reports from that.

    Logparser -i:xml "select CASE [autn:duration] WHEN '%ms' THEN to_real(extract_prefix([autn:duration], 0, 'ms')) ELSE mul(to_real(extract_prefix([autn:duration], 0, 's')), 1000) end as duration, to_timestamp([autn:time], 'dd MMM yy hh:mm:ss') as time, [autn:thread] as thread, [autn:status] as status, [autn:action] as action, [autn:request] as request, [autn:client] as client into dbo.Audit from http://hostname:port/?action=grl&format=xml&tail=10000 where status not like 'in progress'"

    I use a trigger on the SQL table that looks for unique time/thread pairs. This way I don't insert duplicates.
