Archive for January, 2008

January 17, 2008

How to save money on Microsoft Office 2008 for Mac

I haven’t found a site to download a demo of Office 2008, and I can’t seem to get answers from anyone at MacBU or in blogs, so I decided to buy it. I’m really interested in seeing what the development opportunities are (if there are any), and I wanted to see what the new Entourage had to offer. The retail price of $399.95 seemed a bit steep: I’m looking for development options, not to use it full time (not yet at least). The upgrade price of $239.95 is far more palatable, and sure enough, the upgrade policy (from the bottom of the retail box, or here at the “upgrade eligibility” link) is:

The software will install only if you are a licensed user of one of the following products: … Any Microsoft Office for Mac 2001-2004 suite or application.

Well guess what. I’ve got Entourage 2004–it came free as part of my 1&1.com hosted Exchange account (which is $3.99 a month, and it also comes with Outlook 2007 for free). I’m sure other Exchange hosting companies provide the same deal.

I bought the upgrade, and it installed just fine! So–sign up for Exchange email, get Entourage for free, and save $160 on Microsoft Office 2008 for Mac.

January 17, 2008

Autonomy Search Developer Starter

So you’ve got an Autonomy IDOL in hand, and you’ve been asked to build a search application around it.  Here are some thoughts on getting started.

Let’s assume you’ve got the content in.  In a later post I’ll cover some of the fetches/connectors that you have access to, and what you can do with them.  For now, let’s start with a simple query.  Assume that the IDOL is installed on the server search, on port 9000.  Open your browser to:

http://search:9000/action=query&text=*

An installed IDOL listens on many ports: the default port of 9000 is where the IDOL Proxy Service sits and listens.  The response for an “action=query” is to return results that match the “text=” query.  By default, the response will contain up to 6 records, showing the default fields for each record (usually a small subset of the metadata, and none of the content), and will look something like this:

<?xml version="1.0" encoding="ISO-8859-1" ?>
<autnresponse xmlns:autn="http://schemas.autonomy.com/aci/">
  <action>QUERY</action>
  <response>SUCCESS</response>
  <responsedata>
     <autn:numhits>6</autn:numhits>
     <autn:hit>
       <autn:reference>test_document.doc</autn:reference>
       <autn:id>1</autn:id>
       <autn:section>0</autn:section>
       <autn:weight>96.00</autn:weight>
       <autn:database>News</autn:database>
     </autn:hit>
    ...
  </responsedata>
</autnresponse>

Lesson #1: All meaningful Autonomy interaction is through URLs, and the response is typically in XML.  Some simple C# code to handle the response above would look like:

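    // Requires: using System.Xml;
    XmlDocument xml = new XmlDocument();
    xml.Load("http://search:9000/?action=query&text=*");

    // Register the autn prefix so XPath can address the namespaced elements.
    XmlNamespaceManager nsmgr = new XmlNamespaceManager(xml.NameTable);
    nsmgr.AddNamespace("autn", "http://schemas.autonomy.com/aci/");

    // The path starts at the autnresponse root; this selects the first hit's reference.
    XmlNode node = xml.SelectSingleNode("/autnresponse/responsedata/autn:hit[1]/autn:reference", nsmgr);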
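
To read every hit rather than just the first, something along these lines might work. It is a minimal sketch that parses a response string shaped like the sample above; the HitReader class, the ReadReferences method, and the two-hit sample document are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

static class HitReader
{
    const string AutnNamespace = "http://schemas.autonomy.com/aci/";

    // Returns the autn:reference value of every autn:hit in the response.
    public static List<string> ReadReferences(string responseXml)
    {
        var xml = new XmlDocument();
        xml.LoadXml(responseXml);

        var nsmgr = new XmlNamespaceManager(xml.NameTable);
        nsmgr.AddNamespace("autn", AutnNamespace);

        var references = new List<string>();
        foreach (XmlNode hit in xml.SelectNodes("/autnresponse/responsedata/autn:hit", nsmgr))
        {
            // Skip hits that carry no reference element.
            XmlNode reference = hit.SelectSingleNode("autn:reference", nsmgr);
            if (reference != null)
                references.Add(reference.InnerText);
        }
        return references;
    }

    static void Main()
    {
        // Hypothetical response, trimmed to the elements this sketch reads.
        string sample = @"<autnresponse xmlns:autn=""http://schemas.autonomy.com/aci/"">
  <action>QUERY</action>
  <response>SUCCESS</response>
  <responsedata>
    <autn:numhits>2</autn:numhits>
    <autn:hit><autn:reference>first.doc</autn:reference></autn:hit>
    <autn:hit><autn:reference>second.doc</autn:reference></autn:hit>
  </responsedata>
</autnresponse>";

        foreach (string reference in ReadReferences(sample))
            Console.WriteLine(reference);
    }
}
```

Swapping LoadXml for Load with the query URL would run the same loop against a live IDOL.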

The next step is to figure out how to issue queries that are more meaningful than text=*.  For that, we turn to Autonomy’s built-in help page.  You access it by–you got it–going to a URL:

http://search:9000/action=help

Lesson #2: Always have the help URL open on a monitor.  The HTML help that is displayed is the single best resource for questions; and ironically, it is not searchable.  Non-searchable help?  From a search company?  Yes.  Perhaps that was left intentionally as a challenge to the buyer to set up their first source…  In any case, your first friend will be the Query node, where you can find all sorts of helpful information on how to build the specific query you’re looking for.  Remember, unless your users are technical, it will most likely be your responsibility to “query cook”, accepting simplified input from your users and creating the complex URL that Autonomy needs.
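
As a rough illustration of “query cooking”, the sketch below turns plain user input into a query URL. The QueryCooker class and BuildQueryUrl method are hypothetical names, and the maxresults parameter in the demo is an assumption to verify against the help page for your IDOL version.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class QueryCooker
{
    // Builds a query URL from simple user input plus optional extra parameters.
    public static string BuildQueryUrl(string baseUrl, string userText,
                                       IDictionary<string, string> extras)
    {
        // Fall back to the match-all query when the user typed nothing.
        string text = string.IsNullOrWhiteSpace(userText) ? "*" : userText.Trim();

        // Percent-encode values so spaces and '&' cannot break the URL.
        var pairs = new List<string>
        {
            "action=query",
            "text=" + Uri.EscapeDataString(text)
        };
        if (extras != null)
        {
            pairs.AddRange(extras.Select(kv =>
                Uri.EscapeDataString(kv.Key) + "=" + Uri.EscapeDataString(kv.Value)));
        }

        return baseUrl.TrimEnd('/') + "/?" + string.Join("&", pairs);
    }

    static void Main()
    {
        // "maxresults" is an assumed parameter name; check it on the help page.
        var extras = new Dictionary<string, string> { { "maxresults", "10" } };
        Console.WriteLine(BuildQueryUrl("http://search:9000", "budget report", extras));
    }
}
```

Keeping the URL construction in one place like this makes it easier to add the more complex query options later without touching the user-facing code.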

In future posts, I’ll look at some of the specifics of the query URL, and how to see the impact in the logs.

January 15, 2008

Using LogParser to quantize Autonomy search logs

I explained in a previous post how you can use the Microsoft Log Parser to dice up Autonomy IDOL search logs.  If you’ve exhausted the typical checks for performance problems in your IDOL installation, it might help to narrow down when the problems occur, and look for cluster periods of slow performance.  That’s a great opportunity to use the Log Parser again.

The first step is to level the playing field on timing information; the content GRL and the DAH GRL show duration information in milliseconds mixed with seconds.  I’m sure there is a clever way to correct that in-stream, but I took the brute-force approach: write the rows reported in seconds to one file, write the rows reported in milliseconds to another, and finally produce a single file from the two.  First, the rows over 1 second:

logparser -i:xml -o:csv "select *, mul(to_real(extract_prefix([autn:duration], 0, ' s')), 1000) as milliseconds into over_1_second.csv from http://server:port/?action=grl&format=xml&tail=10000 where [autn:duration] like '% s'"

Then the rows under 1 second:

logparser -i:xml -o:csv "select *, to_real(extract_prefix([autn:duration], 0, ' ms')) as milliseconds into under_1_second.csv from http://server:port/?action=grl&format=xml&tail=10000 where [autn:duration] like '% ms'"

Then merge the two files into a single file:

logparser -i:csv -o:csv "select milliseconds, [autn:time], [autn:thread], [autn:status], [autn:action], [autn:request], [autn:client] into merged.csv from *.csv"

These three basic steps serve as the basis for most log analysis I do, so I’ve added them into a script.  The result, merged.csv, is a flattened file that contains the data we need.  If you are going to script this, don’t forget to escape the percent signs, i.e. like '%% s'.
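
For comparison, the same normalization could be done in one pass in code rather than through intermediate files. This is only a sketch, assuming durations always look like '1.62 s' or '415 ms'; DurationParser and ToMilliseconds are illustrative names, not part of Log Parser or IDOL.

```csharp
using System;
using System.Globalization;

static class DurationParser
{
    // Converts a GRL duration such as "1.62 s" or "415 ms" to milliseconds.
    public static double ToMilliseconds(string duration)
    {
        string value = duration.Trim();

        // Test for "ms" first, because a value ending in "ms" also ends in "s".
        if (value.EndsWith("ms", StringComparison.OrdinalIgnoreCase))
            return ParseNumber(value.Substring(0, value.Length - 2));

        if (value.EndsWith("s", StringComparison.OrdinalIgnoreCase))
            return ParseNumber(value.Substring(0, value.Length - 1)) * 1000.0;

        throw new FormatException("Unrecognized duration: " + duration);
    }

    // Parses with the invariant culture so "1.62" reads the same on any machine.
    static double ParseNumber(string number)
    {
        return double.Parse(number.Trim(), CultureInfo.InvariantCulture);
    }

    static void Main()
    {
        Console.WriteLine(ToMilliseconds("2.5 s"));
        Console.WriteLine(ToMilliseconds("415 ms"));
    }
}
```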

Next, we run a quant operation on the logs.  I’ve found that a half-hour period makes for a good range to view the average query performance:

logparser -i:csv -o:csv "select quantize(to_timestamp([autn:time], 'dd MMM yy hh:mm:ss'), 1800) as period, avg(milliseconds) from merged.csv group by period order by period"

The 1800 constant there is in seconds, i.e. half an hour.  The result is a list; here’s a short snippet:

Period               Duration (ms)
2008-01-02 22:00:00  624.828913
2008-01-02 22:30:00  415.648974
2008-01-02 23:00:00  2410.331818

This report shows clearly that around 11pm, we see a sharp decline in performance.  You might also want to add a count(*) clause to the query to highlight the system activity.
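
To make the quantize step concrete, here is a small sketch of the same idea in code: each timestamp is rounded down to the start of its half-hour period, then the rows are grouped to get an average and a count per period. The Quantizer class and the sample rows are made up for illustration; this mirrors the query above rather than reproducing Log Parser's implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class Quantizer
{
    // Rounds a timestamp down to the start of its period.
    public static DateTime Quantize(DateTime time, int periodSeconds)
    {
        long periodTicks = periodSeconds * TimeSpan.TicksPerSecond;
        return new DateTime(time.Ticks - (time.Ticks % periodTicks), time.Kind);
    }

    // Groups (time, milliseconds) samples by period and returns, per period,
    // the average duration and the number of requests, ordered by period.
    public static List<Tuple<DateTime, double, int>> Summarize(
        IEnumerable<Tuple<DateTime, double>> samples, int periodSeconds)
    {
        return samples
            .GroupBy(s => Quantize(s.Item1, periodSeconds))
            .OrderBy(g => g.Key)
            .Select(g => Tuple.Create(g.Key, g.Average(s => s.Item2), g.Count()))
            .ToList();
    }

    static void Main()
    {
        // Made-up sample rows, not taken from a real log.
        var samples = new[]
        {
            Tuple.Create(new DateTime(2008, 1, 2, 22, 5, 0), 600.0),
            Tuple.Create(new DateTime(2008, 1, 2, 22, 20, 0), 650.0),
            Tuple.Create(new DateTime(2008, 1, 2, 23, 10, 0), 2400.0),
        };

        foreach (var row in Summarize(samples, 1800))
            Console.WriteLine("{0:yyyy-MM-dd HH:mm:ss}  {1:F1} ms  {2} requests",
                row.Item1, row.Item2, row.Item3);
    }
}
```

The count per period plays the same role as the count(*) suggestion: a slow period with only a handful of requests tells a different story than a slow period under heavy load.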

January 15, 2008

Microsoft Office 2008 for Mac is here…but not on MSDN

Not yet, not in my download list, at least.  I can’t find any authoritative statement indicating whether it will ever be available as an MSDN download, and if going off of previous versions is any indication, it never will be.  I’m interested to see what development options exist; I haven’t been able to find anything yet.  There’s a definite paucity of information and a lack of responsiveness on the MacBU side of things.  I’m hoping that changes over time.

January 13, 2008

Using the Microsoft Log Parser to parse Autonomy Logs

Much has been written about the free Microsoft Log Parser, a simple command-line tool that can access and parse log files from a number of sources, execute SQL-like queries against that data, and present results.  Did I mention it’s free?

Autonomy services will drop log files everywhere (literally all over the place), in different formats, and the challenge is to merge all that log data into a single store in order to get a handle on the big picture.  For instance, a single query against the IDOL server shows up in several logs:

  1. the GetRequestLog
  2. content_index.log
  3. possibly the OGS query log (if you have securityinfo)
  4. possibly a DAH log (if you’re distributing/mirroring)

How can you aggregate all that information to get a single picture for performance analysis and forensics?  And what about aggregating in other trace information, like application trace logs and IIS logs?  Use the Log Parser.  The first example I’ll give here is a simple query against the GRL–that should provide a view of the current queries that the IDOL is servicing.  I am using the latest version of the Log Parser (2.2, from Jan 2005)–download and install, then either copy to your %SYSTEM32% path, or simply add the install directory to your path, and run the following from a command-line:

logparser.exe -i:XML -o:DATAGRID "select [autn:action], [autn:request], [autn:client], [autn:time], [autn:duration], [autn:status], [autn:thread] from http://server:port/?action=grl&format=xml"

That opens a pretty little window for you to scroll through.  You can modify the URL with “&tail=[somenumber]” to return a different count of rows (the default is 100).  There are a couple of parameters for the output type (DATAGRID); one is autoScroll, which is on by default.  This scrolls whenever new data shows up, but does not work with URLs, so you will have to re-run the command line to get an update.

Let’s look at a slightly more complicated query.  I’m working with a client on query performance, and we’re studying why certain queries take longer than others.  Most queries take under a second, but every once in a while, they take longer.  With a simple query, we can look at exactly the information we need:

select 
    mul(to_real(extract_prefix([autn:duration], 0, ' s')), 1000),
    [autn:request]
from
    http://server:port/?action=grl&format=xml
where
    [autn:duration] like '% s'

We limit this to rows with a duration in the format '1.62 s', then turn the value into milliseconds.  Removing the [autn:request] column from the select, and surrounding the mul() operation with an AVG(), gives you a handy number on average query time over 1 second.  Make sure to add a more meaningful depth, with something like &tail=10000 in your URL.

I’ll look at more complicated queries next.

January 4, 2008

Microsoft Office 2008 and Entourage

One of the tools I cannot work without is Outlook.  Over the years I’ve tried everything from Thunderbird with Sunbird to some really weird products like Chandler and Omea, that never really found the sweet spot that Outlook hits.  I wanted to be free of the Exchange lock, which while powerful, is a real pain to share outside of the corporate environment (share a view of your calendar with friends and family?  Not a chance…)

Naturally then, Outlook is one of the driving reasons behind maintaining a VM/Boot Camp partition for day-to-day work (obviously, the anchor is Visual Studio/.NET).  I looked at Entourage from Office 2004, but was not entirely impressed by it–I have grown dependent on RPC/HTTP (can’t stand the thought of VPN any longer just to check email), and that’s flat out not supported by Entourage (not in any of the versions I tried, at least).

So I have high hopes for Office 2008 due out this Jan 15.  I’m hoping for closer feature parity between Entourage and Outlook 2007 (where will I find something like ClearContext?), especially with respect to RPC/HTTP.  And finally, I’m hoping it shows up on MSDN–does anyone know if it will?

January 4, 2008

VMware Fusion

Can’t say enough about this software.  I was able to start work on the VMs that I had been using in Windows without any problems–a straight copy over and I was up and running (perhaps Parallels can open VMware files–but that was enough for me to stick with Fusion).  And with the "Unity" feature, it just doesn’t get any easier to work with Windows apps while staying in OS X.

I needed to keep my email in Outlook 2007 (Entourage didn’t cut it, there’s no way I can see to make it work with RPC/HTTP, and the online commentary is that it uses WebDAV–well, we don’t). I typically work with at least two VMs open, one of which is a VM of my Boot Camp partition (hurray for 4G on a MacBook actually meaning 4 Gigabytes of RAM).  Voila, hit the Unity menu, and you get the screen below.  I am missing some of the cut-and-paste I got between Windows VMs on Windows, but that may just be a configuration setting somewhere.

January 2, 2008

Power Adapter Woe with MacBook Pro

I have had the Kensington 120W Universal Power Adapter for some time now, along with a plethora of bits and extensions–it’s the only power supply I take with me in my laptop bag, and I charge the two laptops, the BlackBerry, Bluetooth headset, iPod, PSP, and just about anything else I take along.  It’s slim, light, and is a great replacement for the brick that came with the Dell M90 (though the BIOS won’t let you boot off the adapter because it is less than 130W).  Then came the MacBook.  That takes a special power adapter, the MagSafe, which Apple has not (yet) licensed to any third-party manufacturers.  That means no Kensington bits, no iGo bits, nothing. I’m stuck with an amusing little chunk of a power supply, and nothing to work from in the car or on a plane.  I found the MagSafe Airline adapter, but it expressly notes that it will not work in cars.  I have not yet found a way to power the MacBook from a 12V port.  Little things like this remind me of the tradeoff I evaluated back in college: I picked PCs because I could muck with them, buy pieces and parts from the local store, while the Mac folks had to find someone with that special case-cracking tool.  Time to get a bigger bag.

January 1, 2008

MacBook Pro from a .NET Developer’s Perspective: Part II

In the first post, I mentioned that in order for me to keep OS X on the new MacBook (which I want to do), I had to be able to work through six different tasks without too much trouble, or I’d have to switch to using Vista on the laptop.

The first three tasks were easy to work through. The fourth proved to be very disappointing: being able to remote into the laptop using something like Microsoft’s mstsc.exe (terminal services client) to work on the laptop from the comfort of my desktop, and the 3 large flat screens.

I’ve got a wired gigabit network in my house: each room has 3 drops, and I make good use of that network. So imagine my surprise as I went through VNC client after client, along with three different servers, trying to get a decent remote desktop view of the MacBook. I tried:

  • The built-in Remote Desktop service with a handful of Windows VNC clients–absolute rubbish. I’ve had faster screen redraws through RDP over a 56k modem.
  • Vine Server: after reading reviews, I had high hopes for this VNC server (server is free, client is not). I used several different VNC clients on my Vista machine, and each one was a failure. The experience in mstsc.exe is nearly indistinguishable from being at the console (especially over the gigabit network), but on VNC it’s like a bad dream. I’m sure there are ways to configure both client and server, and I tried dragging down the visuals to the lowest possible setting: so the screen looked ugly and it was still slow.
  • RealVNC: supposedly the best of them, and not free (for a single license, $50 US). A demo license showed nothing different than Vine or the built in service.

And what a disappointment! Dragging windows around is just about the most painful thing I’ve ever seen–so scratch one against keeping OS X on the laptop. There’s no way I could work on my laptop from my desktop.

Now, for the workaround: since I use VMware, I just start all the VMs I usually use (all Windows Server 2003 instances), then RDP into them from my desktop. In reality, I don’t need access to the OS X desktop, I just need access to what’s running there: and in my case, that is 2-3 Windows Server 2003 VMs.

There’s hope yet. Unfortunately, some of the VMs run VPN software that disables all local routes for security, and that means RDP is out the window. In that case, I’ll simply migrate the VM to another machine on the network when I bring the laptop home.

I’ll keep looking for a VNC option for OS X, but I doubt I’ll find anything half as good as Remote Desktop in Windows.

#5 (backups) and #6 (applications) up next.
