
Thoughts on software architecture and development, and methods and techniques for improving the quality thereof.

David B. Robins



Code Visions: Improving software quality
Generating GSSAPI tokens from arbitrary Kerberos logins

By David B. Robins tags: Development, Unix Saturday, April 12, 2014 23:22 EST (link)

Forums such as StackOverflow are very helpful for providing solutions and code samples for shallow issues; but when one needs to dig a little deeper, they tend not to have answers. Of course, this is why they pay us the big bucks: to do the required research to make all the various libraries play nicely together. This is such a story.

The problem: given a username, an associated Kerberos domain (e.g., a Windows/Active Directory/LDAP domain), and a password, obtain a GSSAPI token suitable for logging in to a service. The service will use gss_accept_sec_context (GSSAPI on Linux/Mac, hence "Unix") or AcceptSecurityContext (SSPI on Windows) to validate the token. Existing code could already fetch a token for an arbitrary username and password using the Windows APIs, which made that side easier; but GSSAPI didn't support it directly, so on Linux/Mac only "single sign-on" using credentials from a cache (set via kinit) would be supported. The accept code, however, would accept any valid token, if we could figure out a way to generate one.

Another dev made a first attempt at generating a token, going to the MIT Kerberos library (libkrb5) directly since GSSAPI didn't expose usable APIs. He did get Kerberos to cough up a token using krb5_get_init_creds_password (and the obvious supporting functions to create a context, parse the server name, etc.), but the server, using the API functions mentioned above, didn't accept it: Windows servers said SEC_E_INSUFFICIENT_MEMORY. He passed the problem on to me, and I got the same error. I made sure my Linux client machine was joined to the domain (using Centrify's tools); same result. I played around with different Kerberos calls, e.g., setting up a credentials context with krb5_init_creds_init and adding the server principal and password separately; no dice.

Eventually I asked if there were Linux servers among the servers configured for Kerberos logins, and it turned out there were. They coughed up a slightly different error, saying I had an invalid token. Well, that made more sense than E_INSUFFICIENT_MEMORY and might allow for some progress. I set up a stub program on the same machine as the server (wanting to vary as little as possible), using the same token validation code, and unsurprisingly it gave the same error; but it was a lot easier for me to build and link it with a debug version of libkrb5 that I could step into with gdb (our server's watchdog also likes to reboot the machine if one halts in the debugger too long; that can be disabled, but it's an obstacle). Being able to step into libkrb5, I could see it was rejecting the token because it didn't begin with the magic value 0x60, and, by dint of grepping around the source, I tracked down a function that could properly encapsulate the token with the required RFC 4121 framing: gss_encapsulate_token. This got me further, but unfortunately gss_encapsulate_token purposely left off a field called the "token type", which was necessary, so I had to write my own version that passed the correct type to the internal wrapper, g_make_token_header, to make another internal function, g_verify_token_header, happy.

That got me further, but a function that decoded the payload, which was supposed to be ASN.1 DER formatted, threw back an error, since my payload was the raw Kerberos credentials. I poked around the source some more, and found something that looked like it might wrap things up properly, and even a relevant-seeming public interface: krb5_mk_req_extended. I still had some #if'd branched versions of earlier attempts; but it turned out that the krb5_init_creds_init version, with krb5_init_creds_set_password and krb5_init_creds_set_service, was the ticket, so to speak. I pared the code down to what was absolutely required, removing unnecessary calls and experimental branches, and passed it back.

I'll also be integrating it into the shared authentication code, now that the "magic spell" is known. It was certainly a great help to have the Kerberos source, given that the (protocol) documentation is so sparse and generally unhelpful. Basically, it was a framing issue: the raw Kerberos token needed to be wrapped in an appropriate ASN.1 description and then that in a GSSAPI header. That perhaps explains why the Windows implementation complained of insufficient memory: interpreting the credentials as a GSSAPI token might have meant interpreting some credentials data as length fields, which may have contained invalid data making the length appear impossibly long (hopefully SSPI just rejected it based on the token length rather than trying to actually allocate memory).
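For the curious, here's roughly what that framing looks like, sketched in Python since it's just byte-shuffling. The mechanism OID and token-type constants are from RFC 1964/4121; treat the helper itself as an illustration, not the code that actually shipped.

# Sketch of RFC 2743 InitialContextToken framing around a raw Kerberos AP-REQ.
KRB5_MECH_OID = bytes.fromhex("06092a864886f712010202")  # DER OID 1.2.840.113554.1.2.2
TOK_ID_AP_REQ = b"\x01\x00"                              # KRB_AP_REQ token type

def der_length(n):
    # DER definite-length encoding.
    if n < 0x80:
        return bytes([n])
    out = []
    while n:
        out.insert(0, n & 0xff)
        n >>= 8
    return bytes([0x80 | len(out)]) + bytes(out)

def wrap_gss_token(ap_req):
    # [APPLICATION 0] tag (0x60), DER length, mech OID, token type, raw AP-REQ.
    inner = KRB5_MECH_OID + TOK_ID_AP_REQ + ap_req
    return b"\x60" + der_length(len(inner)) + inner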

Spooky action at a distance Python bug

By David B. Robins tags: Python, Bugs Wednesday, March 12, 2014 21:56 EST (link)

This was a fun one: when used with a new kind of camera, our web service crashed, and only on Linux. Technically, not a new camera, but a new library (provided by the manufacturer) that "de-warps" a fisheye (very wide-angle) lens using software, straightening out lines curved by the lens and providing (in some ways) a more natural view. This is part of a feature known as "digital PTZ" (PTZ = Pan, Tilt, and Zoom), differentiated from physical PTZ which allows for physically moving cameras to alter their viewing angle by sending commands to internal motors.

The Exacq web service (link goes to our demo server) makes use of the Exacq SDK (evAPI), which I am responsible for; and it is the API code that wraps the manufacturer-specific dewarp libraries and invokes them for the appropriate cameras when requested. Thus, the web team asked us to investigate, since the crash happened within an API invocation. It took a bit of time to get set up to debug the web server by attaching to it, but I eventually figured that out. I asked for, and the web team provided, a standalone Python repro, which made investigation easier.

Early on I figured out it had something to do with the Python ctypes built-in library, which makes it possible to invoke functions in C libraries (our API is natively a C library, provided as a DLL or shared object). The Python 2.6 from the Ubuntu "deadsnakes" PPA (my local Python was 2.7) did not repro the bug, nor did 2.7; and it was possible to make the crash go away just by switching out _ctypes.so. So I started looking in later Python 2.6 releases than the one the web server used to see if there was a fix. No dice: with the Python 2.6 versions I built myself, the issue persisted. I tried using strace to see what the working Python did differently, and nm to see if there were any important different imports or exports, but got nowhere.

I went back to the debugger, gdb: when I had first looked at the crash location, it had looked like it was accessing unallocated memory below the current stack (via info proc mappings); but either I had calculated wrongly or miscopied something, because looking later showed the destination of the instruction (movapd %xmm0, 0(%eax)) as valid memory (verified by checking the process memory map, or p *(int *)$eax, although that's just the first 4 bytes of it). I again ran the succeeding and failing cases in separate (GNU screen) windows. I wrote down the addresses being accessed, and a light came on: the SSE XMM registers are large; what are their alignment requirements?

And so it was: the SIGSEGV failure, coming from a processor general protection fault, was indeed an alignment error: the instruction required a 16-byte-aligned memory location, and the destination wasn't aligned. I had even gone to the trouble earlier of creating a Python C extension that called the same function, bypassing ctypes, which succeeded: and when it did, so did the call via ctypes (it fetched the dewarped image), most likely because the first call, judging by the names of the internal functions (we had symbols but not source), initialized the library by populating some tables that only needed to be set up once. Once I knew what to look for, I found the Python bug, which had only been fixed in 2.7 and 3.2+.
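If you want to see the requirement for yourself, a tiny ctypes check (purely illustrative) shows whether a given buffer address satisfies the 16-byte alignment that movaps/movapd demand:

import ctypes

buf = ctypes.create_string_buffer(64)
addr = ctypes.addressof(buf)
# movaps/movapd raise a general protection fault (seen as SIGSEGV on Linux/x86)
# unless the memory operand address is a multiple of 16; a correctly behaving
# ctypes/libffi keeps the stack aligned, which is what the broken builds failed to do.
print(hex(addr), "aligned" if addr % 16 == 0 else "misaligned")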

So, lessons? Check the reason for a segment violation (available in $_siginfo in gdb, although it might not have been helpful here, since alignment errors usually show up as SIGBUS), and double-check diagnoses, like the initial belief that the memory address was bad. Isolate the fault as narrowly as possible (I found it was ctypes fairly quickly).

We resolved to leave this dewarping library (one of several; the others were already in use) out of the Linux release until the web service upgraded to Python 2.7, which was already planned, and possibly to issue a mid-cycle release update (outside the normal quarterly channel) if there is sufficient business justification (read: people willing to give us money for it).

All aboard the D-Bus!

By David B. Robins tags: Development, Python Tuesday, January 28, 2014 19:17 EST (link)

I'm sure the title's been repeated numerous times at this point all across the Interwebs: but this is my site, and I like it, so we're going with it.

A little while ago, I was looking at Celery, an interface to distributed message queues. But after ruminating on it for months and not making much progress, I have to confess that queues are a bad fit for the kind of RPC I was looking to do (COM would be a good fit if I were developing on Windows). I decided to give D-Bus a shot, using Python, since the service I wanted to share is in Python. An actually decent Python DBus Tutorial lived up to its name, and I wrote a server around it that wrapped some functions in a module I already had.

The reason I wanted to run the module as a service was that it cached some data that I didn't want to continually reload (even from a file). So I provided a simple interface (the details aren't terribly important): it takes an approximate name and returns the best match and a related id on success (as a tuple), or None on failure. It uses Levenshtein distance, or rather a ratio (to length), to avoid certain pathological results. To allow for auto-updates, when it is run it looks for an existing object of its kind on the bus and sends it a quit request.
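For anyone who hasn't used dbus-python, the skeleton of such a service looks roughly like this; the bus and object names are invented here, and the real service wraps the caching lookup module rather than this stub:

import dbus
import dbus.service
import dbus.mainloop.glib
from gi.repository import GLib

BUS_NAME = 'com.example.NameMatcher'      # hypothetical names, not the real service's
OBJECT_PATH = '/com/example/NameMatcher'

dbus.mainloop.glib.DBusGMainLoop(set_as_default=True)
loop = GLib.MainLoop()

class NameMatcher(dbus.service.Object):
    def __init__(self, bus):
        dbus.service.Object.__init__(self, dbus.service.BusName(BUS_NAME, bus), OBJECT_PATH)

    @dbus.service.method(BUS_NAME, in_signature='s', out_signature='(si)')
    def Match(self, approximate_name):
        # The real method does a Levenshtein-ratio search over cached data and
        # returns (best match, id); this stub just echoes its input.
        return (approximate_name, 0)

    @dbus.service.method(BUS_NAME)
    def Quit(self):
        # Lets a newly started instance ask the running one to exit (auto-update).
        loop.quit()

if __name__ == '__main__':
    NameMatcher(dbus.SessionBus())
    loop.run()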

To start the local bus for a user, I used a solution (which I can't find now) that stores connection settings (DBUS_SESSION_BUS_ADDRESS and DBUS_SESSION_BUS_PID) in ~/.dbus_settings, written there and sourced by .bashrc. I then hooked up the new service to a frequently run program that usually had to hit the Internet and do a slow and inaccurate search; it worked great, and accuracy and time both improved. That isn't everything I wanted to do, but making decent progress after so long a hiatus at least indicates to me that this technology is a better fit for the problem than the queues rathole.

Burndown chart for Trello

By David B. Robins tags: Business, Tools, Architecture Thursday, December 12, 2013 21:04 EST (link)

I've mentioned previously that we use Trello to manage work items and sprints (we use three-week sprints). Another manager got me to start using Scrum for Trello, a browser plugin that allows for setting estimates and accruing work on task cards; it shows them as card annotations and conveniently totals up estimates and work done for lists and boards. For the current sprint board, our lists are "Backlog" (not yet started, but, of course, estimated), "Developing", "Testing", and "Done".

I found the "Burndown for Trello" app, but really didn't like it; the unpaid version doesn't work with Scrum for Trello, and their help doesn't show any charts, so I gave up on it as, if not bad, then unknown and untestable. Our ESM (Enterprise Server Manager) product shows charts, so I asked what library they used; they use (not wanting to go through the painful process of finding an actively developed Javascript library that did what I wanted if I someone had already evaluated the field) jqPlot, and it looked great; super-flexible and makes beautiful graphs.

Design-wise, I went with the Unix philosophy: small components that each do one thing well. I broke it into:

  • burn, a program to read a Trello board and emit a snapshot of current total estimates and work done,
  • daily_burn, a program to be run via cron at midnight that uses burn to add the day's snapshot to a database,
  • data, a web app returning said database (for AJAX callbacks),
  • graph, a web page that displays the current chart.

Regarding specific technologies, I went with Python (3) for the first two apps, Apache as the web server (already running on that machine, which serves as a rotating build/task status display), Python over WSGI for the web app, and of course graph uses Javascript, CSS, and HTML (5). Besides jqPlot (and jQuery, obviously), I pulled in the date library Moment.js and several jqPlot plugins. (Incidentally, at some point after I had this running, the IT department at the large company that recently acquired us started expanding their web site block list for whatever reason, including blocking a category called "Content Servers", which included CDNs serving up Javascript libraries such as the ones I got jqPlot and Moment from. They unblocked it fairly quickly, for a large company; I know I wasn't the only one affected.)

Data-wise, burn emitted YAML, and daily_burn just added it to a file; it was set to skip weekends and holidays (if someone did work on those days, it would count toward the next day). I used the jqPlot date axis plugin (without it, for some reason, all dates piled up on top of each other, maybe because numerically "Wed Nov 27" evaluates to 0?). The data app just converted the file to JSON (conversion was only necessary because timestamps were stored; the two formats are otherwise compatible for basic structures). An AJAX callback fetched an array of data, and a few simple loops created points for jqPlot.
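The data app really is only a few lines; a sketch along these lines (the file path is a placeholder), with json.dumps turning the stored timestamps into strings:

import json
import yaml

DATA_FILE = '/var/lib/burndown/snapshots.yaml'  # hypothetical path

def application(environ, start_response):
    # Load the daily snapshots written by daily_burn and hand them to the
    # AJAX caller as JSON; datetime values are stringified via default=str.
    with open(DATA_FILE) as f:
        snapshots = yaml.safe_load(f) or []
    body = json.dumps(snapshots, default=str).encode('utf-8')
    start_response('200 OK', [('Content-Type', 'application/json'),
                              ('Content-Length', str(len(body)))])
    return [body]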

I did make a few changes after I had it working for my team so that a neighboring manager could use it (and maybe others later, since Trello use is spreading): I moved board and sprint (date, length) information to a configuration file, and allowed for a parameter to refer to different configuration and data files. This had always been designed to be possible, so it was just a matter of removing some shortcuts put in during development.

Cameras in 3D with Cinder

By David B. Robins tags: C++ Friday, November 15, 2013 18:40 EST (link)

In his "One C++" talk for GoingNative 2013, Herb Sutter challenged viewers to take a little time and produce a small application using the Cinder C++ graphics library.

I decided to combine libcinder with the exacqVision API to make an app for our weekly Friday demos (food is served, and non-engineers are encouraged to, and do, attend). At some point, I'll also add it as a downloadable sample for our customers. I started with the "Picking3D" sample included with Cinder, which features a duck and allows "picking" a point on its surface, and played around with it until it was drawing what I wanted and I had re-accustomed myself to OpenGL and manipulations in 3-space (it's been a while). One thing that was convenient with the sample was that it showed me how to use an included class that allowed manipulating the camera view through the UI by dragging.

My first idea was a "control room": a rectangular room where a hypothetical security guard might sit, looking at a bank of monitors. Except instead of monitors, the walls themselves are screens showing various camera views. After selecting the coordinates of the room (arbitrary, so I picked some convenient +/- multiples of 100), the main trick was to figure out how to map camera images to said "walls". As I figured, they would have to be textures; and our code already knows how to stretch an image to a given size while maintaining the aspect ratio. I decided on two threads: one for the connection to the exacqVision server with the cameras, and another for the UI, as is fairly typical. They shared data guarded by a std::mutex (fortunately, someone had just presented on C++11 concurrency during our developer lunch talks). I expect if I had had YUV images at that point (they were not added to the API until later), I would have used them, as they are supposed to be more efficient for graphics hardware. However, with images on the left, right, and back walls and the ceiling (probably 30 fps, but it was allowed to drop frames), speed was acceptable on a VM.

After getting basic display working, I changed the camera textures to fill the frame rather than keep "black bars", cropping whatever didn't fit, since this looked better. I also added controls to move the camera view forward as well as the existing dragging to rotate the viewpoint. To make sure my demo would go smoothly, I took the program, statically linked, and a copy of the API and ran it on the demo machine, a new small footprint box in the conference room; it ran fine, about the same as on my VM (both Windows 7). Always run your demo on the precise setup you'll be using at go-time, or if you can't, as close to it as you can get! It was pretty cool seeing people walking around on the mapped camera views, or the one view of the parking lot that automatically panned back and forth (PTZ presets).

At this point, I thought up a variation: rather than a box, showing virtual "monitors" spaced around the inside of a sphere. I forked the existing code, since I had some handy innovations such as an object wrapper around the API handle (RAII and allowing for an overridable callback method), and the background worker thread. I started by placing textured rectangles around the sphere, being careful with the transformation order (rotation and translation can go horribly wrong if the matrix multiplication order is mixed up, or rotation about axes is done incorrectly). I added a simple wireframe box spaced a little away from each textured rectangle to make it look more like a monitor (I had originally planned to use an image of a physical monitor, but decided that it didn't add much and would be a bit messy given that it would have to be cut into four rectangles around the outside of each camera feed). I also drew axis circles to show the bounds of the sphere. I set keys to toggle a profusion of monitors rather than the few shown by default, and to set the view to rotate continuously about the y-axis. It was significantly slower with many monitors, of course; I wasn't doing anything clever about skipping texture updates for monitors not in the view. I also added more navigation keys to "walk" around the scene.

The demo went pretty well; it wasn't new functionality, of course, as is more usually shown, but there were some semi-serious comments about showing it in our booth at the various shows we attend. There may be a corner for such a display if it could be dusted up a bit, made generally interactive, and given an exacq logo or two, and if it can be connected with selling, or at least enhancing awareness of, our API and how it can be incorporated into any desired UI or integrated with a company's products.

Two web tools: FSReviews and Bugzilla Improved!

By David B. Robins tags: Automation, Tools, Meta, Architecture Friday, July 19, 2013 20:52 EST (link)

A couple of the internal tools I developed while at Freedom Scientific were web-based; I ran an Apache/mod_wsgi (Python 3) server on my Windows development box. One was a code review tool, which I called FSReview: code reviews were done in an ad hoc fashion over email, but those emails tended to get lost, especially when people were busy. This tool allowed one to select a project, and from its home page, create a review from a Perforce changelist (which was auto-populated through a dropdown, showing the last several changes made by the authenticated domain user, with descriptions, or a value could be typed in manually). The description would auto-populate in a text field; a note could be added, and then when the user clicked the Add button, the review would be added to a list of reviews.

The main screen, beside the Add Review button, had two jQuery accordion widgets: on the left, changes waiting for review, and on the right, changes being reviewed. Expanding an item showed more details; the header had the change number, a short description, date, and author. Anyone but the author could click Review on an item on the left, and it would move to the right side; from there, a review could be abandoned (returned to the review list for someone else to review), or completed (where it disappeared, although was still tracked in the database—Postgres—and could have sent a report, etc.).

Since the method of review had not changed (still email, except that requests could be added to the web page instead), there was a button on the open review that popped open an email addressed to the author, CC'ing a manager, with an appropriate subject and the description inserted at the end of the body. It was a fairly neat hack, and a chance to learn jQuery. Although I had no power to require it, my co-workers began using the tool due to its convenience (until eventually told not to by management; I think they wanted to shell out big bucks for something, although they hadn't chosen anything when I left). As with email, a rather large backlog grew; but now it was obvious, rather than buried in email. Since we committed before reviewing (which was different from Microsoft, where review was required first), I wasn't blocked; but it did become an interesting question to ask my manager: was it important to get everything reviewed (in which case, why wasn't he asking people to get on it?), or not worth bothering with (which belied the claimed importance of reviews, as did his not getting his own changes reviewed, resulting in a few subtle and nasty bugs)?

Another useful, but fairly trivial, thing the tool did was to convert text of the form "bug #n" into a link to Bugzilla, which was convenient for associating the bug with a fix. Sometimes the easy things—a small regular expression!—have great yields (and of course, the corollary is that the work on the 90% of the "iceberg" below the water often goes unseen).
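That feature boils down to one substitution, roughly (with a made-up Bugzilla host):

import re

BUGZILLA_URL = 'http://bugzilla.example.com/show_bug.cgi?id='  # placeholder host

def link_bugs(text):
    # Turn "bug #1234" into a link to the corresponding Bugzilla entry.
    return re.sub(r'bug #(\d+)',
                  r'<a href="%s\1">bug #\1</a>' % BUGZILLA_URL,
                  text, flags=re.IGNORECASE)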

The second tool I dubbed "Bugzilla Improved!"; it worked as a proxy for the local Bugzilla server. I wanted the "status whiteboard" field to be turned on so I could note the status of a bug, e.g., waiting for a tester (at Microsoft, we would just assign the bug to the tester and they would assign it back when they were done with whatever had been asked, but that was not policy at FS). However, our Bugzilla had some local incompatible changes that made it impossible to turn on (at least, for about a year), so I resolved to create my own version, storing the information in a local database.

To be able to tell at a glance that I was seeing my version with enhancements, I added a green "Improved!" after the Bugzilla title. I made these page changes using XSLT (which is very handy in the right circumstances). The other changes were of course functional: adding the "Status whiteboard"-type field's title and control to the list, etc. Data was extracted from or inserted into the local database when the appropriate accesses or submissions were made, tracked by bug number as the primary key.

Eventually, as I noted, the conflict with our local changes was resolved and the built-in "status whiteboard" was turned on. I had not, at this point, completed the work to search my "phantom field", although I did have a design sketched out; it may not have worked well with boolean operators, though.

Burned by an ancient compiler

By David B. Robins tags: C++, Testing, Compiler Sunday, July 14, 2013 18:15 EST (link)

Where I work we're using Microsoft Visual C++ ".NET" 2003 (aka version 7.1, or compiler version 13) (and equally old compilers to build on Mac or Linux, but that's another story). We recently got burned by a backwards compatibility "feature" in VC++ 7: by default, the scope of variables declared in for loop initializers extended beyond the end of the loop. For example, in:

for (FOO_LIST::const_iterator ifoo = list.begin(); ifoo != list.end(); ++ifoo)
{
    // ...
}
assert(ifoo == list.end());

The ifoo variable was accessible after the closing brace, and the assertion would not only compile, but succeed.

Now, imagine a slightly different version of the above loop:

for (FOO_LIST::const_iterator ifoo = list.begin(); ifoo != list.end(); ++ifoo);
{
    // ...
}
assert(ifoo == list.end());

Did you catch it? It's a not-uncommon error: there's a semicolon at the end of the for loop line, turning it into a loop with an empty statement, which will leave ifoo pointing at list.end(). The braces then enclose a new scope, rather than the body of the for, in which ifoo was expected to point at a valid list element; instead it points past the end of the list.

Although the error is common, it's difficult to see it when it's also part of code that should have failed to compile. Since VC++ 7 accepts a language that's not quite C++, it throws off an investigation. Fortunately, there is a way to enable standard-compliant for scoping: /Zc:forscope (or, in project Properties | C/C++ | Language, set "Force Conformance in For Loop Scope" to "yes"). In VC++2005 and later, this option is on by default. We are planning to upgrade to 2010 (not 2012; "too new"); but until then, we'll enable the standard scoping option.

Using the Trello API

By David B. Robins tags: Automation, Python, Tools Sunday, June 30, 2013 17:45 EST (link)

I'm using a couple of Trello boards to manage team projects and tasks at work. The main board uses the default three lists: To do, Doing, and Done. Done was renamed to "Done (this month)", so it wouldn't get unmanageably long, and the second board has lists of finished items by month. I also have a machine (an unused one from home) set up as a dashboard in my window that flips between the main page, the done page, and a build tool (Jenkins) status display.

I had read a little about Trello's API when I was learning about it generally. Interestingly, they are working on moving the web site to be a consumer of their API much like we are planning to with our API and GUI client (the web client already uses it). It is a simple yet featureful RESTful API, so it was fairly easy to write up a Python program that, given a list ID (found using one of the sample URLs that dumped out all of "my" cards), dumped out the cards in it, using the GET /1/lists/idList/cards request. From there, moving a card was easy, with one minor hitch: PUT /1/cards/idCard/idList (passing the destination list through the JSON payload) did not work, possibly because the list ID is board-local, although it worked fine without a board for getting cards. Instead, I had to use the version that explicitly moved to a new board: PUT /1/cards/idCard/idBoard, passing board and list IDs.

There are at this point three Python interface wrappers, but none of them installed and worked (with pip) for Python 3. I did take a look at the code of one to try to resolve an issue with my own module. To be honest, there's not a lot of wrapping needed once you have a function that can do a web request (urllib.request) and JSON conversion (json). I did bump to Python 3.3 so that Request.method was available.
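For reference, the kind of thin wrapper I mean might look like the following sketch (key, token, and IDs are placeholders), using the endpoints named above:

import json
import urllib.parse
import urllib.request

API = 'https://api.trello.com/1'
AUTH = {'key': 'YOUR_KEY', 'token': 'YOUR_TOKEN'}  # placeholders

def trello(method, path, **params):
    # Make a Trello API request and return the decoded JSON response.
    params.update(AUTH)
    url = '%s%s?%s' % (API, path, urllib.parse.urlencode(params))
    req = urllib.request.Request(url, method=method)  # method= needs Python 3.3+
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode('utf-8'))

# List the cards in a list, then move the first one to another board/list.
cards = trello('GET', '/lists/%s/cards' % 'LIST_ID')
if cards:
    trello('PUT', '/cards/%s/idBoard' % cards[0]['id'],
           value='TARGET_BOARD_ID', idList='TARGET_LIST_ID')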

Still, it was a relatively painless experience; I'll make a few changes to the program to create the target list if not found and move based on the current date, so that it can be run at midnight on the first via a cron job. I could archive the finished cards, but I prefer the organization of having them on another board by month. Plus it's good to be able to look at team accomplishments laid out over time.

Generating a C# module from a C header

By David B. Robins tags: Python, Tools Saturday, June 22, 2013 16:57 EST (link)

It's been a while since I've made an entry here; chalk it up to moving from Florida to Indiana for a new lead engineer position, responsible for API and integrations for the company's video management system (API). It's a great new position, but it and general moving errands (we got our Indiana driver's licenses today) have kept me busy.

We release the API and a collection of samples with a C header file and a C# module definition file (which does almost no wrapping at this time: just straight DLL imports and translation of structs and enums). I almost forgot to add some new sensor-related APIs to the C# module, and that led me to think that we really should be generating the C# module from the primary source, the C header file, or both from a single source, such as a YAML or XML definition file. Since the C header file is very regular, I decided to walk through it and emit the C# module.

The general architecture of the translation program, written in Python, was a state machine that translated as it went. Since the C header file is expected to be regular, that is, limited to certain types and forms, I freely bail with an exception on anything not understood. States are defined using the "automatic enumeration" answer from this question about enums in Python, as follows:

State = enum('NONE', 'IFDEF', 'COMMENT', 'ENUM', 'ENUM_ANON', 'STRUCT', 'FUNCTION')
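For reference, the helper from that answer is roughly the following (this is the widely cited Stack Overflow version; the one actually used may differ in detail):

def enum(*sequential, **named):
    # Build a class whose attributes map each name to a small integer.
    enums = dict(zip(sequential, range(len(sequential))), **named)
    return type('Enum', (), enums)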

#ifdefs, except the header guard (which should be replaced with #pragma once anyway), are skipped; we add our own static usings appropriate to C#. The old static C# file groups the module into structs and enums, followed by (public static) functions grouped into a class Library, whereas the C header groups everything by topic (e.g., sensors, targets, recording); so the program needs to emit two output streams which will later be joined (or store up output in a string, which I prefer not to do). Each function has a preceding comment, which needs to stay with it, so comment output is "pending" (basically a one-level buffer) until the next item is determined. This allows a function's comment to be emitted with it to the function stream. (Since we have to support Visual Studio 2003 for now, partial classes are not an option; anyway, closing and reopening the class that many times is ugly.)

Anonymous enums are a special state because their first item is used to provide a name for the C# enum; then the state transitions to the regular ENUM state. These anonymous enums should be given names in the C header anyway; it won't change the scope of their values, so it would be a backward-compatible change. Unfortunately, for C and C++03 compatibility we can't use enum class.

The STRUCT and FUNCTION states have to parse out the members and return value/arguments respectively so they can be mapped to C# types; since the types used are limited, a Python dictionary suffices; the only trick is to space-separate * and [] consistently.
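A sketch of the kind of mapping involved (the entries here are illustrative, not the full table):

# Illustrative subset of the C-to-C# type map; the real table covers exactly
# the types the header uses and bails on anything else.
TYPE_MAP = {
    'int': 'int',
    'unsigned int': 'uint',
    'char *': 'string',
    'void *': 'IntPtr',
    'int *': 'ref int',
}

def map_type(c_type):
    # Space-separate '*' and '[]' consistently before looking up.
    normalized = ' '.join(c_type.replace('*', ' * ').replace('[]', ' [] ').split())
    return TYPE_MAP[normalized]  # a KeyError means "not understood": bail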

Generating the C# module like this—for now manually, soon as part of the build—adheres to the DRY (Don't Repeat Yourself) principle, making the C header the "Single Point of Truth" for the interface definition. It also makes for one less location for error-prone manual translation and update.

Rosetta: plug-in file format architecture

By David B. Robins tags: C++, Development, Architecture Friday, April 12, 2013 10:22 EST (link)

The last project I worked on at Freedom Scientific was a proof of concept to add support for external file formats. It began with writing up a design document; I looked into a library called Aspose.Words for .NET, and started playing around with it; it seemed to do exactly what we needed, in that it provided open and save for Microsoft OpenXML documents (.docx), EPUB, HTML, RTF, and more; it also let us dispense with another library (Inso) that only supported a few formats and was becoming a bit ossified.

The requirements were that the plug-in formats be completely removable and only add dependencies (e.g., to the aforementioned Aspose.Words) when they were actually being used. These would be used by OpenBook and WYNN, which were built around the same code base. Architecturally, it made sense to have plug-in DLLs with a well-known entry point (Rosetta_RegisterPlugins) which would be given a registration interface that it could call and pass in an interface (Rosetta::IFileFormat) for each file format supported. This interface supported a few methods (following my minimalist design):

  • fetch the file format description and extension;
  • whether open was supported, and open method;
  • whether save was supported, and save method.

An existing "parser" interface was leveraged (use what you have), and became the base for Rosetta::IBuilder, adding on page navigation; pattern-wise, it was a builder anyway. This interface was passed to the open method and allowed for building a native document. For example, with Aspose.Words, their DocumentVisitor interface was implemented and became essentially a straight translation, although WYNN supports only a very small subset of the formatting of the various import formats (Aspose.Words seems to store documents internally much like Word documents, or at least preserves a Word-compatible interface).

For save, the same IBuilder interface was used, except this time, the plugin file format object implements it rather than uses it, and passes its implementation to a Rosetta::ITraverse interface passed to the save method. Internally, WYNN visits all pages and elements in each page and invokes the appropriate IBuilder methods, allowing the implementation to appropriately output the content in its own way. For Aspose.Words, its DocumentBuilder was used to build up a document.

Originally, the plan was to use COM interop to talk to Aspose.Words (a .NET library); but that was extremely difficult due to the need to pass in .NET streams and use constructors with no equivalent on the COM side. Eventually I tried out C++/CLI, and was pleasantly surprised that, even in ancient Visual Studio 2005, it "just worked" in ways that few technologies do.

At first I wrote a test plugin, supporting .test files that were just text files with a header and some formatting to allow for multiple pages. This allowed me to work through any possible issues and then move to the Aspose.Words plugin. As mentioned, I used their DocumentVisitor and DocumentBuilder, and a single implementation worked for all the formats needed, only changing the IFileFormat class to have appropriate extensions and descriptions and Aspose save/open enumeration values for the file type.

Only EPUB needed extra work: Aspose does not support opening EPUB files, just saving; so I wrote some glue code making use of the Xerces XML parser (and a wrapper I had already written for our Notecards file format, which could be used almost exactly as is since it was sufficiently general) to read the "spine" and then used Aspose to open the XHTML content files and build up the native document from them with IBuilder as usual.

Plugin discovery works simply: we check a Rosetta folder in the same directory as the main executable for *.dll, open matching files, and try to register them. Plugins are only unloaded when the application exits; we do not dynamically unload, although that could be an option for the future if it were necessary. Registered IFileFormat interfaces are queried for the extensions they support; each is assigned a dynamic file type and added to a map for future reference by the functions that map extensions to file types. There is a function that creates IIO interfaces from an extension; an ExternIO class was created that provides the last piece in the puzzle, bridging from WYNN's file I/O to IFileFormat.


Content on this site is licensed under a Creative Commons Attribution 3.0 License and is copyrighted by the authors.