The second piece of the PGXN infrastructure, after [PGXN Manager](http://blog.pgxn.org/post/4854707157/pgxn-manager "About the Infrastructure: PGXN Manager"), is the [PGXN API Server](http://api.pgxn.org/). I've just finished the [API documentation](https://github.com/pgxn/pgxn-api/wiki), which covers both the lightweight static file API provided by mirrors and the superset provided by the API server. So now seems like a good time to talk about the design of the API server and how it works.

At its core, the PGXN API server is just another mirror. It has an hourly cron job that `rsync`s with the master mirror, updating the local mirror. But then it iterates over the `rsync` log and transforms some things. Here's what it does:

* Unpacks each distribution in a directory for browsing. Here, for example, is where one can [browse the semver 0.2.1](http://api.pgxn.org/src/semver/semver-0.2.1/) sources.
* Searches for a `README` file and any files recognized by [Text::Markup](http://search.cpan.org/perldoc?Text::Markup) and converts them to sanitized HTML with a table of contents. Such files can then be used to display the `README` on the [distribution page](http://pgxn.org/dist/semver/ "semver distribution page") and to display [individual documentation files](http://pgxn.org/dist/semver/doc/semver.html "semver documentation").
* Merges the distribution metadata file with the latest stable release `META.json` generated by PGXN Manager. For example, as of this writing, the API server's [semver 0.2.1 `META.json`](http://api.pgxn.org/dist/semver/0.2.1/META.json) and the unversioned [semver.json](http://api.pgxn.org/dist/semver.json) are identical. Effectively, this format has all the metadata from the `META.json` as well as a list of all releases of the distribution from the `semver.json`. This is useful for displaying all the data on the [distribution page](http://pgxn.org/dist/semver/ "semver distribution page") with a single API request.
* Updates all other versions of the `META.json` file. For example, if you look at the [semver 0.2.0 `META.json`](http://api.pgxn.org/dist/semver/0.2.0/META.json), you'll see that it includes 0.2.1 in its list of releases, even though 0.2.1 was released after 0.2.0. This allows the [semver 0.2.0](http://pgxn.org/dist/semver/0.2.0/) page on the main site to have a select list of versions to choose from, including versions released later, with a single API request.
* Adds additional metadata to the extension JSON file for all extensions in the distribution. The added data includes release dates for the list of all distributions providing the extension, as well as an abstract and doc path for the latest stable release. To see the differences, compare the [mirror `semver.json`](http://api.pgxn.org/mirror/extension/semver.json) to the [API `semver.json`](http://api.pgxn.org/extension/semver.json).
* Adds an abstract for each distribution listed in the user's JSON file and all tag JSON files. Compare, for example, the [mirror `theory.json`](http://api.pgxn.org/mirror/user/theory.json) to the [API `theory.json`](http://api.pgxn.org/user/theory.json) and the [mirror `data types.json`](http://api.pgxn.org/mirror/tag/data%20types.json) to the [API `data types.json`](http://api.pgxn.org/tag/data%20types.json). This allows the [user page](http://pgxn.org/user/theory) and [tag pages](http://pgxn.org/tag/data%20types/) to include the abstract in the list of distributions released by the user or associated with a tag.
* Adds records to a [Lucy](http://incubator.apache.org/lucy/)-powered full text search index.

All of this merging stuff came out of my thinking following the discussion of the [PGXN API RFC](http://blog.pgxn.org/post/3099288750/pgxn-api-rfc).
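To make the merging idea a bit more concrete, here's a rough Python sketch of the metadata steps described in the list above. This isn't the API server's actual code. The function names, the sample data, and the exact shape of the `releases` structure are all assumptions made for illustration.

```python
import copy


def merge_dist_meta(meta, dist_index):
    """Return a copy of one release's META.json data with the release list added.

    `meta` stands in for a parsed META.json; `dist_index` stands in for the
    parsed per-distribution file (e.g. semver.json). Both shapes are assumed.
    """
    merged = copy.deepcopy(meta)
    merged["releases"] = copy.deepcopy(dist_index.get("releases", {}))
    return merged


def merge_all_versions(metas_by_version, dist_index):
    """Give every version's metadata the same, complete release list."""
    return {
        version: merge_dist_meta(meta, dist_index)
        for version, meta in metas_by_version.items()
    }


# Hypothetical sample data, not copied from the real mirror files.
meta_020 = {"name": "semver", "version": "0.2.0"}
meta_021 = {"name": "semver", "version": "0.2.1"}
dist_index = {
    "name": "semver",
    "releases": {"stable": [{"version": "0.2.1"}, {"version": "0.2.0"}]},
}

merged = merge_all_versions({"0.2.0": meta_020, "0.2.1": meta_021}, dist_index)

# The 0.2.0 metadata now lists 0.2.1 too, even though 0.2.1 came later.
listed = [r["version"] for r in merged["0.2.0"]["releases"]["stable"]]
print(listed)
```

The point of doing this at sync time rather than request time is that the results can be written back out as plain JSON files, so serving a distribution page needs only one static file.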
The decision to use [Lucy](http://incubator.apache.org/lucy/) instead of PostgreSQL's [full-text search](http://www.postgresql.org/docs/current/static/textsearch.html) followed rather naturally from this merging approach, as I quickly realized that there was no other driving need for a relational database behind the API at all. The *only* dynamic API is the [search API](https://github.com/pgxn/pgxn-api/wiki/search-api). Everything else is just static files. The [performance issues](http://www.depesz.com/index.php/2010/10/17/why-im-not-fan-of-tsearch-2/) of in-database search, as well as the desire to have fewer outside dependencies, made the decision a natural one.

Beyond the syncing, there is a very simple web server providing the HTTP REST interface to the static JSON files and the full-text search. That's it, really. The API server is really just another mirror on steroids. The nice thing is that it allows a client, such as [WWW::PGXN](http://search.cpan.org/perldoc?WWW::PGXN) or the new [PGXN client](http://blog.pgxn.org/post/5026314153/writing-a-client-for-pgxn), to work with either a plain mirror or the API server, just failing gracefully when API server APIs are unavailable.

If you want to learn more about the specifics of the REST API, the [API documentation](https://github.com/pgxn/pgxn-api/wiki) has *all* the details. Really, it's quite comprehensive!

I actually consider the API to be 1.0-complete at this point, unlike PGXN Manager. The only thing I want to add is [JSONP](http://en.wikipedia.org/wiki/JSONP) support for static JSON files (right now it's only for search results). I might tweak a few things here and there, but otherwise I think it's in pretty good shape.

Longer term, though, it might be worthwhile to add some other features to enhance the value of PGXN overall. Some ideas:

* Distribution and/or extension ratings (reviews, Like/Dislike, stars, or something).
* Diffs to compare changes between versions.
* A test reporting infrastructure with result matrices (à la [CPAN Testers](http://www.cpantesters.org/)).

But I think we need to build up some momentum on the foundation that's in place. Have you submitted your extensions yet?