Metalinks – Chrome Extension

Posted: April 7, 2012 in GSoC

It may come as a surprise that Google Chrome, unlike Firefox, has no extensions that support metalink downloads. This could be attributed to the coarser-grained control that the Chromium extensions API offers in comparison with Firefox’s. However, recent additions to the Chromium extensions API do provide most of the capabilities needed to support metalink-based downloads. So, here is my post on how I developed a Chrome extension in a couple of hours.

Let’s start by verifying the requirements for a simple metalink downloader.

  1. Ability to distinguish metalinks from other links.
  2. Ability to save contents to a file in the system.
  3. Ability to parse the metalink file.

Looking at the requirements, it doesn’t look too bad. Let’s handle them one by one. Firstly, parsing the metalink file is the easiest part, as you have DOM and XSLT at your disposal. As for saving the downloaded file to the system, luckily I was able to find Chrome’s experimental downloads API. The method to use here is chrome.experimental.downloads.download, which takes as parameters the file that needs to be downloaded and the output file name. For a detailed understanding of the various methods that the downloads API supports, do take a look at http://code.google.com/chrome/extensions/trunk/experimental.downloads.html.

Lastly, differentiating metalinks from other links should be easy, considering that metalinks have a .metalink file extension. So, if the Chrome extension has access to all the content in the current page, it can easily filter the metalinks out from the other links.

Combining these ideas, I developed the first version of my metalink downloader Chrome extension. It can be seen in action here.

In essence, the extension does the following:

  1. Detects all metalinks in the current tab, filtered based on the <a> tag and the .metalink extension.
  2. Based on the user’s selections, downloads the metalink file using XMLHttpRequest.
  3. Parses the response from the XHR to get the file to be downloaded and its sources.
  4. Downloads the file from the most preferred source and saves it.
  5. In case of errors, re-downloads the file.

Once I had a basic version working, I wanted to add more features suggested by the Metalink founder, Anthony Bryan. So, I tried adding the ability to checksum the entire file and check for errors. In case of errors, the file gets re-downloaded from the next preferred mirror. The latest version of the extension can be downloaded from here: http://bit.ly/chrome-metalink-downloader. If you find bugs in the code or have problems adding the extension to the Chrome browser, do let me know.

I’ll try to do my best to get back to you.

Hmm.. To understand how metalinks work in detail, I wrote a Java program that does the following:

  1. Takes as input a link to the metalink.
  2. Downloads the metalink file and extracts useful information from it.
  3. Selects the most preferred mirror.
  4. Downloads the file pointed to by the metalink from the selected source.
  5. Checks for errors in the file.
  6. In case of errors, selects the next preferred mirror.
  7. Goes back to step 4 in case of errors. Otherwise, saves the file and quits.
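
The loop in steps 3 to 7 can be sketched roughly as below. This is only an illustration, not the code from my program: the Mirror class, the Fetcher hook and the isIntact check are hypothetical names made up for the sketch, and it assumes that a higher preference value means a more preferred mirror.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class MirrorFallbackSketch {

    // Hypothetical holder for one source listed in the metalink.
    static class Mirror {
        final String url;
        final int preference; // assumption: higher value = more preferred

        Mirror(String url, int preference) {
            this.url = url;
            this.preference = preference;
        }
    }

    // Hypothetical download hook; a real program would open the URL here.
    interface Fetcher {
        byte[] fetch(String url) throws Exception;
    }

    // Tries the mirrors from most to least preferred and returns the first
    // copy that passes the integrity check.
    static byte[] download(List<Mirror> mirrors, Fetcher fetcher, Predicate<byte[]> isIntact) {
        List<Mirror> ordered = new ArrayList<>(mirrors);
        ordered.sort((a, b) -> Integer.compare(b.preference, a.preference));
        for (Mirror mirror : ordered) {
            try {
                byte[] data = fetcher.fetch(mirror.url);
                if (isIntact.test(data)) {
                    return data; // step 7: no errors, so keep this copy
                }
                // Corrupted copy: fall through to the next preferred mirror.
            } catch (Exception e) {
                // An unreachable mirror is treated like a corrupted download.
            }
        }
        throw new IllegalStateException("no mirror delivered an intact copy of the file");
    }

    public static void main(String[] args) {
        List<Mirror> mirrors = Arrays.asList(
                new Mirror("http://mirror-a.example/file.iso", 100),
                new Mirror("http://mirror-b.example/file.iso", 90));
        // Fake fetcher: pretend the most preferred mirror serves a damaged file.
        Fetcher fake = url -> (url.contains("mirror-a") ? "damaged" : "intact").getBytes();
        byte[] data = download(mirrors, fake, bytes -> new String(bytes).equals("intact"));
        System.out.println("Downloaded " + data.length + " bytes");
    }
}
```

Keeping the fetching and the integrity check behind small hooks means the fallback logic can be tried out without touching the network.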

The metalink file has a lot of information including

  • File name
  • Description
  • File Size
  • Checksum of the entire file
  • Checksum of individual pieces along with piece parameters such as size (in bytes)
  • A list of sources, along with their types (such as ftp, http, torrent) that host the file
  • A preference value for each source, which helps in selecting the mirrors.
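
To make the list concrete, here is a minimal sketch of pulling these fields out with Java’s built-in SAX parser. The element and attribute names (file, size, hash, pieces, url, preference) are modelled loosely on the Metalink 3.0 layout and should be treated as assumptions, and the MetalinkInfo and Source classes are hypothetical holders invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class MetalinkSaxSketch {

    // Hypothetical holder for one <url> entry.
    static class Source {
        final String type;    // e.g. "http", "ftp", "bittorrent"
        final int preference; // assumption: higher value = more preferred
        final String url;

        Source(String type, int preference, String url) {
            this.type = type;
            this.preference = preference;
            this.url = url;
        }
    }

    // Hypothetical holder for the fields listed above.
    static class MetalinkInfo {
        String fileName;
        long size;
        String hashType;
        String hash;
        int pieceLength;
        final List<String> pieceHashes = new ArrayList<>();
        final List<Source> sources = new ArrayList<>();
    }

    static MetalinkInfo parse(String xml) {
        final MetalinkInfo info = new MetalinkInfo();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();
            private boolean inPieces;
            private String urlType;
            private int urlPreference;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes a) {
                text.setLength(0); // start collecting this element's text
                switch (qName) {
                    case "file":
                        info.fileName = a.getValue("name");
                        break;
                    case "pieces":
                        inPieces = true;
                        info.pieceLength = Integer.parseInt(a.getValue("length"));
                        break;
                    case "hash":
                        if (!inPieces) info.hashType = a.getValue("type");
                        break;
                    case "url":
                        urlType = a.getValue("type");
                        String p = a.getValue("preference");
                        urlPreference = (p == null) ? 0 : Integer.parseInt(p);
                        break;
                    default:
                        break;
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length); // may arrive in several chunks
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                String value = text.toString().trim();
                switch (qName) {
                    case "size":
                        info.size = Long.parseLong(value);
                        break;
                    case "pieces":
                        inPieces = false;
                        break;
                    case "hash":
                        if (inPieces) info.pieceHashes.add(value); else info.hash = value;
                        break;
                    case "url":
                        info.sources.add(new Source(urlType, urlPreference, value));
                        break;
                    default:
                        break;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the metalink", e);
        }
        return info;
    }

    public static void main(String[] args) {
        String xml = "<metalink version=\"3.0\"><files><file name=\"demo.iso\"><size>10</size>"
                + "<resources><url type=\"http\" preference=\"100\">http://a.example/demo.iso</url>"
                + "</resources></file></files></metalink>";
        MetalinkInfo info = parse(xml);
        System.out.println(info.fileName + ", " + info.size + " bytes, " + info.sources.size() + " source(s)");
    }
}
```

The sketch handles a single file entry only; a metalink that describes several files would need one holder per file.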

Once the program extracts all this information using a parser such as a SAX parser, all we need to do is use the information obtained.

  1. Download the file from a mirror specified in the list of sources.
  2. Now for checksumming, use the MessageDigest class of Java.
  3. There are two things that can be done. Either checksum the entire file and see if the computed checksum matches the one specified in the metalink, or checksum the individual pieces and check if they match the piece information.
  4. If a particular piece is corrupted, you could download only that piece from another source, or you could download the entire file from another mirror.
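
Both checks from step 3 can be sketched with MessageDigest as follows. This is a simplified illustration rather than my program’s code: it keeps the whole file in memory, the class and method names are made up for the example, and the algorithm is passed using Java’s names (such as "SHA-1" or "MD5"), so mapping from the type names used in a metalink is left out.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

class ChecksumSketch {

    // Hex digest of data[offset .. offset + length) using the given algorithm.
    static String digestHex(String algorithm, byte[] data, int offset, int length) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            md.update(data, offset, length);
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("unknown digest algorithm: " + algorithm, e);
        }
    }

    // Option 1: checksum the entire file.
    static boolean wholeFileMatches(String algorithm, byte[] data, String expectedHex) {
        return digestHex(algorithm, data, 0, data.length).equalsIgnoreCase(expectedHex);
    }

    // Option 2: checksum each piece and report the indices that do not match.
    static List<Integer> corruptedPieces(String algorithm, byte[] data,
                                         int pieceLength, List<String> expectedHex) {
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < expectedHex.size(); i++) {
            int offset = i * pieceLength;
            if (offset >= data.length) {
                bad.add(i); // the piece is missing entirely
                continue;
            }
            // The last piece may be shorter than pieceLength.
            int length = Math.min(pieceLength, data.length - offset);
            if (!digestHex(algorithm, data, offset, length).equalsIgnoreCase(expectedHex.get(i))) {
                bad.add(i);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        byte[] original = "0123456789".getBytes(StandardCharsets.US_ASCII);
        List<String> pieceHashes = new ArrayList<>();
        pieceHashes.add(digestHex("SHA-1", original, 0, 4));
        pieceHashes.add(digestHex("SHA-1", original, 4, 4));
        pieceHashes.add(digestHex("SHA-1", original, 8, 2));

        byte[] downloaded = original.clone();
        downloaded[5] ^= 1; // simulate corruption inside the second piece
        System.out.println("Corrupted pieces: " + corruptedPieces("SHA-1", downloaded, 4, pieceHashes));
    }
}
```

The list returned by corruptedPieces is what makes step 4 possible: only those piece indices need to be fetched again from another source.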

Thus, as long as one of the sources is available, one should be able to download the file without any corruption. For large files, one need not download the entire file again when corruption occurs, but only those pieces that got corrupted.

This really got me started on understanding how metalinks work. The source code for this project can be viewed here (http://bit.ly/java-metalink-downloader).

GSoC Proposal!

Posted: April 6, 2012 in GSoC

When I started looking for interesting GSoC projects for the summer, I stumbled upon metalinks. Some quick questions popped into my head, and I slowly found the answers to them from various sources. Here is a quick look.

Metalinks?
Well, the term wasn’t new to me. I came across metalinks while downloading openSUSE. openSUSE, being really bloated, gives you four different ways of downloading the ISOs. People generally tend to go with torrents, as they offer error resilience. Of course, metalinks offer the same benefits or more. However, there is the extra hassle of finding a download manager that can handle them.

What are Metalinks?
Metalinks are generally XML files containing various information about the file to be downloaded. Remember, a metalink is not the file to be downloaded. It is more or less like metadata. For example, the metalink for openSUSE contains information about the openSUSE image. The information helps in a variety of ways, such as checksumming, multi-source downloads, mirror priorities and so on.

Why are metalinks not popular?
Despite Metalink’s numerous advantages, most browsers do not support metalinks natively. This, in turn, leads to users depending on third-party plugins and download managers that they wouldn’t otherwise use. These add a lot of complexity to the download, in contrast to what metalinks stand for: simplicity. It’s probably the main reason why metalinks are not popular with the masses. These issues are also highlighted by Anthony Bryan, Metalink contributor, in his feature requests here (http://bit.ly/feature-chrome) and here (http://bit.ly/feature-firefox). Thus, in order to make the concept popular and downloads easier, supporting metalink-based downloads in browsers has become a critical need.

What’s the way forward?
Well, browsers really need to support metalinks natively. This would let users download metalink-based files without even knowing what metalinks are. It would help metalinks become mainstream.

Thus, as can easily be seen, my proposal dealt with browser support for metalinks. Specifically, I wanted to implement metalink support in both the Chrome and Firefox browsers. This (http://bit.ly/sundaram-gsoc-proposal) provides a detailed look at my proposal.

I’m looking forward to your comments.