Preparing Darcs 2.8
Darcs 2.8 is fast approaching!
The main feature we're pushing for is a new "packs" optimisation, which rolls a repo’s pristine and patch files into single files, making darcs get over HTTP significantly faster. The packs optimisation work was done by Alexey Levan, one of our 2010 Google Summer of Code students (and now our Windows packager). Guillaume spent some of the sprint working on the finishing touches needed to get packs into our users' hands: enabling them by default, writing new tests to ensure that Darcs attempts to use them, and reducing their size. Guillaume and Jérémie (who joined us as he was a local Darcs user and just wanted to give back, merci!) also worked together to get a couple of benchmarks:
|Jérémie's repository||~900 patches||10s||1s|
|darcs screened (full)||~9300 patches||37m||2m|
|darcs screened (lazy)||~9300 patches||27s||7s|
The timings and a couple of other numbers are now available on the darcs wiki optimize --http page.
Future of Darcs
If Darcs didn't exist, it would be worth writing
Ganesh gave an excellent talk on the Future of Darcs, shaking us out of some recently infectious gloom (it was even getting to me!). He gave us a much needed reminder of what we're fighting for, how to keep it going, and what he's been contributing to the fight.
- Darcs is important because it's fundamentally different. Our patch-based view is still unique and brings a lot of novel thinking to the version control table. We need to work on Darcs because nobody else is doing something like it. If Darcs itself didn't exist, we'd want to create it.
- Patch-based version control is the secret ingredient to Darcs' very easy user interface. The message we hear most often from Darcs fans is not about theory, but about how simple and easy it is to use Darcs. Our users don't care about patch theory per se, but it's because we work from patch theory that our user interface can be both gentle and powerful.
- Darcs provides a path to the future of version control. Ever wished your version control system thought in terms of abstract syntax trees instead of lines of text? We can't do that yet, but because we understand patches in a way nobody else does, we have a good idea for how to get there.
First, we still need to catch up; we're making a lot of progress overcoming performance issues, and usability issues such as our rather appalling conflict marking. Second, we need to solve the interoperability problem. Being a research-oriented version control system with a lot of practical catch-up to do puts us in the minority. We need to make it cheaper and cheaper for people who love Darcs to keep using it, when all their friends are using something else; and we need to make it risk free for somebody to give it a spin and see how they feel about it. Finally, we need to start moving towards that future, start tackling some of the really cool features we have in mind for Darcs 3.
So what has Ganesh been up to? Lots! When not taking care of a new baby, he's been implementing new features like rebase, making conflict marking easier to use (with patch names!) and exploring a potential "graphictors" conflict representation (because every Darcs hacker needs a conflict representation of his own). More below!
|Thomas, Eric and Ganesh|
One of the great things about Darcs is that we generally do not need rebase; the combination of a patch theory and a friendly interactive UI makes many complicated rebase/cherry-pick use cases effortless for Darcs users. Theory and UI get us very far, but sometimes they're not enough. Having a rebase operation would make it possible to do things like smoothing away unwanted conflicts, amending depended-upon patches (including mistakes like that 1 GiB file you added one year ago), and consolidating multiple patches into one.
Ganesh gave us a tour of the changes and cleanups to Darcs code needed to make Darcs rebase work. He also brought up a tricky implementation issue about how the rebase "suspended" patches will interact with darcs amend-record, which we pored over together, whiteboard markers in hand (this is why we need sprints!). Owen captured some of this discussion, so we're hoping to have some nice diagrams explaining the issue in more detail.
Bridge - from Darcs to Git and everything else
Git is a great choice of version control system for many users and projects. Great hosting sites like GitHub, code review tools like Gerrit and graphical interfaces like GitTower add a lot of value to the Git universe and bear testimony to the self-reinforcing power of network effects. So what kind of role does Darcs play in an increasingly Git-dominated world? We debated the issue for a while and eventually agreed on a single goal: to work towards best-effort interoperability between Darcs and other version control systems, while being cautious to avoid tying ourselves tightly to any particular system.
|Owen on Day 1|
Owen and Ganesh worked on identifying the technical challenges behind creating such a bridge and fleshing it out into a Google Summer of Code proposal:
- Creating a mapping of multi-head repos to Darcs repositories.
- Import/Export of foreign patch formats.
- Efficiently mapping between patch-based and snapshot-based models.
- Robust translation between Darcs versions.
- Mapping from “Darcs only” patch-types e.g. replace.
Review backlog
Getting together with a video projector was a good way to catch up on some patch review. Over the three days, Ganesh and Guillaume cleaned out the patch tracker, reviewing patches (6 accepted, 2 rejected, 1 follow-up requested). Thanks, guys!
|So if "Adam" submits a patch to the tracker...|
We spent some time thinking about how to achieve our goal of reforming the review process to accommodate the needs of patch submitters and reviewers, while keeping the overall process simple and lightweight. Our conclusion was to:
- preserve the screened/reviewed branch distinction and start explicitly calling the reviewed branch that
- simplify the general flow of patches to a linear screened, reviewed, release sequence (except the usual release-specific backports)
- eventually point http://darcs.net to the screened branch
- shift definitively from an amend-oriented culture to a follow-up oriented one. Once a patch has been accepted to screened, it can no longer be amended
- simplify release process by removing the notion of a soft freeze
Healing paper cuts
Owen and Iago worked on improving Darcs' user interface in some corner cases. Owen studied the infamous "thisrepo" problem. The thisrepo cache entry is a piece of local-filesystem tracking information that causes more recent versions of Darcs to generate an annoying warning when a repository is moved to a new location. Having established that it was safe to do so, he got Darcs to stop producing the entry and to ignore it when it is present. This keeps the warning system useful and relevant, by removing false alarms.
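To make the "ignore it when it is present" half of that fix concrete, here is a rough sketch in Python (not Darcs' own Haskell, and not Owen's actual patch). It assumes the cache entries live in a text file with one `kind:location` entry per line and that the stale entries start with `thisrepo:`; the function name `read_sources` is made up for illustration.

```python
def read_sources(text):
    """Parse cache entries from a sources-style file, skipping 'thisrepo:' lines.

    Assumed format (an illustration, not a spec): one 'kind:location' per line.
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        # Silently drop blank lines and any leftover thisrepo entries,
        # so a moved repository no longer triggers a false alarm.
        if not line or line.startswith("thisrepo:"):
            continue
        # Split on the first colon only, so URLs keep their own colons.
        kind, _, location = line.partition(":")
        entries.append((kind, location))
    return entries
```

Since a newer Darcs would also stop writing the entry, old repositories are handled by the filter above and new repositories never contain the line in the first place.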
|Guillaume and Iago|
An interesting recurring theme came up talking about these issues with Owen, Iago and Ganesh, which is that changing Darcs behaviour can sometimes be a delicate affair because we have to take into account (a) repositories produced by older versions of Darcs and (b) how older versions of Darcs will react to repositories produced by more recent ones. It's the sort of issue that makes Darcs a great place for anybody who wants to confront "real world" software issues, where getting things right includes, and goes beyond, simply nailing down the theory.
Garbage collecting pristine
The Darcs hashed repository format (one idea we've stolen from Git!) makes for much better robustness, allows for performance improvements such as the global cache and lazy fetching, and opens the door to future work on verifiability (short secure version identifiers). After a lot of performance work on hashed repositories, we've reached a point where we're ready to deprecate the old-fashioned format and get everybody updated to hashed repositories.
|Florent formulates a plan|
Guillaume led a discussion which led to a suggestion by Florent: keep track of pristine root hash timestamps and delete files older than 24 hours. Getting a repository should not take that long, so it should be safe to delete older files!
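Florent's suggestion could be sketched roughly as follows. This is an illustration in Python under stated assumptions, not the Darcs implementation: the pristine store is modelled as a set of hash names, `roots` is a hypothetical record of when each root hash was current, and `reachable` is a hypothetical function returning every file a given root refers to.

```python
import time

GRACE_SECONDS = 24 * 60 * 60  # the 24-hour window from the discussion


def files_to_delete(all_files, roots, current_root, reachable, now=None):
    """Return the pristine files that are safe to garbage collect.

    all_files:    set of hash names present in the pristine store
    roots:        dict mapping root hash -> timestamp when it was recorded
    current_root: root hash of the current pristine tree (always kept)
    reachable:    function from a root hash to the set of files it refers to
    """
    now = time.time() if now is None else now
    keep = set(reachable(current_root))
    for root, stamp in roots.items():
        # A recent root may still be in use by a slow 'darcs get',
        # so everything it refers to survives for now.
        if now - stamp <= GRACE_SECONDS:
            keep |= reachable(root)
    return set(all_files) - keep
```

The point of the grace period is that somebody who started fetching against a slightly older root can still find every file they need; only files that no recent root refers to are removed.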
Darcs Testing and Code Quality
Iago gave a couple of nice presentations on work he has been doing for his MSc, in the context of an MFES curricular unit on formal methods: verifying Darcs patch theory properties with the Alloy model checker, and assessing/improving the maintainability of Darcs code. He showed us some rather interesting examples of things that Darcs code does which make testing hard (really long functions seem to be a killer) and discussed improvements he made to our QuickCheck generators. It was great to see him break down our randomly generated tests into categories with varying degrees of meaningfulness and show how a few simple tweaks to our generators could make the Darcs tests a lot more useful and informative. Iago will post the slides and his sample Alloy code when he has obtained his qualifications.
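The idea of sorting generated test cases into categories can be illustrated with a toy example. This is a sketch in Python rather than QuickCheck, and the patch model and the three categories are invented for illustration; they are not Iago's actual breakdown. A "patch" here is just a (file, line) pair, and a pair of patches touching different files is treated as a weak test case because such patches commute trivially.

```python
import random
from collections import Counter


def gen_patch(rng, n_files):
    """Hypothetical patch: a one-line edit described as (file name, line)."""
    return (f"file{rng.randrange(n_files)}", rng.randrange(10))


def classify(p, q):
    """Label a generated pair by how much it can tell us."""
    if p[0] != q[0]:
        return "different files"  # commutes trivially: weak test case
    if p[1] == q[1]:
        return "same line"        # potential conflict: most interesting
    return "same file"


def distribution(n_files, trials=1000, seed=0):
    """Count how many generated pairs fall into each category."""
    rng = random.Random(seed)
    return Counter(
        classify(gen_patch(rng, n_files), gen_patch(rng, n_files))
        for _ in range(trials)
    )


# Tweaking one generator parameter shifts the balance of categories.
print(distribution(n_files=50))
print(distribution(n_files=2))
```

With many file names to choose from, nearly every pair lands in the weak category; drawing from only a couple of file names makes the same number of tests exercise far more interesting cases, which is the kind of small generator tweak described above.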
Darcs needs to work harder on code quality and testing, but what are you going to do with a handful of hobby-hacking hackers doing the best with their free time? Iago suggests candidates for the two lowest hanging fruit to pick: (a) cutting down our function sizes (check out urlThread in our URL module!) and (b) developing a sort of standard test suite template to be used for each Darcs module. He later spent a bit of time in the airport fleshing this out with the Darcs.Util module as an example.
Experience reports
We had two new participants at the sprint, Owen and Iago. Let's hear it from them. Owen said:
The sprint was a great introduction to the Darcs team, having face-to-face discussions (and being able to quickly answer my beginner-questions) really helped bring me up to speed with not only the code-base but also the underlying concepts and ideas. My primary motivation for coming was to get to know the team and code-base, so that I could make an effective GSoC project proposal, and the sprint certainly helped in that respect.
Since I had only had limited experience with the code-base prior to the sprint, I didn't actually do much coding (I was primarily focusing on my GSoC ideas), however, at future sprints, I'll definitely be able to code more, having "learnt my way around".
I came away feeling very positive about Darcs, with a strong feeling of wanting to contribute, to bring Darcs to the top of the VCS world. It is obvious to me that people do care about Darcs!
I think attending a Darcs sprint is a great chance to have a fun weekend coding and talking about a project that is very much alive, with a few beers and a kebab afterwards! :)
And Iago:
It was a very nice experience. Personally, I thought the Sprint will consist on three days of almost full-time work from 9am to 6pm; now I see that the hard work is done during the time between Sprints, whilst Sprints are just time for discussing some project-management stuff, interesting work in progress, ideas, suggestions, etc. Although I have started contributing with Darcs few months before the Sprint, I think that it is a very good opportunity to meet Darcs people and start to contribute with Darcs.
Thanks guys! It was great having you. I hope you can join us for future sprints.
- Ganesh on Future of Darcs
- Iago on Model Checking Darcs
- Iago on Darcs Code Quality
|It's on timer! (Iago, Guillaume, Owen, Ganesh and Eric)|
- Eric Kow
- Florent Becker
- Ganesh Sittampalam
- Guillaume Hoffmann
- Iago Abal
- Owen Stephens
- Thomas Refis
- Nicolas Pouillard
Thank-you!
We had a great time in Paris. Many thanks to the Initiative de Recherche et Innovation sur le Logiciel Libre (IRILL) for making such nice facilities available to us with so little fuss!
Thanks also to our donors for making sprints accessible to a wide public, and to the Software Freedom Conservancy for taking on the nitty gritty administrative detail that makes it possible for us to focus on the core mission of making Darcs rock.
Merci à tous!