Showing posts with label sprints. Show all posts

Monday, January 18, 2016

darcs hacking sprint 10 report

Last weekend we had our tenth Darcs sprint, and our first one in Spain. This time the sprint was held at the University of Seville, at the ETSII (Technical Superior School of Informatics Engineering), from January 15th to 17th.



There were three participants: Florent Becker, Guillaume Hoffmann and Ganesh Sittampalam. We also had Pierre-Étienne Meunier on video call and Simon Michael on IRC.


Darcs and Pijul integration


One major topic during the whole sprint was the possible integration of Darcs with Pijul. Pijul is a new version control system based on a different patch theory, whose main author is Pierre-Étienne Meunier. Florent also contributes to Pijul and announced its first release last October.

Pijul is promising because it handles conflicts better than Darcs, both in how they are presented to the user and in performance. There may be a future where Darcs uses Pijul patches by default. We had many conversations with Florent to understand the internals of Pijul and how it manages patches.

On the first day of the sprint we had a video call with Pierre-Étienne Meunier to discuss integrating the Pijul core with Darcs. It turns out that the Darcs code is modular enough to handle Pijul patches (with some work). That afternoon Florent started to work on a Haskell binding for libpijul (through a C binding maintained by Pierre-Étienne, Pijul being implemented in Rust).

Ganesh, Florent and Pierre-Étienne are going to work towards a better integration of both systems. Pierre-Étienne plans to release a 0.2 version of Pijul soon.


Ganesh and Florent with Pierre-Étienne on video call

Renaming Patch/RealPatch to RepoPatchV1/RepoPatchV2

The code of Darcs contains many different layers of patch types. One of them is represented by the two types Patch and RealPatch, and specifies the behaviour of named patches when they are commuted and in case of conflicts. The "Patch" type is the behaviour of patches in repositories with the darcs-1 patch semantics (which can still be created by Darcs) and "RealPatch" is for darcs-2 semantics (the current default of Darcs). I sent a patch to rename these types into something less confusing: RepoPatchV1 and RepoPatchV2.


Interactive selection performance and refactoring

Even though we wrote a patch that greatly improved performance during the last sprint (and we now have a unit test for it), the command "darcs rollback -p ." still remains much slower than "darcs rollback" before presenting the first choice of patch to the user. Florent determined that this was because the action of matching patches within interactive selection is not lazy, i.e., the whole list of patches has to be scanned and classified before the first prompt is shown to the user. Florent unearthed a refactor he had of the patch selection code and started rebasing it against the current code.
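The difference between eager and lazy matching can be sketched as follows. This is a toy illustration in Python, not Darcs's Haskell code: patches are modelled as plain strings, and the names `matches`, `eager_candidates` and `lazy_candidates` are hypothetical.

```python
def matches(pattern, patch_name, log):
    """Hypothetical matcher: records every patch it inspects in `log`."""
    log.append(patch_name)
    return pattern in patch_name

def eager_candidates(pattern, patches, log):
    # Scans and classifies the whole history before returning anything.
    return [p for p in patches if matches(pattern, p, log)]

def lazy_candidates(pattern, patches, log):
    # Yields each matching patch as soon as it is found.
    for p in patches:
        if matches(pattern, p, log):
            yield p

history = ["fix typo", "add tests", "refactor match", "bump version"]

# Eager: all four patches are inspected before the first "prompt".
eager_log = []
first_eager = eager_candidates("t", history, eager_log)[0]

# Lazy: only the patches up to the first match are inspected.
lazy_log = []
first_lazy = next(lazy_candidates("t", history, lazy_log))
```

Both variants offer the same first patch, but the lazy one stops inspecting history as soon as it has something to show.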


User manual and developer handbook

We want Darcs to have a user manual again, and a developer handbook that would compile documentation for programmers and computer scientists. We decided the manual should live in darcs' repository itself (so that it stays up-to-date) and the developer handbook on the wiki.

Darcs on Stackage

On IRC, Simon Michael (after an initial request by Joachim Breitner) committed himself to maintaining a stack.yaml file for Darcs, and during the weekend Darcs was added to Stackage for easier building.


Cleanup, fixes and refactorings

Ganesh tracked down bugs in rebase and sent a few cleanup patches. He is also improving the code of "darcs test" (formerly called "darcs trackdown") so that uncompilable states are considered neither Passing nor Failing, and so that bisect becomes more efficient.
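To make the idea of a third, "untestable" outcome concrete, here is a minimal Python sketch of a bisection that skips such states. It is not the algorithm used by "darcs test"; it assumes a linear history in which a failure, once introduced, persists, and the names `Result`, `bisect_with_skips` and `run_test` are hypothetical.

```python
from enum import Enum

class Result(Enum):
    PASS = "pass"
    FAIL = "fail"
    SKIP = "skip"   # e.g. the state does not compile

def bisect_with_skips(run_test, good, bad):
    """Narrow down where a failure was introduced.

    `good` is an index known to pass, `bad` a later index known to fail.
    Returns (last_pass, first_fail); any indices strictly between them
    could not be tested.
    """
    while bad - good > 1:
        mid = (good + bad) // 2
        # Try the midpoint first, then the other states, nearest first.
        candidates = sorted(range(good + 1, bad), key=lambda i: abs(i - mid))
        for i in candidates:
            outcome = run_test(i)
            if outcome is Result.PASS:
                good = i
                break
            if outcome is Result.FAIL:
                bad = i
                break
        else:
            # Every state in between is untestable: stop here.
            break
    return good, bad

# Hypothetical history: states 2 and 3 do not compile.
states = [Result.PASS, Result.PASS, Result.SKIP,
          Result.SKIP, Result.FAIL, Result.FAIL]
culprit_range = bisect_with_skips(lambda i: states[i], good=0, bad=5)
```

Here the result is the pair (1, 4): state 1 is the last one known to pass, state 4 the first known to fail, and the uncompilable states in between are reported as neither.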


What happens next

I am going to release Darcs 2.10.3 within a couple of weeks, and Darcs 2.12 within a couple of months. This new major version will have optimizations (some of them are already backported to the 2.10 branch) and code refactorings. It may contain the stash feature currently developed by Ganesh. 

This year we hope to have another sprint, and to have more developers participating. Please consult the How to Help, Developer's Getting Started and Projects pages on the wiki to get involved!

Ganesh, Florent and Guillaume



Thursday, September 24, 2015

darcs hacking sprint 9 report

After an absence of a year and a half, the Darcs Hacking Sprint returned!

Once again, the event occurred at the IRILL (Innovation and Research Initiative for Free Software) in Paris, on September 18th to 20th.

The sprint had 7 participants: Daniil Frumin, Eric Kow, Florent Becker, Ganesh Sittampalam, Guillaume Hoffmann, Thomas Miedema and Vinh Dang.

Darcs and GHC 8

Thomas Miedema is a Haskell and GHC hacker, and came on the first day of the sprint. Since Darcs is a system that aims at supporting the various GHC versions out there, Thomas helped us prepare for GHC 8, the next major version. He explained to us one issue in GHC 8 that is triggered by Darcs: a bug with the PatternSynonyms extension. Fortunately it seems that the bug will be fixed in GHC HEAD. (The first release candidate is planned for December.)


Thomas explaining PatternSynonyms to Eric and Ganesh

Diving into SelectChanges and PatchChoices code

On the first day I (Guillaume) claimed the "rollback takes ages" bug, which made me look into SelectChanges and PatchChoices code. The result is that I still haven't fixed the bug, but I discovered that patch matching was unnecessarily strict, which I could fix easily. Internally, there are two interesting patch types when it comes to matching:
  • NamedPatch: represents the contents of a patch file in _darcs/patches, that is, its info and its contents
  • PatchInfoAnd: represents the info of a patch as read from an inventory file (from _darcs/inventories or _darcs/hashed_inventory) and a lazy field to its corresponding NamedPatch.
Now, getting the NamedPatch for some patch is obviously more costly than getting a PatchInfoAnd. You may even have to download the patch file in order to read it (in the case of lazy repositories). Moreover, the majority of matchers only need the patch info (or metadata), not its actual contents. Only two matchers (hunk and touch) need to actually read the patch file, while matching on a patch name, for instance (probably the most common operation), does not.

So, before the sprint, as soon as you wanted to match on a patch, you had to open (and maybe download) its file, even if this was useless. With my change (mostly in Darcs.Patch.Match) we gained a little more laziness, and the unreasonably slow command "rollback -p ." went from 2 minutes to ~15 seconds on my laptop. I hope to push this change into Darcs 2.10.2.
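The split between cheap metadata and expensive contents can be sketched like this. It is a Python analogy rather than the Haskell types in Darcs: the class below only borrows the name PatchInfoAnd, and `fetch_contents`, `make_patch`, `match_name` and `match_hunk` are hypothetical.

```python
from dataclasses import dataclass
from functools import cached_property
from typing import Callable

@dataclass
class PatchInfoAnd:
    name: str                            # metadata, as read from the inventory
    author: str
    fetch_contents: Callable[[], str]    # opens (or downloads) the patch file

    @cached_property
    def contents(self) -> str:
        # Evaluated only on first access, then cached.
        return self.fetch_contents()

reads = []   # records which patch files were actually read

def make_patch(name, author, body):
    def fetch():
        reads.append(name)   # stands in for opening or downloading the file
        return body
    return PatchInfoAnd(name, author, fetch)

def match_name(pattern, patch):
    return pattern in patch.name        # needs metadata only

def match_hunk(pattern, patch):
    return pattern in patch.contents    # forces the patch file to be read

repo = [make_patch("fix typo", "alice", "hunk ./README 1"),
        make_patch("add tests", "bob", "hunk ./tests/run.sh 1")]

by_name = [p.name for p in repo if match_name("tests", p)]
reads_after_name_match = list(reads)    # still empty: no file was opened
by_hunk = [p.name for p in repo if match_hunk("README", p)]
```

Matching by name leaves `reads` empty, while matching on hunks reads every patch file it has to inspect.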

Eric, Guillaume and Vinh

Now, the real source of the "rollback -p ." slowness is that patch selection is done on FL's (Forward List), while commands like rollback and obliterate naturally work backwards in time on RL. Currently, an RL is inverted and then given to the patch selection code, which is not convenient at all! Moreover, the actual representation of history of a Darcs repository is (close to being) an RL. So it seems like a proper fix for the bug is to generalize the patch selection code to also work on RL's; which may involve a good amount of typeclass'ing in the relevant modules. I think this will be too big/risky to port to the 2.10 branch, so it will wait for Darcs 2.12.

Ganesh's new not-yet-officially-named stash command


A few days before the sprint, Ganesh unveiled his "stash" branch. It features a refactoring that makes it possible to suspend patches (i.e., put them into a state such that they have no effect in the working copy) without changing their identity (which is what currently happens with the darcs rebase command). This makes it possible to implement a git-stash-like feature.

The sprinters (IRL and on IRC) discussed the possible name of the command that should encapsulate this stash feature. More importantly, on the last day we discussed what would be the actual UI of such a feature. As always when a new feature is coming to darcs, we want to make the UI as darcsy as possible :-)

Coming back to the code, Ganesh's refactoring, although extensive, will also simplify the existing types for suspended patches. We decided to go with it.

Dan's den


Dan demonstrating den (on the left: Florent)
Daniil Frumin was this year's Google Summer of Code student for Darcs. Mentored by Ganesh, he brought improvements to Darcsden, many of which are already deployed. Among them is the ability to launch a local instance of Darcsden (using an executable called den), not unlike Mercurial's "serve" command.

Dan tells more about his work and this sprint in his latest blog post.

A better website and documentation

As a newcomer to the project, Vinh took a look at the documentation, especially the website of the project. He implemented changes to make the front page less intimidating and more organized. He also had a fresh look at our "quickstart" and proposed improvements which we felt were much needed!

Florent's projects

For this sprint, Florent was more an external visitor than a Darcs hacker. He talked about one of his current projects: Pijul, a version control system with another approach. Check out their website!

Conclusion and the next sprint

In the end this sprint turned out to be more productive and crowded than we initially thought! It had been a long time since the previous one, so we had a lot of things to share at first. Sprints do make synchronization between contributors more effective. They are also a time when we can concentrate more on the Darcs codebase and spend more time tackling some issue.

Avenue d'Italie, Paris
We would like to thank the IRILL people for hosting the sprint for the third time and our generous donors for making travel to sprints easier.

We already have a time and a place for the next sprint: Sevilla, Spain in January 2016! The exact moment will be announced later, but you can already start organizing yourself and tell us if you're going.

Thomas, Eric and Ganesh
From left to right: Vinh, Florent, Dan, Ganesh and Eric

Thursday, February 21, 2013

darcs hacking sprint 8 report

The 8th Darcs Hacking Sprint happened on 15-17th February in Paris, at IRILL like in 2011. This sprint occurred one week after the latest stable release (2.8.4), and after a process of integrating many new features into the HEAD repository of darcs: rebase, the patch index optimization, the last regrets prompt, and a lot of refactoring in the code base. We are currently looking at the next important milestone for darcs: the release of 2.10, which should happen sometime this year. This sprint was about polishing these new features as much as possible and taking a few short and medium-term decisions.

This time we had five people attending: Florent Becker, Ganesh Sittampalam, Guillaume Hoffmann, Owen Stephens and Radoslav Dorcik. We also had a short visit from Pierre-Yves David, a Mercurial hacker.


Sprinters' Backlog


Ganesh mainly worked on rebase in preparation for the upcoming 2.10 release: he resolved issues 2282 and 2227, and started work on a `darcs rebase changes` command. He also made various minor code cleanups, including getting the unit tests back to green, which will hopefully encourage people to run them in future.

Radoslav worked on a couple of ProbablyEasy bugs (darcs obliterate -O overwrites existing files and implement `darcs rebase unsuspend --summary`). He created a wiki page to help us think about a future darcs flags overhaul. On the last day he wrote a prototype patch for the issue "make darcs amend -A use the default author id", which he will continue to work on after the sprint, along with elaborating the big overhaul of the darcs command line flags.


With Guillaume he also worked on the manual and completed the help for environment variables and the output of `darcs help markdown`. We decided that http://darcs.net/manual should always have the manual corresponding to the current stable branch of darcs (as of now 2.8), because sometimes commands and flags change. We thus continue the process of moving the documentation away from literate Haskell and LaTeX so as to have everything in markdown (website documentation and darcs-generated help). Guillaume also fixed bugs in the testsuite, in particular network tests.

Florent worked around the UI/Selectchanges code, with three aims:

  • make the code clearer, and more easily usable by a gui / web ui (in progress)
  • allow the user to preview dependencies (done) and conflicts (todo) before selecting a patch
  • adding darcs diff --interactive and darcs trackdown --interactive (interactive choice of a non-contiguous set of patches to search among.) unleashing the power of implicit branching! (in progress)
The current state of the branch is available on hub.darcs.net, and is subject to aggressive rebases (as witnessed by the number of "brouillon" patches).


Owen spent the weekend of the sprint working on darcs-bridge, spending time getting the theory right for exporting merges, including some tricky corner cases. He implemented a proof-of-concept of the new exporter and has started to integrate it back into the bridge. With lots of help from Ganesh (thanks!), he discussed and worked through most of the difficult implementation points.



Last regrets for 2.10


We decided that 2.10 will contain the last regrets prompt (an extra final question "Do you want to push/pull these patches? [yn...]" ) in its current form.


Darcs on sshfs


We finally closed http://bugs.darcs.net/issue904 , accepting a patch that makes darcs work on sshfs-mounted directories. This, combined with the bare repositories introduced in 2.8, will make darcs easier to work with dumb servers, i.e., ssh-accessible servers which do not have darcs installed.


Google Summer of Code and 2.10 beta


We plan to apply to this year's Google Summer of Code as an independent organization. This means we will ask for two slots. We discussed possible projects. Organization submissions for GSoC are due in mid-March, so we estimated that a beta release of Darcs 2.10 would make sense by then.



Mercurial's Changesets Evolutions


Pierre-Yves David, a Mercurial developer, came to see us and gave us a short version of his FOSDEM talk "Changesets evolutions with Mercurial". Changesets evolution is a feature that recently made its way into Mercurial, that enables automatic merging of rewritten histories. We discussed similarities and differences with the way darcs commutes patches and how `darcs rebase` works.


IRC Sprinters


We also had good participation on #darcs. Eric Kow did intensive bug triaging on the tracker. Mark Stosberg worked on stabilizing two features: rebase and patch index. Iago Abal helped Ganesh improve the patch code unit tests. Simon Michael updated darcsden (the software behind hub.darcs.net) to the latest API of libdarcs, which changed before and during the sprint. Petr Rockai kept us up to date about the state of the buildbot infrastructure.


Meatspace sprinters

Thanks!


We would like to thank the generous people and organizations that made this sprint possible:
  • the IRILL for kindly hosting the sprint again.
  • all our donors, who help cover the cost of travel and accommodation for the sprinters.




Sunday, April 8, 2012

darcs hacking sprint 7 report

This year, Darcs had a sprint in two phases.  It started with a one-day pre-sprint in Cordoba, Argentina (9 March), and then moved over to Southampton, England for a 3 day hackfest at the end of the month (30 March to 1 April).

No Darcs hackers were shipped between the two sprints, but we did have one visitor from afar. Potential GSoC student Bhimanavajjula Sri Rama Krishna (BSRK) Aditya flew the 9 hours between India and England to join us for the sprint. It was great to meet him in person!

Presprint


We think that Darcs could make a great project for people to get started with some practical Haskell hacking. It's a bit of a fixer-upper, but that also means there's a lot of difference to make!


Darcs veteran (and weekly news editor) Guillaume Hoffmann was joined by two students, Miguel Pagano and Mathías Etcheverry, who within a day and with no prior knowledge of the Darcs code base were able to make the following contributions:
  • bringing the darcs backup filename conventions in line with CVS conventions, eg. ./foo.txt.~1~ rather than the unwieldy ./foo-darcs-backup0 [Miguel]
  • making the darcs-test harness respect the -fcurl cabal flag [Miguel]
  • investigating a wishlist item to print filenames under large hunks in darcs record, alas, not as “ProbablyEasy” as we'd expected [Mathías]
In between getting Miguel and Mathías started, Guillaume also got a chance to make some improvements himself, namely:
  • adding the --unified flag to record, revert, amend-record
Thanks to Miguel and Mathías for joining us at the sprint. Hopefully we'll be able to repeat the cycle of Darcs hacking with them. And since little one-day mini sprints like the one Guillaume started are so easy to organise, there's a chance we'll be seeing more of these in the future.

Summer of Code


If he participates in this year's summer of code, Aditya will be helping us to integrate the long-promised patch index optimisation into Darcs. The patch index was originally developed by Benedikt Schmidt. It caches a mapping from filenames to the patches that affect those files, which saves a lot of work for commands like darcs changes or darcs annotate, commands that would otherwise have to trawl through the entire darcs history.
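The core idea of such an index can be sketched in a few lines of Python. This is only an illustration of a filename-to-patches mapping, not the data structure Benedikt's code uses; the history format and the name `build_patch_index` are assumptions, and the sketch ignores renames entirely.

```python
from collections import defaultdict

def build_patch_index(history):
    """Map each filename to the ids of the patches that touch it.

    `history` is a list of (patch_id, touched_files) pairs in
    repository order (a made-up format for this sketch).
    """
    index = defaultdict(list)
    for patch_id, touched_files in history:
        for path in touched_files:
            index[path].append(patch_id)
    return index

history = [
    ("p1", ["README", "src/Main.hs"]),
    ("p2", ["src/Main.hs"]),
    ("p3", ["doc/manual.md"]),
]

index = build_patch_index(history)

# Answering "which patches touch this file?" is now a single lookup
# instead of a walk through the whole history.
changes_for_main = index["src/Main.hs"]
```

The index is built once by walking the history; after that, a per-file query only looks at the patches that are relevant to that file.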


Over the sprint, Aditya rebased the patch index code from Benedikt onto the current Darcs mainline.  He studied the code a bit to understand what exactly was behind the index, and started working on implementing the integration with commands like darcs changes.  He also got to explore a bit of Darcs internals, notably how Darcs makes use of matchers like 'date "before tea time"' to filter through patches.

One very concrete result of the sprint: we now have a prototype of a patch-index-enabled darcs changes. If you can't wait to try it out, you could try applying the latest version of his patch.

Filepaths: bytes or code points?

Argh, Unicode, Argh

The main thing Ganesh worked on was fixing a problem with character set handling that has been outstanding for several months. The underlying problem was caused by recent versions of GHC changing the way it handles filenames on Linux; previously it treated them as a stream of raw bytes, but now it translates them into strings using an encoding. The eventual workaround was very short - explicitly set a global at the beginning of darcs, telling the GHC library to use no encoding at all - but it took a lot of investigation to get to that point, and the end result isn't very satisfactory for darcs as a library.
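The general problem of bytes versus code points can be seen in a small Python example. It is only an analogy: it does not use the GHC library, and the byte-preserving Latin-1 mapping merely stands in for "no encoding at all". The filename is made up.

```python
# A filename as the filesystem stores it: Latin-1 bytes, not valid UTF-8.
raw_name = b"caf\xe9.txt"

# Decoding with a fixed encoding can fail outright...
try:
    raw_name.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

# ...while a mapping that turns every byte into exactly one code point
# never loses information, so the original bytes can always be recovered.
as_text = raw_name.decode("latin-1")
round_tripped = as_text.encode("latin-1")
```

Here the UTF-8 decode fails, whereas the byte-preserving mapping round-trips to the original name.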

Darcs 2.8 Release Candidate 1


Florent and Ganesh also worked on getting a 2.8 release candidate ready.  We'd love any feedback you could give us on it, so if you're up for a little beta testing:

cabal update
cabal install darcs-beta

The character set handling problem with GHC 7.2/7.4 was the main blocker for a release, so hopefully we can get the real release out pretty soon now.


Can you duplicate a rotcilfnoc (inverse conflictor)?


We are painfully aware that our current version of patch theory is broken with respect to conflicts. Owen Stephens (from Summer of Code 2011!), who generously hosted the sprint (thanks, Owen!) spent a good chunk of Friday staring at one example of the brokenness, a failing QuickCheck test which he minimised to a simple 3-way conflict: create a directory and a file, (A) remove the directory, rename the file, (B) remove the directory, (C) move the file inside the directory under a different name.

Ouch.

After much discussion with Camp hacker Ian Lynagh, Owen discovered that this was just a fundamental bug in the conflictor-based approach. We've already been back to the drawing board for a while, but now we have yet another test case for what the next patch theory should deal with.


Next Patch Theory?


We spent a bit of the weekend working on and discussing the new patch theory. Ian worked some more on Camp (more proofs!). Ganesh explained a bit more what he had in mind with the graphictors ideas he was exploring (each conflicting patch would be in a minimal context with respect to the conflict). And Owen talked us through some thinking he found digging through the archives of the old darcs-conflicts list. We know we need a successor to the current version of patch theory. But where are we going to end up?

Clean clean clean that code


The new patch theory won't be for a while.  In the meantime, there is a ton of work we can do to prepare the ground for it.  One thing we can do to help is to improve the Darcs code base to the point where shifting to a new patch theory, or a new repository format, or a new set of primitive patches is relatively smooth and easy. Darcs needs a cleanup effort.

Owen, Ganesh and Eric made several pushes towards making the Darcs code more approachable:
  • Owen cleaned up a darcs module Darcs.Repository.HashedRepo
  • Owen and Ganesh made the darcs code base warnings free (at last! again)
  • Eric made use of Cabal 1.8's shared library feature so that Darcs only has to be built once rather than 3 times
  • Eric (and a little bit of sed) replaced the confusing type witness C preprocessor macros with some more straightforward Haskell
We have a very long way to go.  But we are thinking harder about the concrete steps we can take to making the Darcs code more respectable.

More helpful interactive mode


Florent worked on adding some more intelligence to the Darcs patch selection code. Hopefully this work will lead to more feedback and some nice new features like an interactive darcs diff. Cherry picking is one of the more unique aspects of Darcs, and one of the reasons we're so interested in making patch theory right one day. The patch theory is what allows us to cherry pick in almost all of our commands.

But while interactive mode can be pretty helpful, it can also lead to the kind of situation where good just makes you hungrier for better. For example, if you try to pull some patches but interactively decide that you want to skip some of them, Darcs will also skip over the patches that depend on them. But figuring out why exactly patches get skipped can still be a bit mysterious. What if, instead of just telling you it skipped some patches, Darcs could name the dependencies you'd need to pull in too? Hopefully, Florent's investigations will pay off!
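As a rough sketch of what "naming the dependencies" could look like, the Python snippet below reports, for each skipped patch, which declined or skipped patch caused it to be skipped. It assumes an explicit dependency table, which is a simplification: Darcs works dependencies out by commuting patches. The name `skipped_with_reasons` and the patch names are hypothetical.

```python
def skipped_with_reasons(depends_on, declined):
    """Return {patch: direct dependencies that caused it to be skipped}.

    `depends_on` maps each patch to the patches it directly depends on,
    listed in history order (dependencies appear before dependents).
    """
    unavailable = set(declined)
    reasons = {}
    for patch, deps in depends_on.items():
        if patch in unavailable:
            continue
        blockers = [d for d in deps if d in unavailable]
        if blockers:
            reasons[patch] = blockers
            unavailable.add(patch)   # its own dependents are skipped too
    return reasons

depends_on = {
    "add parser": [],
    "add parser tests": ["add parser"],
    "fix README": [],
    "document parser tests": ["add parser tests"],
}

# The user declines "add parser" during interactive selection.
reasons = skipped_with_reasons(depends_on, declined={"add parser"})
```

With this table, "add parser tests" is skipped because of "add parser", and "document parser tests" is skipped because of "add parser tests", while "fix README" is still offered.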



Rebase


Owen and Eric spent some time getting to know the new darcs rebase feature that Ganesh has been working on. It's nice!  Darcs rebase is for those situations where Darcs patch theory falls over (and fall over it does).  It allows us to rescue long-term branches previously lost to intractable conflicts, or to do “deep amend-record” operations that break through dependency barriers.

And this being Darcs, it's done with the interactive cherry-picking interface which should be familiar to users.  There's starting to be talk of getting this code in HEAD darcs so that people can try it out and we can start working towards refining the user interface.

Darcs Bridge


Darcs bridge isn't ready for prime time, we're afraid.  It's good for one-shot conversions, but if you're hoping to maintain a long-term bridge and you have to deal with Git branches, we'd advise waiting. But we're getting closer. Owen and Ganesh spent some time hashing out the design for the darcs bridge and thinking more about how the respective Darcs and Git models of the universe mesh together.

Where next?


Finally, among our many discussions was a more general question of strategy. Darcs is a very long term project and it could take many years for us to get the version control system that we want.  Over the past few years, we'd placed a great emphasis on performance, addressing some day to day issues to bring Darcs to a more acceptable place; and now the efforts are starting to pay off.  We now have faster local repository operations, repository fetching (mainly by deprecating the old fashioned format and getting people to switch to hashed repositories), and a much more usable darcs annotate command (in the upcoming 2.8 release).  So Darcs is faster now— it's certainly no Git and the conflict merging issue is still there, but it's in a much better place than it was 4 years ago.  Now what?

Now we start digging in for the long haul.  We have essentially 3 development priorities for the future of Darcs:

  • Cleanup: There is a massive amount of work to be done here, ranging from entry level tweaks like shifting to a uniform coding style and getting more disciplined about haddocks; to deeper software engineering issues, like developing a cleaner separation between repository-management and core patch theory code.  The code needs a lot of loving, and if you're ready to roll up your sleeves, we could use the help.
  • Hosting: Darcs isn't enough.  We need to think about online hosting and GUIs.  One of our goals is to have a Darcs library that makes it easier to write things like Patch-Tag, or Darcsden; or whatever interesting ideas the community may come up with.  If we have to, we may even prototype some code ourselves to push the library forward.
  • Theory: The one thing that we absolutely have to get right for the next patch theory is our story on conflicts. As you can see, we are thinking about quite a few different ideas. It's too early to tell which of these we'll end up running with.  More news when we have some more solid ideas.

Thanks!


Darcs is a long term project and with all the ups and downs we've been through over the years, we are grateful for the support the community has shown over the years.  Thanks to Guillaume and Owen for their sprint organisation efforts, to our donors for making it possible for students like Aditya to get to sprints, and the Software Freedom Conservancy for helping us with the administrative side of running an open source project.

If you'd like to support the Darcs team in our efforts to make an easy to use, flexible, formally backed version control system into a reality one day, we would be thrilled if you could submit patches, bug reports, comments on the IRC channel or darcs reddit.  If you just want to send a little cash our way to push sprints along, we most certainly appreciate your donations.

Until next time!

No sprint is complete without an Awkward Group Photo

Monday, April 18, 2011

darcs hacking sprint 6 report

The sixth Darcs Hacking Sprint took place on 1-3 April in Paris. As usual we had a lot of fun getting together, thinking and talking about Darcs for 3 days together, meeting new developers and seeing old friends.

Preparing Darcs 2.8

Darcs 2.8 is fast approaching!

The main feature we're pushing for is a new "packs" optimisation, which rolls a repo’s pristine and patch files into single files, making darcs get over HTTP significantly faster. The packs optimisation work was done by Alexey Levan, one of our 2010 Google Summer of Code students (and now our Windows packager). Guillaume spent some of the sprint working on a lot of the finishing touches needed to get packs into our users' hands: enabling them by default, writing new tests to ensure that Darcs attempts to use them, and reducing their size. Guillaume and Jérémie (who joined us as he was a local Darcs user and just wanted to give back, merci!) also worked together to get a couple of benchmarks:

repository               patches          before   after
Jérémie's repository     ~900 patches     10s      1s
darcs screened (full)    ~9300 patches    37m      2m
darcs screened (lazy)    ~9300 patches    27s      7s

The timings and a couple of other numbers are now available on the darcs wiki optimize --http page.

Future of Darcs

If Darcs didn't exist, it would be worth writing

Ganesh gave an excellent talk on the Future of Darcs, shaking us out of some recently infectious gloom (it was even getting to me!). He gave us a much needed reminder what we're fighting for, how to keep it going, and what he's been contributing to the fight.
  1. Darcs is important because it's fundamentally different. Our patch-based view is still unique and brings a lot of novel thinking to the version control table. We need to work on Darcs because nobody else is doing something like it. If Darcs itself didn't exist, we'd want to create it.
  2. Patch-based version control is the secret ingredient to Darcs' very easy user interface. The message we hear most often from Darcs fans is not about theory, but about how simple and easy it is to use Darcs. Our users don't care about patch theory per se, but it's because we work from patch theory that our user interface can be both gentle and powerful.
  3. Darcs provides a path to the future of version control. Ever wished your version control system thought in terms of abstract syntax trees instead of lines of text? We can't do that yet, but because we understand patches in a way nobody else does, we have a good idea for how to get there.
We use Darcs because we love the UI; we hack Darcs because we love the theory. But then what?

First, we still need to catch up; we're making a lot of progress overcoming performance issues, and usability issues such as our rather appalling conflict marking. Second, we need to solve the interoperability problem. Being a research-oriented version control system with a lot of practical catch-up to do puts us in the minority. We need to make it cheaper and cheaper for people who love Darcs to keep using it, when all their friends are using something else; and we need to make it risk free for somebody to give it a spin and see how they feel about it. Finally, we need to start moving towards that future, start tackling some of the really cool features we have in mind for Darcs 3.

So what has Ganesh been up to? Lots! When not taking care of a new baby, he's been implementing new features like rebase, making conflicts marking easier to use (with patch names!) and exploring a potential "graphictors" conflict representation (because every Darcs hacker needs a conflict representation of his own). More below!

Rebase Design

Thomas, Eric and Ganesh
Darcs rebase is a new feature which may be available in Darcs 2.8 on an experimental basis.

One of the great things about Darcs is that we generally do not need rebase; the combination of a patch theory and a friendly interactive UI makes many complicated rebase/cherry-pick use cases effortless for Darcs users. Theory and UI get us very far, but sometimes that's not enough. Having a rebase operation would make it possible to do things like smoothing away unwanted conflicts, amending depended-upon patches (including mistakes like that 1 GiB file you added one year ago), and consolidating multiple patches into one.


Ganesh gave us a tour of the changes and cleanups to Darcs code needed to make Darcs rebase work. He also brought up a tricky implementation issue about how the rebase "suspended" patches will interact with darcs amend-record, which we pored over together whiteboard markers in hand (this is why we need sprints!). Owen captured some of this discussion so we're hoping to have some nice diagrams explaining the issue in more detail.

Bridge - from Darcs to Git and everything else


Git is a great choice of version control system for many users and projects. Great hosting sites like GitHub, code review tools like Gerrit and graphical interfaces like GitTower add a lot of value to the Git universe and add testimony to the self-reinforcing power of network effects. So what kind of role does Darcs play in an increasingly Git-dominated world? We debated the issue for a while and eventually agreed on a single goal which is to work towards best-effort interoperability between Darcs and other version control systems, but being cautious to avoid tying ourselves tightly to any particular system.

Owen on Day 1
What we envisioned was an advanced bridge based on darcs-fastconvert that would allow us to maintain an incremental bidirectional mapping between repositories in Darcs and another version control system. We want it to be easy, for example, to contribute to a GitHub-hosted project as a Darcs user or, conversely, to maintain a Darcs project while making it easy for Git-using contributors to submit patches and track your upstream work. One exciting feature of the bridge is that it would allow us to smooth the progression from Darcs 1 to Darcs 2 repositories and eventually to Darcs 3.

Owen and Ganesh worked on identifying the technical challenges behind creating such a bridge and fleshing it out into a Google Summer of Code proposal:
  1. Creating a mapping of multi-head repos to Darcs repositories.
  2. Import/Export of foreign patch formats.
  3. Efficiently mapping between patch-based and snapshot-based models.
  4. Robust translation between Darcs versions.
  5. Mapping from “Darcs only” patch-types e.g. replace.
Check out Owen's proposal for more details. Looks like we may have a fun summer ahead of us!

Review backlog

Getting together with a video projector was a good way to catch up on some patch review. Over the three days, Ganesh and Guillaume cleaned out the patch tracker, reviewing patches (6 accepted, 2 rejected, 1 follow-up requested). Thanks, guys!

Development workflow


So if "Adam" submits a patch to the tracker... 
Speaking of the patch tracker, there's been lots of experimentation lately, namely, a "screened" branch and a more relaxed review policy. The overall aim is to reduce administrative overhead: the screened branch reduces the need for developers to maintain long term branches, and the relaxed review policy allows us to keep patches flowing, so that minor patches that don’t actually need any review are not jammed up by bureaucratic process. But change is messy! We now have users who aren't entirely sure how to send patches to Darcs or when they can or should amend their patches.

We spent some time thinking about how to achieve our goal of reforming the review process to accommodate the needs of patch submitters and reviewers, while keeping the overall process simple and lightweight. Our conclusion was to:
  1. preserve the screened/reviewed branch distinction and start explicitly calling the reviewed branch that
  2. simplify the general flow of patches to a linear screened, reviewed, release sequence (except the usual release-specific backports)
  3. eventually point http://darcs.net to the screened branch
  4. shift definitively from an amend-oriented culture to a follow-up oriented one. Once a patch has been accepted to screened, it can no longer be amended
  5. simplify release process by removing the notion of a soft freeze
Eric, Ganesh and Guillaume also spent some time on the developer documentation. We clarified which repository to send patches to (screened), when you can amend patches (only until they've been screened), and how to propose/add a new feature to Darcs (carefully). We also briefly discussed the high-level developer documentation, concluding that we want to move towards it reading more like a book.

Healing paper cuts

Owen and Iago worked on improving Darcs' user interface in some corner cases. Owen studied the infamous "thisrepo" problem. The thisrepo cache entry is a piece of local-filesystem tracking information that causes more recent versions of Darcs to generate an annoying warning when a repository is moved to a new location. Having established that it was safe to do so, he got Darcs to stop producing the entry and to ignore it when it is present. This keeps the warning system useful and relevant by removing false alarms.
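To make the "ignore it when it is present" half of the fix concrete, here is a minimal Python sketch (not the actual Darcs code, which is Haskell). It assumes, for illustration only, that cache sources are stored one per line as kind:location and that the problematic entries start with thisrepo:. The function name usable_cache_sources is hypothetical.

```python
def usable_cache_sources(lines):
    """Return the cache source entries worth consulting.

    Assumption (for illustration): each entry is one line of the form
    "kind:location", and stale local-path entries use the kind "thisrepo".
    """
    kept = []
    for line in lines:
        entry = line.strip()
        if not entry:
            continue  # skip blank lines
        if entry.startswith("thisrepo:"):
            continue  # silently ignore the entry instead of warning about it
        kept.append(entry)
    return kept
```

With this sketch, a sources list containing a thisrepo: line pointing at the repository's old location simply loses that line, while every other entry is kept in order.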

Guillaume and Iago
Iago started out by fixing a deceptively easy bug where darcs amend-record would let you add primitive patches to a tag patch. Darcs tags are implemented as empty named patches that depend on other patches; in theory tags could also contain patches of their own, but this use case is not supported in the implementation. Removing the ability to sneak these patches in avoids potential bugs and usability issues that may arise from it. Iago's work on this bug opened up into a host of related patch name parsing corner cases. Iago spent some time analysing the different possible ways to specify the name of a patch, and how they behave with respect to empty names and names starting with 'TAG'. This took quite a bit of reading through the Darcs.Commands.Record code to figure out how we could effectively fix these problems with minor effort.
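For readers who have not looked at how tags work, the idea can be sketched in a few lines of Python. This is a toy model, not Darcs' real data types: NamedPatch, is_tag and amend are hypothetical names, and treating a name that starts with "TAG " as the marker of a tag is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class NamedPatch:
    """Toy named patch: a name, explicit dependencies, and primitive changes."""
    name: str
    depends_on: list = field(default_factory=list)
    changes: list = field(default_factory=list)

    def is_tag(self):
        # Assumption for this sketch: tags are recognised by their name prefix.
        return self.name.startswith("TAG ")


def amend(patch, new_changes):
    """Return an amended copy of the patch, refusing to add changes to a tag."""
    new_changes = list(new_changes)
    if patch.is_tag() and new_changes:
        raise ValueError("cannot add primitive patches to a tag")
    return NamedPatch(patch.name, list(patch.depends_on),
                      patch.changes + new_changes)
```

A tag in this model is just a NamedPatch with dependencies and an empty list of changes; the guard in amend keeps it that way, which is the behaviour the bug fix restores.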

An interesting recurring theme came up talking about these issues with Owen, Iago and Ganesh, which is that changing Darcs behaviour can sometimes be a delicate affair because we have to take into account (a) repositories produced by older versions of Darcs and (b) how older versions of Darcs will react to repositories produced by more recent ones. It's the sort of issue that makes Darcs a great place for anybody who wants to confront "real world" software issues, where getting things right includes, and goes beyond, simply nailing down the theory.

Garbage collecting pristine

The darcs hashed repository format (one idea we've stolen from Git!) makes for much better robustness, allows for performance improvements such as the global cache and lazy fetching, and opens the door to future work on verifiability (short secure version identifiers). After a lot of performance work on hashed repositories, we've reached a point where we're ready to deprecate the old-fashioned format and get everybody updated to hashed repositories.

Florent formulates a plan
But there's a lingering performance issue in the back of our minds. To avoid race conditions, we have had to disable automatic garbage collection of unused hashed repository entries. Repository owners would have to remember to run an occasional darcs optimize to delete these files. Is there a better way?

Guillaume led a discussion which led to a suggestion by Florent: keep track of pristine root hash timestamps and delete files older than 24 hours. Getting a repository should not take that long, so it should be safe to delete older files!
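Here is a rough Python sketch of what such an age-based collector could look like. It is a simplification under stated assumptions, not the design we settled on in detail: it uses each file's modification time as a stand-in for the tracked root hash timestamps, it takes the set of still-referenced file names as an input, and the names collect_garbage, live_names and DAY_SECONDS are made up for the example.

```python
import os
import time

DAY_SECONDS = 24 * 60 * 60


def collect_garbage(pristine_dir, live_names, now=None, max_age=DAY_SECONDS):
    """Delete unreferenced files in pristine_dir that are older than max_age.

    live_names: names still referenced by the current pristine tree.
    Files younger than max_age are kept even if unreferenced, so that a
    slow "get" started recently can still find them.
    """
    now = time.time() if now is None else now
    removed = []
    for name in sorted(os.listdir(pristine_dir)):
        path = os.path.join(pristine_dir, name)
        if name in live_names or not os.path.isfile(path):
            continue
        # Assumption: file mtime approximates when the entry stopped being current.
        if now - os.path.getmtime(path) > max_age:
            os.remove(path)
            removed.append(name)
    return removed
```

The 24 hour grace period is what makes this safe without locking: an entry is only deleted once any reasonable concurrent operation that might still need it has had time to finish.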

Darcs Testing and Code Quality

Iago gave a couple of nice presentations on work he has been doing for his MSc, in the context of an MFES curricular unit on formal methods: verifying Darcs patch theory properties with the Alloy model checker, and assessing/improving the maintainability of Darcs code. He showed us some rather interesting examples of things that Darcs code does which make testing hard (really long functions seem to be a killer) and discussed improvements he made to our QuickCheck generators. It was great to see him break down our randomly generated tests into categories with varying degrees of meaningfulness and how a few simple tweaks to our generators could make the Darcs tests a lot more useful and informative. Iago will post the slides and his sample Alloy code when he has obtained his qualifications.
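To give a flavour of the generator point, here is a toy Python sketch. It is not Iago's analysis and not our QuickCheck code: it models a test case as a pair of patches that each touch one file, calls a case "same file" when both touch the same one (the only cases where something like commutation is actually exercised in this toy model), and compares a naive generator with one tweaked by a hypothetical reuse_prob parameter.

```python
import random


def gen_pair(rng, n_files=20, reuse_prob=0.0):
    """Generate the files touched by two random patches."""
    first = rng.randrange(n_files)
    if rng.random() < reuse_prob:
        second = first  # tweak: deliberately make the patches interact
    else:
        second = rng.randrange(n_files)
    return first, second


def classify(pair):
    """Label a generated case by how meaningful it is in this toy model."""
    return "same file" if pair[0] == pair[1] else "different files"


def breakdown(rng, samples=1000, **kwargs):
    """Count how many generated cases fall into each category."""
    counts = {"same file": 0, "different files": 0}
    for _ in range(samples):
        counts[classify(gen_pair(rng, **kwargs))] += 1
    return counts
```

Comparing breakdown(random.Random(0)) with breakdown(random.Random(0), reuse_prob=0.5) shows the effect: the naive generator mostly produces pairs that never interact, while the tweaked one spends about half of its cases on the interesting category.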

Darcs needs to work harder on code quality and testing, but what are you going to do with a handful of hobby-hacking hackers doing the best with their free time? Iago suggests candidates for the two lowest hanging fruit to pick: (a) cutting down our function sizes (check out urlThread in our URL module!) and (b) developing a sort of standard test suite template to be used for each Darcs module. He later spent a bit of time in the airport fleshing this out with the Darcs.Util module as an example.

Experience reports

We had two new participants at the sprint, Owen and Iago.  Let's hear it from them. Owen said:
The sprint was a great introduction to the Darcs team, having face-to-face discussions (and being able to quickly answer my beginner-questions) really helped bring me up to speed with not only the code-base but also the underlying concepts and ideas. My primary motivation for coming was to get to know the team and code-base, so that I could make an effective GSoC project proposal, and the sprint certainly helped in that respect. 
Since I had only had limited experience with the code-base prior to the sprint, I didn't actually do much coding (I was primarily focusing on my GSoC ideas), however, at future sprints, I'll definitely be able to code more, having "learnt my way around".
I came away feeling very positive about Darcs, with a strong feeling of wanting to contribute, to bring Darcs to the top of the VCS world. It is obvious to me that people do care about Darcs! 
I think attending a Darcs sprint is a great chance to have a fun weekend coding and talking about a project that is very much alive, with a few beers and a kebab afterwards! :)
And Iago:
It was a very nice experience. Personally, I thought the Sprint would consist of three days of almost full-time work from 9am to 6pm; now I see that the hard work is done during the time between Sprints, whilst Sprints are just time for discussing some project-management stuff, interesting work in progress, ideas, suggestions, etc. Although I started contributing to Darcs a few months before the Sprint, I think that it is a very good opportunity to meet Darcs people and start to contribute to Darcs.
Thanks guys! It was great having you. I hope you can join us for future sprints.

Presentations

  • Ganesh on Future of Darcs
  • Iago on Model Checking Darcs
  • Iago on Darcs Code Quality

Participants 

It's on timer! (Iago, Guillaume, Owen, Ganesh and Eric)
  • Eric Kow
  • Florent Becker
  • Ganesh Sittampalam
  • Guillaume Hoffmann
  • Iago Abal
  • Owen Stephens
Short visits
  • Jérémie
  • Thomas Refis
  • Nicolas Pouillard

Thank-you!

We had a great time in Paris. Many thanks to the Initiative de Recherche et Innovation sur le Logiciel Libre (IRILL) for making such nice facilities available to us with so little fuss!

Thanks also to our donors for making sprints accessible to a wide public, and to the Software Freedom Conservancy for taking on the nitty-gritty administrative detail that makes it possible for us to focus on the core mission of making Darcs rock.

Merci à tous!

Thursday, October 28, 2010

darcs hacking sprint 5 report

The fifth Darcs hacking sprint took place in Orleans over the weekend of 15-17 October.

We seem to be starting a tradition of sprints coinciding with social movements. Last year, the sprint venue in Vienna was squatted by students protesting university fee reforms. This time we were caught in the French pension reform strikes, which knocked out one of our would-be participants and made another lose a day.

The sprint was small but productive. We had four people attending, Florent Becker (also local organizer), Guillaume Hoffmann, Eric Kow and Reinier Lamers.

Talks and discussions

Scribble Scribble Think Think!

Maintaining Darcs

The day before the sprint, Eric gave a talk to undergraduate and masters students on Free and Open Source Software Projects, in particular the principles that we try to apply within the Darcs team.

Therapy session

We kicked off the sprint with a discussion of some of the challenges that we have been facing in the Darcs community and how we hope to rise to them over the long term.

  1. Code quality - darcs code is a gem buried in a big pile of muck. We've been making progress tidying the mess and moving towards a clean, well thought out library... but we still have a long way to go.

  2. Portability - darcs relies on GHC, which takes a long time to build and which simply does not support certain niche platforms.

  3. Usability - for all its power, Darcs has a reputation among its fans for being exceptionally easy to use. While we can be proud of our friendly UI and simple mental model, we need to also recognise the parts of Darcs that make life difficult. Here are three areas we should explore:

    • Patch annotations would allow Darcs to start tracking a repository history while still allowing for patch reordering. People should be able to ask questions like "who signed off on this patch?", and "when was it pulled into the current repository?"

    • Short version identifiers would make it easier for users to communicate with each other - "Bob, could you please fetch version 83dc9fa3?"

    • Better conflict marking and summaries would help maintainers to merge sets of patches from long term branches.

  4. Network effects - Darcs is most useful when many other people are also using Darcs or something compatible.

    • Unlike Git/Hg/Bzr, Darcs lacks bridges with the other DVCSes. The other three can more or less talk to each other because they have similar models. Darcs is a bit different, so making good bridges can be tricky.

    • The Darcs community lacks services that facilitate collaboration to the same extent that Github does. We need patch-tag, darcsden and friends to get better still!

    • One piece of low-hanging fruit to pluck is the ability to host Darcs repositories on servers that lack a Darcs binary. How do we push patches over SFTP without the luxury of a remote Darcs binary?

File formats

We continued the discussion from the darcs-users mailing list on machine-readable formats for the Darcs data and command output. This discussion was tricky because it involves many different parts of Darcs and involves juggling some conflicting goals:

  • Standards compliance: preference for well-documented and understood standards.
  • Familiarity: attention to de-facto standards used in the revision control community.
  • Conceptual integrity: all Darcs formats should work the same way
  • Extensibility: ability to accommodate future requirements
  • Easy parsing: people should be able to whip up quick little scripts to slice and dice Darcs output
  • Agnosticism: need a good story for arbitrary bytes because Darcs is fairly agnostic to the content of text files
  • Transparency: no complicated escaping mechanisms as these tend to fall in rare corner cases and would be easy to get wrong.
  • Conservatism: the less we change Darcs, the better

Reinier had the idea of bringing this discussion to a whiteboard -- this is why you need hackathons! -- which allowed us to take a more global view of the problem. After much discussion, we reached the conclusion that we should converge on 4 formats:

  1. Unified diff format [SAME] (familiarity, conservatism) - there's no reason to move away from diff/patch style output for low-level diffs.

  2. High level patch format [SAME] (conceptual integrity, agnosticism, conservatism) - this is a high-level representation of patches which is unique to Darcs. For instance, it can describe file renames and word replacement. We plan to continue using this format whenever we need to represent high-level patch contents.

  3. Line-separated annotate format [NEW] (easy parsing, agnosticism, transparency) - We will deprecate the annotate --xml format, and shift to a line-based one. If there are community standards that exist we'll try to use them as much as possible.

  4. Hashed context file format [NEW] (agnosticism, transparency, conceptual integrity) - we will deprecate the changes --xml format and converge to an extended version of the context file format. New features:

    • file contents hashes (issue1550)
    • format version information
    • is-context-file flag (need deps to be safe to use)

Note that where forced to choose, we have essentially sacrificed the otherwise worthy goals of standards compliance and extensibility.

Hacking

Reinier and Guillaume hacking away

Darcs 2.5

Darcs 2.5 is almost here! The release was delayed for quality control reasons, but after many betas and bug fixes, we think we're ready to ship. Reinier put the finishing touches on our first release candidate.

Infrastructure

Eric made a handful of improvements to the issue tracking infrastructure, improving integration with our darcs repository and darcswatch.

User interface

Eric and Reinier polished off some user interface work:

  • Removed a confirmation prompt asking you if you really want to record your patch when you choose to edit long comment but make no changes. (undo beats confirmation).
  • Tested a UI regression fix by Adolfo: Darcs was overzealous in warning about unreachable cache entries.
  • Improved checking of commands that work on file paths

Pristine cache handling

Guillaume documented much of Darcs' pristine cache handling, fixing a darcs repair bug along the way. He:

  1. Studied the problem of garbage collecting the Darcs pristine cache http://wiki.darcs.net/Using/GrowingPristineProblem
  2. Fixed darcs repair when pristine cache is missing.
  3. Designed some improvements to darcs get handling of missing pristine cache items. http://bugs.darcs.net/issue1976

No working directory: towards passive repositories

We want to make it as easy as possible for people to use and host Darcs repositories. In particular, we think it would be great if you could host a Darcs repository on any server, without caring if Darcs is installed there or not. While it is already possible to fetch Darcs repositories from such a server, what we now need is the ability to push to such repositories without a remote copy of Darcs.

Working in this direction, Florent implemented a long-requested feature for repositories without a working directory. This is useful for repositories which are only meant to be used for pushing/pulling, where the notion of a working directory is superfluous and makes some Darcs operations harder to implement.

Faster annotate (we'll get there!)

Unfortunately, Benedikt could not join us for the sprint as travel from Zurich to Orleans was disrupted by strikes. Luckily, he was still able to participate over IRC. He ported over his work on the "patch index" optimisation to the latest version of the Darcs code in progress (that's a 6 month leap!) and will continue by exposing the patch index to Darcs commands.

Experience report

Guillaume (right) with a question for Reinier

Our Darcs Weekly News editor attended his first sprint 6 months ago in Zurich, starting work on some ProbablyEasy bugs. It was great to see him again and very encouraging to see how much deeper he was getting into Darcs internals. Let's hear it from Guillaume:

I arrived at the sprint with this bug report in mind, written by a NetBSD user who could not build Darcs on his system. I wondered how easy could it be to write a minimal Darcs client that could only fetch a working copy from a Darcs repository, in a programming language more common than Haskell (Python comes to mind).

Thus began my discovery of the hashed repository format. The most surprising thing I discovered was the lack of documentation: currently someone who wants to write a Darcs client can only count on the existing source code. So I started to document what I understood by asking the other sprinters and looking at the code.

I also documented the Growing Pristine Problem as it was cited as being a low point of the hashed repository format with regards to the old-fashioned format. After understanding why this phenomenon happens, I believe that this is an unavoidable issue when one wants to avoid breakage during simultaneous pushing and getting the same repository. Also, it becomes a problem only in really big repositories.

However, some parts of Darcs could be improved. Darcs could do a better job of handling its pristine.hashed files. For instance, as of now, deleting the pristine.hashed directory leads to an almost dead-end situation since "darcs repair" refuses to work unless a dummy pristine.hashed directory is created. I sent a test case and a fix for this problem.

Missing pristine files are generally not handled graciously by Darcs while their presence is not necessary (albeit very important for speed). As of now, "darcs get" refuses to work when a pristine file is missing, and this has already bit me in the past. I proposed an enhancement of this behaviour. Other local commands that use the pristine files fail if one file is missing, but never tell the user to run darcs repair. I will probably work on these two proposals soon.

The aim is to make Darcs as robust as possible with its current format, and above all to prevent users from being exposed to unhelpful error messages.

Thanks

Thanks to Florent and to the rest of the laboratoire LIFO for hosting the Darcs team this weekend! Hosting sprints is an excellent way to support and to interact with the Darcs community.

The obligatory Jeanne D'arc statue photo

A special shout-out also goes to Yannick Parmentier, a LIFO researcher (and coincidentally Eric's former office mate) who very kindly visited us to take photos and shuttle us back and forth between Orleans and the lab. Merci, Yannick!


See you next time!

This was a really fun sprint. We hope you can join us next time, hopefully in 6 or so months. In the meantime, check out the flickr tag darcs-2010-10 for more photos from the sprint.
