News and discussions
- Alexey Levan is our new Windows packager, and has put online Windows binaries of Darcs 2.5:
- We posted two roadmap blogs about Darcs 2.8:
Darcs 2.5 was released!
The report from the Orleans Sprint was posted:
We seem to be starting a tradition of sprints coinciding with social movements. Last year, the sprint venue in Vienna was squatted by students protesting university fee reforms. This time we were caught in the French pension reform strikes, which knocked out one of our would-be participants and made another lose a day.
The sprint was small but productive. Four people attended: Florent Becker (also the local organizer), Guillaume Hoffmann, Eric Kow and Reinier Lamers.
The day before the sprint, Eric gave a talk to undergraduate and masters students on Free and Open Source Software Projects, in particular the principles that we try to apply within the Darcs team.
Code quality - darcs code is a gem buried in a big pile of muck. We've been making progress tidying the mess and moving towards a clean, well thought out library... but we still have a long way to go.
Portability - darcs relies on GHC, which takes a long time to build and which simply does not support certain niche platforms.
Usability - for all its power, Darcs has a reputation among its fans for being exceptionally easy to use. While we can be proud of our friendly UI and simple mental model, we need to also recognise the parts of Darcs that make life difficult. Here are three areas we should explore:
Patch annotations would allow Darcs to start tracking a repository history while still allowing for patch reordering. People should be able to ask questions like "who signed off on this patch?" and "when was it pulled into the current repository?"
Short version identifiers would make it easier for users to communicate with each other - "Bob, could you please fetch version 83dc9fa3?"
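To make the short identifier idea concrete, here is a minimal sketch of how a short identifier could be resolved to a full one. It assumes identifiers are hexadecimal strings and that a short identifier is simply a prefix; the function name and the sample identifiers are invented for illustration and are not part of Darcs.

```python
def resolve_short_id(prefix, full_ids):
    """Return the single full identifier that starts with `prefix`.

    Raises LookupError when the prefix matches nothing or is ambiguous.
    """
    wanted = prefix.lower()
    matches = [full for full in full_ids if full.startswith(wanted)]
    if not matches:
        raise LookupError(f"no version matches {prefix!r}")
    if len(matches) > 1:
        raise LookupError(f"{prefix!r} is ambiguous ({len(matches)} matches)")
    return matches[0]


# Hypothetical identifiers, made up for this example.
known_ids = [
    "83dc9fa3" + "0" * 32,
    "83dd0000" + "1" * 32,
    "a1b2c3d4" + "2" * 32,
]

full_id = resolve_short_id("83dc9fa3", known_ids)
```

The ambiguity check matters: a prefix such as "83d" matches two of the sample identifiers, so a tool would have to ask for more characters rather than pick one.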
Network effects - Darcs is most useful when many other people are also using Darcs or something compatible.
Unlike Git/Hg/Bzr, Darcs lacks bridges with the other DVCSes. The other three can more or less talk to each other because they have similar models. Darcs is a bit different, so making good bridges can be tricky.
The Darcs community lacks services that facilitate collaboration to the same extent that Github does. We need patch-tag, darcsden and friends to get better still!
One piece of low-hanging fruit to pluck is the ability to host Darcs repositories on servers that lack a Darcs binary. How do we push patches over SFTP without the luxury of a remote Darcs binary?
We continued the discussion from the darcs-users mailing list on machine-readable formats for the Darcs data and command output. This discussion was tricky because it involves many different parts of Darcs and involves juggling some conflicting goals:
Reinier had the idea of bringing this discussion to a whiteboard -- this is why you need hackathons! -- which allowed us to take a more global view of the problem. After much discussion, we reached the conclusion that we should converge on 4 formats:
Unified diff format [SAME] (familiarity, conservatism) - there's no reason to move away from diff/patch style output for low-level diffs.
High level patch format [SAME] (conceptual integrity, agnosticism, conservatism) - this is a high-level representation of patches which is unique to Darcs. For instance, it can describe file renames and word replacement. We plan to continue using this format whenever we need to represent high-level patch contents.
Line-separated annotate format [NEW] (easy parsing, agnosticism, transparency) - We will deprecate the annotate --xml format, and shift to a line-based one. If there are community standards that exist we'll try to use them as much as possible.
Hashed context file format [NEW] (agnosticism, transparency, conceptual integrity) - we will deprecate the changes --xml format and converge to an extended version of the context file format. New features:
Note that where forced to choose, we have essentially sacrificed the otherwise worthy goals of standards compliance and extensibility.
Darcs 2.5 is almost here! The release was delayed for quality control reasons, but after many betas and bug fixes, we think we're ready to ship. Reinier put the finishing touches on our first release candidate.
Eric made a handful of improvements to the issue tracking infrastructure, improving integration with our darcs repository and darcswatch.
Eric and Reinier polished off some user interface work:
Guillaume documented much of Darcs pristine cache handling, fixing a darcs repair bug along the way. He
We want to make it as easy as possible for people to use and host Darcs repositories. In particular, we think it would be great if you could host a Darcs repository on any server, without caring if Darcs is installed there or not. While it is already possible to fetch Darcs repositories from such servers, what we now need is the ability to push to such repositories without a remote copy of Darcs.
Working in this direction, Florent implemented a long-requested feature for repositories without a working directory. This is useful for repositories which are only meant to be used for pushing/pulling, where the notion of a working directory is superfluous and makes some Darcs operations harder to implement.
Unfortunately, Benedikt could not join us for the sprint as travel from Zurich to Orleans was disrupted by strikes. Luckily, he was still able to participate over IRC. He ported over his work on the "patch index" optimisation to the latest version of the Darcs code in progress (that's a 6 month leap!) and will continue by exposing the patch index to Darcs commands.
Guillaume (right) with a question for Reinier
Our Darcs Weekly News editor attended his first sprint 6 months ago in Zurich, starting work on some ProbablyEasy bugs. It was great to see him again and very encouraging to see how much deeper he was getting into Darcs internals. Let's hear it from Guillaume:
I arrived at the sprint with this bug report in mind, written by a NetBSD user who could not build Darcs on his system. I wondered how easy it could be to write a minimal Darcs client that could only fetch a working copy from a Darcs repository, in a programming language more common than Haskell (Python comes to mind).
Thus began my discovery of the hashed repository format. The most surprising thing I discovered was the lack of documentation: currently someone who wants to write a Darcs client can only count on the existing source code. So I started to document what I understood by asking the other sprinters and looking at the code.
I also documented the Growing Pristine Problem, as it was cited as a low point of the hashed repository format compared with the old-fashioned format. After understanding why this phenomenon happens, I believe it is unavoidable if one wants to avoid breakage when the same repository is pushed to and fetched from simultaneously. Also, it becomes a problem only in really big repositories.
However, some parts of Darcs could be improved. Darcs could do a better job of handling its pristine.hashed files. For instance, as of now, deleting the pristine.hashed directory leads to an almost dead-end situation since "darcs repair" refuses to work unless a dummy pristine.hashed directory is created. I sent a test case and a fix for this problem.
Missing pristine files are generally not handled gracefully by Darcs, even though their presence is not necessary (albeit very important for speed). As of now, "darcs get" refuses to work when a pristine file is missing, and this has already bitten me in the past. I proposed an enhancement of this behaviour. Other local commands that use the pristine files fail if one file is missing, but never tell the user to run darcs repair. I will probably work on these two proposals soon.
The aim is to make Darcs as robust as possible with its current format, and above all to prevent users from being exposed to unhelpful error messages.
The obligatory Jeanne d'Arc statue photo
A special shout-out also goes to Yannick Parmentier, a LIFO researcher (and coincidentally Eric's former office mate) who very kindly visited us to take photos and shuttle us back and forth between Orleans and the lab. Merci, Yannick!
Reinier released release candidate 1 and packager's preview 1 of Darcs 2.5:
Florent and Ganesh volunteered to be the next co-release managers:
Finally, the Darcs Hacking Sprint took place last week in Orleans. We will soon post a report.
Ganesh proposed a simple change in conflict marking:
The committers of the Darcs team now use an additional public branch for sharing unreviewed or lightly reviewed work in progress:
Darcs 2.5 beta 5 was released:
Eric put online a series of videos (~1h) explaining the state of darcs in 2010 (adapted from his recent talk at AngloHaskell 2010):
Petr released darcs-fastconvert, a tool that helps convert darcs and git repositories in both directions:
Jason asked for feedback about his proposal for darcs to uniquely identify the state of a repository by a hash:
Petr kicked off his ``adventure'' branch, a long-lived public branch of darcs he will use to make disruptive changes in the codebase:
Ganesh proposed using a ``submitted'' public branch to help developers share their work and deal with the reviewing process more smoothly:
Darcs 2.5 beta 4 was released:
Ganesh put online his branch of darcs containing the ``rebase'' command for all to try. Simon Marlow gave feedback on this feature:
Joachim Breitner uploaded ipatch to Hackage:
Darcs 2.5 beta 3 was released:
The next Darcs Hacking Sprint will take place in Orleans, France, on 15-17 October:
Version 0.1.9 of darcs-benchmark was released, including bugfixes and experimental support for comparison with git and mercurial:
Adolfo Builes blogged about the end of his summer of code project. A summary of his project is available on the wiki:
darcs 2.5 beta 2 was released. Give it a try!
Reinier explained why beta 3 will have to wait a little:
Joachim "nomeata" Breitner released ipatch, a tool based on darcs' hunk editing feature:
Two more blog posts from our Summer of Code students Adolfo and Alexey:
``darcs stash'' was discussed, with different possible implementations and UI proposed, and some example workflows evoked:
Reinier listed the release blockers for darcs 2.5:
Reinier announced the first beta of Darcs 2.5:
The next darcs sprint will happen in October in Orleans, France:
Reinier announced the release schedule of Darcs 2.5 (soft freeze July 8th, release August 7th):
Reinier also issued a call for volunteers to fix unassigned bugs that should be fixed for the next release:
Eric explained how to make the patch reviewing process more efficient:
And the Summer of Code blog posts of the last two weeks:
Lele Gaifax released a new version of the trac+darcs plugin:
We are still looking for a release manager for the release of darcs 2.5. Eric summarized the discussions concerning recruitment and the responsibilities and challenges related to this job:
Roadmap: rebase won't be in 2.5, annotate will be improved:
An alpha release of darcs 2.5 might happen soon:
Summer of Code: Alexey Levan sent a first version of patches for darcs optimize --http, and Adolfo Builes sent a patch fixing a bug concerning cache pool choice by darcs:
Darcs 2.4.4 was released this week:
Eric talked about the ongoing work in fixing the bug about non-ASCII filenames:
Adolfo wrote his first Google Summer of Code blog report:
Unexpectedly, we got a second student funded by the Google Summer of Code this year. Adolfo Builes will work on improving the global cache:
Darcs 2.4.4 is going to be released in one week if no bug is discovered in its current source (most importantly under Windows):
We are looking for a new release manager:
Eric announced the release of darcs 2.4.3, which fixes critical bugs under Windows and fixes the performance regression of darcs convert:
Simon Michael proposed a cleanup of the repository format names and gathered a few answers and proposals:
darcs 2.4.1 was released, fixing a couple of bugs in version 2.4.0. However, a serious bug under Windows was discovered, so Windows users should stick to version 2.3.1 until 2.4.2 is out:
Three students wrote their applications for this year's Summer of Code. Project members discussed the priorities of darcs: network speed, local speed or UI?
Lennart Kolmodin announced that darcs 2.4 was available in Gentoo Linux:
Mark Stosberg talked about how darcs handles large repositories, which sparked a discussion on git vs darcs from the UI point of view:
If you haven't already read it on this blog, you can have a look at Eric's report from the last darcs hacking sprint:
In this sprint, we worked on finishing some performance work for the upcoming Darcs 2.5 release this summer (hashed storage, patch index, global caches, inventory hashing); planning our work for the Darcs 2.6 release next year (smart servers, cache cleanup, darcs rebase) and working with new users of the Darcs library.
We're always happy to work with new Darcs developers. At this sprint, we were joined by four new contributors.
Guillaume has been writing our Darcs Weekly News articles for a year now. Over the weekend he got his first taste of Darcs hacking, knocking out three ProbablyEasy bugs (darcs dist internals, darcs send -o UI, darcs apply with gzipped patch bundles). Guillaume reports that he can see himself doing more of this in the future!
Steven worked on a new feature to display the file content hashes associated with any patch. This makes it easier for third party tools to inspect the patch files behind Darcs.
Stefan and David mostly worked on the Darcs Patch Manager, but to warm up, they tackled a couple of ProbablyEasy bugs, particularly a bug in darcs annotate that was affecting Redmine.
Salvatore tracked down the Windows regression in 2.4 that made Darcs fail on Windows shares.
Benedikt Schmidt continued his work on the patch index (formerly known as the filecache). The patch index keeps track of which patches affect which files. This index will bring a big boost to darcs annotate performance, particularly for files which are affected by a relatively small number of patches.
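As a rough illustration of why such an index helps, the sketch below builds a mapping from file paths to the patches that touch them. It is a toy model, not the Darcs data structure: the function name, the patch names and the input shape are invented here, and it ignores renames, which a real index has to handle.

```python
from collections import defaultdict


def build_patch_index(patches):
    """Map each file path to the patches that touch it.

    `patches` is an iterable of (patch_id, touched_files) pairs in
    repository order; the order is preserved within each file's list.
    """
    index = defaultdict(list)
    for patch_id, touched_files in patches:
        for path in touched_files:
            index[path].append(patch_id)
    return dict(index)


# Hypothetical history, made up for this example.
history = [
    ("p1", ["README", "src/Main.hs"]),
    ("p2", ["src/Main.hs"]),
    ("p3", ["doc/manual.txt"]),
    ("p4", ["README"]),
]

index = build_patch_index(history)
readme_patches = index["README"]
```

With the index in hand, annotating README only needs to look at the two patches listed for it instead of walking the whole history.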
Luca continued his work on breaking up the global cache ($HOME/.darcs/cache) into buckets for faster access. Working with Reinier and Petr, Luca has developed an approach to migrating from old style caches to the new style bucketed ones. He has also improved the implementation to use hard links, to avoid disk space doubling and to preserve backwards compatibility with prior versions of Darcs.
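The migration idea can be sketched as follows. This is only an illustration under stated assumptions: the bucket name is taken to be the first two characters of the file hash, and the function names are invented; the actual layout chosen by Luca may differ. The hard link means the old flat entry and the new bucketed entry share the same data on disk, so older Darcs versions can still find the file and no space is doubled.

```python
import os


def bucket_path(cache_root, file_hash):
    """Bucketed location (assumed layout): <cache_root>/<first two chars>/<hash>."""
    return os.path.join(cache_root, file_hash[:2], file_hash)


def migrate_entry(cache_root, file_hash):
    """Hard-link a flat cache entry into its bucket.

    The flat entry is left in place, so tools that expect the old
    layout keep working. Returns the bucketed path.
    """
    old_path = os.path.join(cache_root, file_hash)
    new_path = bucket_path(cache_root, file_hash)
    os.makedirs(os.path.dirname(new_path), exist_ok=True)
    if not os.path.exists(new_path):
        os.link(old_path, new_path)  # same inode, no extra disk space
    return new_path
```

Calling migrate_entry twice for the same hash is harmless, which is convenient if a migration is interrupted and restarted.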
Salvatore put together a nice Windows installer using the bamse package. It looks like we will be able to use this for the planned Darcs 2.5 release this summer. This work will also open the door to nicer integration with Windows tools, for example, using a bundled Tortoise SSH for a better experience working with SSH passphrases.
Florent improved the quality of the Darcs cherry picking code, making it easier to fine tune our user interface and some day support graphical interfaces via the Darcs library. Witnessed list zippers for the win?
Florent also started work on adding Darcs's interactive cherry picking to darcs diff, making it possible to choose a set of patches to view as a diff.
Darcs has a representation of file and directory trees called slurpies. Petr polished off his work to replace the slurpies with his more efficient, general purpose hashed-storage library. Slurpies are going away, and Darcs will be faster for it. He and Ganesh also discussed how to gracefully transition from repositories created before the hashed-storage refactor.
Petr ported work by David Roundy to solve a scalability regression in hashed repositories. For darcs commands that write out patches, we had a naive hashing operation that did not account for the fact that patches behind tags cannot be modified. Darcs was unnecessarily traversing the entire sequence of patches (i.e. O(n) time) when it could easily have been just traversing the sequence since the last tag.
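The saving can be pictured with a small sketch. It models the history as a flat list of (name, is_tag) pairs, which is a simplification invented for this example and not how Darcs stores inventories. Scanning from the newest patch backwards and stopping at the first tag visits only the patches that can still change.

```python
def patches_since_last_tag(patches):
    """Return the names of patches recorded after the most recent tag.

    `patches` is a list of (name, is_tag) pairs, oldest first. The scan
    starts at the newest patch and stops at the first tag it meets, so
    its cost depends on the distance to that tag, not on the full history.
    """
    recent = []
    for name, is_tag in reversed(patches):
        if is_tag:
            break
        recent.append(name)
    recent.reverse()
    return recent


# Hypothetical history, made up for this example.
history = [
    ("initial import", False),
    ("TAG 1.0", True),
    ("fix typo", False),
    ("add feature", False),
]

to_rehash = patches_since_last_tag(history)
```

Only the patches in to_rehash need to be considered when writing out hashes; everything at or before the tag is treated as fixed.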
Reinier continued to improve the encoding of Darcs patch metadata. Darcs is completely agnostic with respect to the encoding of your files. Unfortunately, this agnosticism extends to patch metadata (patch name, patch author), making it difficult for people to collaborate across different locales. To address this problem, Reinier has been working to make Darcs store its patch metadata in a single encoding (UTF-8) while gracefully supporting older patches (with metadata in potentially any encoding).
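One common way to read metadata of unknown encoding is to try UTF-8 first and fall back to a single-byte encoding that can never fail. The sketch below shows that general pattern; the choice of Latin-1 as the fallback and the function name are assumptions made for this example, not a description of what Reinier implemented.

```python
def decode_metadata(raw):
    """Decode patch metadata bytes to text.

    New-style metadata is expected to be UTF-8. Older metadata may be in
    any encoding; as an assumed fallback, bytes that are not valid UTF-8
    are decoded as Latin-1, which maps every byte to some character.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("latin-1")


new_style = decode_metadata("Zürich".encode("utf-8"))
old_style = decode_metadata("Zürich".encode("latin-1"))
```

The fallback guarantees that old patches remain readable, at the cost of possibly showing the wrong characters when the original encoding was something other than Latin-1.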
The Darcs 2.4 release was quite a tricky one to navigate. We found that bugs were only being flushed out on release candidate time and sometimes after the release proper.
We would like to encourage more people to try out Darcs work in progress and give us feedback early in the release process. After chatting about this, Reinier (with Ganesh, Eric and Petr) decided that as Release Manager, he would put out a Darcs alpha every 4 weeks.
In the future we may investigate automatic nightly builds via the buildbot and a platform support policy such as the one used by Tahoe.
Benedikt updated us on the recent status of his ongoing patch index work (formerly known as the filecache). We discussed the things that make the patch index convincing (permanent, repo-local, unique identifiers for files), the interaction between the patch index and the type witnesses, and also ways of tuning the patch index performance and keeping it small.
We're looking forward to sharing the new patch index optimisation with you in upcoming releases. Darcs annotate may become a lot more useful in the next couple of releases!
Fast darcs annotate won't be useful if nobody can read it. Benedikt and Eric worked on designing a better output format for darcs annotate. Taking a page from git blame, there will be one line per source file line, with columns for patch identifier, author name, date and finally the line. One of the design questions was how we should best refer to darcs patches, the current best candidate being a prefix of the darcs patch metadata hash.
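A line-based format along these lines is easy to produce and to parse. The sketch below is purely illustrative: the " | " separator, the 8-character identifier length and both function names are assumptions made for this example, since the final format had not been settled.

```python
SEPARATOR = " | "  # assumed column separator, not a Darcs decision


def format_annotate_line(patch_hash, author, date, text, id_length=8):
    """Build one annotate line: short patch id, author, date, source line."""
    return SEPARATOR.join([patch_hash[:id_length], author, date, text])


def parse_annotate_line(line):
    """Split an annotate line back into its four columns.

    Only the first three separators are split on, so the source text
    itself may contain the separator.
    """
    patch_id, author, date, text = line.split(SEPARATOR, 3)
    return {"patch": patch_id, "author": author, "date": date, "text": text}


# Hypothetical values, made up for this example.
line = format_annotate_line(
    "83dc9fa3deadbeef", "Alice <alice@example.org>", "2010-03-16", "x = a | b"
)
record = parse_annotate_line(line)
```

Putting the source line last is what keeps parsing simple: the fixed columns come first, and whatever remains is the file content.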
Darcs get over networks is slow, painfully slow. Petr has suggested two priorities for improving the performance of network operations. The first would be to introduce a darcs optimize --http feature which would optimise the Darcs repository for fetching over a network (for example, by creating a "snapshot" of the pristine cache to be fetched in one go). The second priority would be to develop a smart server that would provide darcs clients with only the files they need and in the optimal number of chunks. The two ideas combined would make an excellent Google Summer of Code project.
Prior to the sprint, Ganesh was working on a darcs rebase feature. Rebase will help Darcs users work with long term branches, and other cases where patch commutation by itself is not enough. At the sprint, Ganesh explained his work to everyone interested. Together we settled on a rough plan for the user interface. It looks like our new rebase command will offer a typically Darcs-ish twist: interactive cherry picking.
Ganesh and Florent talked with three teams building software in the Darcs ecosystem (DPM: Stefan Wehr and David Leuschner, Mac Darcs record GUI: Benedikt Huber and David Markvica, DarcsDen: Alex Suraci). There was a surprising degree of commonality.
The conversations have given us a much stronger sense of direction with the Darcs library. In particular, Ganesh is convinced that we should commit to our use of witnesses - at the very least getting them completely finished so we can run with them, probably turning them on by default, and quite possibly dropping the non-witnesses builds.
We held a quick roundtable discussion to settle some decisions on Darcs default switches that have been hanging in the air. Our decisions for Darcs 2.5:
Petr and Benedikt gave lightning talks, showing some of our recent performance work to the Haskell community. Some exciting numbers from Benedikt's work (notes) include a 6 second darcs annotate on a file in the GHC repository (previously this did not complete within half an hour).
We discussed our priorities for this year's Google Summer of Code. We have decided that we would focus our attention on performance issues. If we had two GSoC students this year, we would be mainly interested in dividing them between
developing a smart server for much faster darcs get and pull over a network
performing a comprehensive overhaul of the Darcs hashed file cache handling
We also discussed ways to make the best use of our students' time. The Darcs team has participated in GSoC twice and has learned a lot from the experience. This year we would like to see if we could publish some clear guidelines both on what we expect from GSoC students and what they can expect from us. Watch the mailing list for more discussion on this topic.
We were pleasantly surprised to find ourselves with users of the (still unstable) Darcs API. These new arrivals give us the feeling that the collection of related software is coalescing into a new Darcs ecosystem.
David Leuschner and Stefan Wehr worked on an exciting new patch management program for project maintainers. The Darcs Patch Manager (DPM) offers a new way for repository maintainers to keep track of incoming Darcs patches, including their amendments and dependencies.
$ dpm -r MAIN_REPO -s DPM_DB list
very cool feature [State: OPEN]
2481 Tue Mar 16 17:50:23 2010 Dave Devloper <email@example.com>
State: UNDECIDED, Reviewed: no
7861 Tue Mar 16 17:20:45 2010 Dave Devloper <firstname.lastname@example.org>
State: REJECTED, Reviewed: yes
marked as rejected: one minor bug
some other patch [State: OPEN]
7631 Tue Mar 16 13:15:20 2010 Eric E. <email@example.com>
State: REJECTED, Reviewed: yes
Towards the end of the hackathon, Stefan gave a nice short demo of DPM in action and deftly avoided the wrath of the demo Gods.
Benedikt Huber and David Markvica started work on a graphical interface to the Darcs record command. One key twist is that they make use of the Darcs API to get the kind of dependency-tracking interactive goodness that Darcs offers. Benedikt and David report that they have spent most of the hackathon getting to grips with the library. Darcs type witnesses were very helpful for avoiding errors, but they also impose a steep learning curve.
Alex Suraci and Simon Michael made several improvements to Darcsden, an open source hosting solution (akin to Github and Patch-tag). Some recent changes were Atom feeds, the ability to view forks of your repository and cherry-pick patches from them (work in progress). Darcsden also makes use of the Darcs API.
The Darcs Team would like to hold hacking sprints twice a year. These sprints are an important occasion for us to hold design discussions, hack some code, train new Darcs hackers and generally bond as a team.
Do you think you can help? Please get in touch with me if you think you may be able to host a group of around 20 Darcs hackers one of these October or November weekends.
Getting over 75 Haskell hackers into Zürich and having them up and running on arrival (Swiss power plugs notwithstanding) was no easy task! We'd like to thank Johan Tibell, David Anderson and the rest of the Google Crew for their hard work organising this hackathon.
Thanks also to the generous donors who chipped into our 2010 Darcs Travel Fund. We'll be looking forward to using the leftover cash for the upcoming 5th Darcs Hacking Sprint in October or November.
Speaking of donors, we'd particularly like to thank the Software Freedom Conservancy for providing us with the infrastructure (both legal and technical) for accepting donations and holding assets such as the darcs.net domain. Meta projects like the SFC are crucial for the success of volunteer-driven open source projects such as Darcs.
Finally here are some words from happy Darcs hackers:
The sprint was a wonderful social occasion, and it was great meeting most of the Darcs hackers, and also seeing other Haskell hackers interested in working in the Darcs ecosystem. I especially enjoyed teaching them how to use our API. -- Florent
The atmosphere was wonderful and I consider the sprint to have been very productive overall. -- Petr
This is coolest thing I ever did -- Luca
See you in half a year!
We had ten Darcs hackers in Zürich along with four Haskellers using the Darcs API to do awesome things (plus two more on IRC).