Two weeks ago the Free Software Foundation announced a review of their High Priority Projects list in which the FSF is asking for input from the public. Per the request, there are a number of projects that I feel “are important for increasing the adoption and use of free software and free software operating systems”, which I’ll be discussing here. Along with those, there are a number of other projects I’ve been thinking of over the last couple of years that don’t fit into the free software category per se, but would be societally beneficial nonetheless.
As well as providing input for the FSF’s HPP list, the list of projects here serves as a record of items I would work on given enough time. Since I probably won’t be able to work on all of them, it seems best if I just publish the list, both so that people can think about projects new to them or in new ways, and so that others can see reasons to work on important projects and hopefully make some of these dreams a reality.
I’ll start with the projects that I feel are important for free software and then add additional projects I feel are more generally important below that. They are ordered from highest to lowest priority (in my opinion), though the order in which I’ve chosen to work on them differs since I’ve factored in my own qualifications and resource constraints (you’ll see I’ve started working on #3 first). Each section could (and perhaps should) constitute a blog post in itself, so I apologize for both the length of this post and for the brevity of each section. I would appreciate it if people could comment on whether any specific sections should be explained more fully.
Among the free software projects I mention below are a free software laptop, a free software cell phone, free software multi-platform chat with PSTN/SMS federation, and a simple data replication and retrieval API for hosting. After discussing those, I briefly describe why Gnash is no longer needed before moving on to the additional privacy-promoting projects that are less directly related to free software. These are anonymous low-cost online micropayments, widely-supported SLAs for dedicated channels and/or DiffServ, and a cell network without tracking. Wrapping it up are some closing thoughts.
1. Free software laptop
There is currently no good laptop that one can buy preinstalled with a free software operating system. Attempts have been made, but usually the laptop falls short in key areas, which puts it at a competitive disadvantage: it is too heavy, the battery life is too short, there are suspend/resume issues, and/or the wifi connection is flaky. It is important for the free software movement to have a laptop one can recommend that is usable yet doesn’t compromise on the free software ideals. This is possible, but hasn’t been done yet.
Ideally, we would have a brick and mortar store where people could try out this free software laptop and see other RYF products as well. This would provide additional legitimacy to the movement, as it shows we are investing in making the software (and associated hardware) accessible to people. Probably a flagship store in one or two large cities would be best for this, as there likely wouldn’t be much funding available initially. Eventually, as the laptop becomes more popular, it’s possible that we could start specifying features to the hardware manufacturers, since the volume we order would grow to the point where they would pay attention. This would be very helpful as it would allow us to pick features that further the free software mission and, perhaps more importantly, may allow us to specify features in other products of interest, such as cell phones.
2. Free software cell phone
Related to the free software laptop, and nearly as important (perhaps more important for the general population), is having a free software cell phone. (To achieve more general appeal, beyond free software hackers, it would likely need to be a smartphone.) As far as I’m aware, a self-contained free software cell phone does not currently exist. This is mostly because of the baseband modem found in all cell phones, which almost always runs non-free software (though it is not legally required to, in the US at least). Without the baseband, the cell phone cannot act as a cell phone.
The only phones I’m aware of that have a free software baseband available to them are those that osmocomBB supports. However, in nearly all cases, a separate “host” device (such as a laptop) is required to do some of the processing, clearly not an ideal situation for a pocketable device. In the one case where it is not, the ability to receive phone calls is severely impaired.
I can understand why the FSF has not contributed much, if any, of its resources to such a project, as cell phones are intrinsically dangerous devices to use (more on that below). It is a recurring theme in talks by FSF’s president, Richard Stallman, that cell phones track you and allow the cell phone company and often the government to know where you are at all times, which is why Stallman himself doesn’t use one. That is a reasonable approach, but not one that most people are willing to live with, including almost all free software developers I know (there is perhaps one developer aside from Stallman I know who does not use a cell phone).
Since it seems we are not able to convince even the most stalwart defenders of free software to get rid of their cell phone, it is thus immensely important that the cell phones they use run only free software. Aside from the obvious benefits of freedom for cell phone users, including free software developers, it also allows us to avoid the inherent and currently prevalent hypocrisy that arises from using non-free software on a daily basis.
I believe that a free software baseband would also boost interest in and usage of projects aimed at creating a free software operating system for cell phones, such as Replicant. As it stands, developers of such operating systems are currently ceasing their free software efforts at the baseband boundary, leaving part of the phone’s software non-free. This may cause users and developers of the OS to become discouraged, as their efforts to use only free software have ultimately failed due to the non-free baseband. With a free baseband, all software in the phone could be free, allowing developers and users to run only free software and still have a cell phone, if they so choose.
3. Free software multi-platform chat with PSTN/SMS federation
Disclaimer: I am working on this project myself right now (more on that below).
Related to the free software cell phone, this would be another fully free software solution for communicating with people via voice and SMS who are using regular phones (ie. on the PSTN). This project would not require a phone, though it could use one if the user wanted to. It would use a VoIP provider for voice calls (likely over SIP) and a similar provider, but with SMS support, for delivering and receiving SMS. This would allow people to communicate by both voice (on the PSTN) and SMS without needing to use any non-free software at all.
In particular, this would be a free software replacement for Google Voice as well as a free software replacement for Apple’s Continuity (my apologies that Wikipedia doesn’t have an article on this yet), specifically the phone and SMS parts of it. Similar services include MightyText, Pushbullet, and PPL Connect. These proprietary options differ in featureset, but do have some commonalities: they all allow one to send and receive SMS from a computer or tablet.
The features of Google Voice are especially interesting to someone who wishes to make a free software replacement, as it is the only one of the above proprietary options that does not require one have a cell phone at all in order to receive voice calls and SMS (using a cell phone is problematic for the reasons outlined in the free software cell phone section).
The free software replacement would behave like a normal chat client (ie. any XMPP client), but would additionally have a backend that would connect it with the PSTN and SMS networks using a VoIP provider. When both people are using the free software replacement, it could transmit the voice and SMS messages over the Internet or a private network instead of using the telephone network (and could encrypt them if desired). Since real-time communication can be tricky over the Internet, a “dedicated channels” system as described below would help make this option as reliable as PSTN, but it is not necessary for the initial implementation.
Since it appears to me that the VoIP provider would be a “communication service” in this situation, there are no SaaSS issues with this approach. Please comment or send me a message if you believe this to be incorrect.
As mentioned above, I’m actively working on this project, and I have a prototype that performs the SMS portion (though it delivers the SMS via SMS instead of to a chat client so far). If you’re curious how it’s going, please let me know and I can give you an update. It’s not ready to be widely publicized yet, but I’m happy to discuss it individually until then.
4. Simple data replication and retrieval API for hosting
Currently it’s hard for a person to set up their own redundant data storage across multiple physical locations. But it’s easy for them to use someone else’s redundant storage with no monetary remuneration, as is the case with Gmail, Flickr, iCloud, and Dropbox, to name a few. Of course, one pays for these services in non-monetary ways, and there are lots of other issues, not the least of which are the jurisdictional problems that arise from putting your data on a company’s server which is in a country that is not your own.
Ideally all of the services that we use would themselves be redundant (run on multiple servers with seamless failover). However, making this a reality is highly service-dependent (one may be able to replace one web server with another when one fails, due to their relative statelessness, but an XMPP server is not as easily replaced because there are connections and state to retain). As a result, I feel it’s important to focus on one layer of the stack where most of the jurisdictional and other issues arise, a layer that nearly all services rely on, which can be somewhat easily abstracted from the other layers: data storage.
The benefit of this approach is that all of a user’s data could be stored in physical locations that the user has vetted, such as their own home, and/or the homes of their neighbours or friends. To achieve appropriate redundancy, a user would likely want to keep their data in multiple physical locations, so relying on friends or family to host a copy of their data may be advantageous. The data could be encrypted such that the people hosting data for a given user would not be able to access it, though this does add additional complexity at retrieval time. Additionally, users may wish to pay a commercial hosting service to host a copy of their data, in which case encryption becomes essential.
This storage approach is similar in style and intent to that of FreedomBox, though the emphasis here is on having one’s data in multiple locations, rather than just one. The hardware being developed for FreedomBox would be quite helpful, though, as it would provide cheap computers (attached to storage) that people could install in multiple physical locations. Part of the reason for this is that residential ISPs are simply not fast enough or reliable enough to transmit data at the speeds that most users expect from Gmail, Flickr, and Dropbox. Some of the “dedicated channels” work discussed in a later section may help with this, but ultimately we need more competition in the residential ISP space.
There are a variety of filesystems that will take care of the redundant storage aspect, such as Ceph and Tahoe-LAFS. These would be very useful as a backend for a redundant data storage API, but are not sufficient by themselves. One reason is that it can be difficult to set up such a filesystem using only a single node (this has no redundancy, but is no worse than most servers today), which is an important use case as it serves as a helpful starting point for testing. Another is that setup and configuration of these filesystems is too complex.
Configuring a new node should be as simple as:
- Log in to the remote machine that you’d like to add as a storage node, via SSH or similar
- Run a script to install the API hosting services, which takes two parameters (hopefully no more, as we’re trying to keep it simple): the name of the existing or new storage volume (which may include a list of existing nodes) and the directory on this machine to use for storing files in this volume
Once the above are completed, the new node would start copying files from the other nodes to create a new fully-replicated endpoint, optionally using a bandwidth limit to avoid interrupting other services on the same Internet connection (see below for more on that).
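To make the two-parameter idea concrete, here is a minimal sketch of how such an install script might read its arguments. Everything in it is an assumption for illustration: the script name add-storage-node, the "name@node1,node2" way of attaching a list of existing nodes to the volume name, and the function names are all hypothetical, and no actual installation or replication is performed.

```python
import argparse
from pathlib import Path


def parse_volume(spec):
    """Split 'name@node1,node2' into (name, [nodes]); the node list is optional."""
    name, _, nodes = spec.partition("@")
    if not name:
        raise argparse.ArgumentTypeError("volume name must not be empty")
    return name, [n for n in nodes.split(",") if n]


def build_parser():
    parser = argparse.ArgumentParser(
        prog="add-storage-node",  # hypothetical script name
        description="Join this machine to a storage volume.",
    )
    # Parameter 1: the volume name, optionally with existing nodes attached.
    parser.add_argument("volume", type=parse_volume,
                        help="volume name, optionally followed by @node1,node2")
    # Parameter 2: where on this machine the volume's files should live.
    parser.add_argument("directory", type=Path,
                        help="local directory that will hold this volume's files")
    return parser


# Example invocation with an explicit argument list instead of sys.argv.
args = build_parser().parse_args(
    ["photos@alice.example,bob.example", "/srv/photos"]
)
volume_name, existing_nodes = args.volume
```

Keeping the node list inside the first parameter is just one way to stay at two parameters; a real tool might instead discover existing nodes from the volume name alone.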
(My description above of configuring a new node is obviously too complex for new users, but we could achieve a similar end result by creating a preconfigured box that users could buy and attach to their storage volume using a web UI or similar.)
The user whose data is being stored could then query the storage volume using its name to see how many servers are currently operational, how many copies have been made (how redundant it is), and how healthy each copy is (are there bad sectors or other read errors?). This could be done through a shell command and hopefully a web interface.
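As a rough illustration of what such a status query could return, the sketch below summarizes a volume from per-node reports. The NodeStatus fields and the summarize function are hypothetical, and the sketch assumes that every online node holds one full copy of the volume, which a real system would need to verify rather than assume.

```python
from dataclasses import dataclass


@dataclass
class NodeStatus:
    host: str
    online: bool
    read_errors: int  # e.g. bad sectors or failed reads seen on this node


def summarize(volume_name, nodes):
    """Answer the three questions: servers up, copies made, health of each copy."""
    online = [n for n in nodes if n.online]
    healthy = [n for n in online if n.read_errors == 0]
    return {
        "volume": volume_name,
        "servers_online": len(online),
        # Assumption: one full copy per online node.
        "copies": len(online),
        "healthy_copies": len(healthy),
        "nodes_with_errors": sorted(n.host for n in online if n.read_errors),
    }


report = summarize("photos", [
    NodeStatus("home.example", online=True, read_errors=0),
    NodeStatus("friend.example", online=True, read_errors=3),
    NodeStatus("parents.example", online=False, read_errors=0),
])
```

The same dictionary could back both the shell command and the web interface.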
Note that the name of the storage volume itself might be complex. Per Zooko’s triangle, having a decentralized, secure volume may make the name less human-meaningful, though there may be ways around that.
In an ideal situation, an application (such as a web server, email server, or XMPP server) that uses this storage mechanism would be mostly stateless, relying on the storage volume to maintain state, including both configuration data and application data (such as pictures, web pages, emails, or instant message history). This would allow one to easily spin up or failover to another application server if one application server becomes unavailable.
Key to such a storage mechanism succeeding is having an API that makes it easy for applications that wish to deliver the data being stored to do so. This could be a filesystem-like API, but made simpler, with only file listing and file retrieval operations permitted in the usual case, and file saving operations permitted in a special, more secure mode. This would allow for fairly easy email retrieval (emails are separated into different folders on the “filesystem” and if one wants to read an email, the “file” for that email is retrieved and presented to the user), as well as media sharing (e.g. using MediaGoblin) and instant messaging (the XMPP roster being stored in a file, with message history stored in a file per conversation or contact). It also allows for tighter atomicity guarantees, as changes to files can only be made by saving the whole file.
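A minimal sketch of what this three-operation API might look like follows. It is a toy that uses a single local directory in place of a replicated volume, and the class name VolumeStore, its method names, and the writable flag (standing in for the “special, more secure mode”) are all hypothetical. The whole-file save is done by writing to a temporary file and renaming it into place, which is one common way to get the all-or-nothing behaviour described above on a local filesystem.

```python
import os
import tempfile
from pathlib import Path


class VolumeStore:
    """Toy stand-in for a storage volume backed by one local directory."""

    def __init__(self, root, writable=False):
        self.root = Path(root).resolve()
        self.writable = writable  # stands in for the more secure saving mode

    def _resolve(self, name):
        # Refuse names such as '../x' that would leave the volume.
        path = (self.root / name).resolve()
        if path != self.root and self.root not in path.parents:
            raise ValueError(f"{name!r} escapes the volume")
        return path

    def list_files(self, folder=""):
        return sorted(p.name for p in self._resolve(folder).iterdir())

    def read_file(self, name):
        return self._resolve(name).read_bytes()

    def save_file(self, name, data):
        if not self.writable:
            raise PermissionError("volume opened read-only")
        target = self._resolve(name)
        target.parent.mkdir(parents=True, exist_ok=True)
        # Write the whole file under a temporary name, then rename it into
        # place, so readers see either the old contents or the new ones.
        fd, tmp = tempfile.mkstemp(dir=target.parent)
        try:
            with os.fdopen(fd, "wb") as handle:
                handle.write(data)
            os.replace(tmp, target)
        except BaseException:
            os.unlink(tmp)
            raise
```

An email server could then map each mail folder to a directory and each message to one file, reading through a store opened in the default read-only mode and delivering new mail through a separate writable one.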
There is still room for the API to expand into other use cases, as I’m sure that the simple API mentioned above won’t completely replace the SQL databases and configuration data methods of storage that people use today. Part of this is because of search, which is probably an area where the API needs to be expanded. If a user wants to search the body of all their emails, it would be fairly slow to do without a special command that could be run on the storage server itself.
It’s possible that people are already working on this problem and I’m not aware of it, or people are attacking subsets of this problem already. I think that remoteStorage is looking to solve similar problems, but with an API aimed at higher-level languages than might be suitable for all use cases (it assumes people are using browser-based applications, effectively). Also, I know people are working on making web server and web application configuration easier and more consistent. But it feels to me that a more holistic approach is likely to achieve greater long-term benefits, as it can solve both the configuration issues and data storage issues at the same time.
Note that there may be some SaaSS issues with the above approach. If a storage volume is being hosted on someone else’s server, then one could say that that server is doing your computing, which is an example of the SaaSS problem. The problem is mitigated to some extent by using your own hardware for the storage volume, but if someone else has physical access to the device, it might not really be “yours”, at least in a security sense. Furthermore, the application servers that use the storage volume may do some computing of their own (though it’s possible that they could be classified as a “communication service” if the processing they do on the data they receive is minimal). In any case, I believe that further analysis is warranted (and encouraged) on this matter, as it’s unclear how much freedom one is giving up by using the storage methods described above. Feel free to post a comment or contact me with your thoughts on this matter.
X. Remove Gnash from the high priority list
Gnash, which is a free software SWF player (but NOT a free software replacement for Adobe Flash Player, as I’ll discuss below), has been on the high priority list for several years now, and indeed appears at the top of the current list at the time of this writing. I personally believe that it is not a priority anymore and should be removed from the list, as technology has moved on and its goal was unachievable to begin with. In particular:
- SWF is a rapidly-moving target. The Gnash home page states that Gnash “supports most SWF v7 features and some SWF v8 and v9” but that “SWF v10 is not supported” while SWF is now at version 25. It would be very difficult to keep up with these changes even with several full-time developers.
- Gnash does not and cannot support the most often used features of Adobe Flash Player, such as playing DRM-encumbered video. These features are not part of the SWF spec and cannot be safely used outside of Adobe Flash Player in many countries due to their laws prohibiting unlocking technology.
- Few sites require Flash or SWF support anymore. I’ve been browsing the Internet without Flash for over 6 years now and haven’t run into a situation where I needed it to view something that was of interest. As mentioned in the blog post I just linked, many computer manufacturers are not even shipping a Flash player anymore (that was as of 3 years ago; I’m sure that has only compounded since then) and I suspect that most people haven’t bothered installing it, as nothing they use needs it anymore. So why work on a replacement for a technology no one uses?
I do believe that working on a free software vector animation tool would be beneficial. But I would advise against using SWF as a file format, unless one is extremely clear about which versions are supported and it is made clear that new versions will not be supported.
Aside from the free software projects mentioned above that I feel need attention, there are various other projects that are not specifically related to free software, yet feel important in their own right, some of which have been referenced above already (so their link to benefits for free software might already be clear).
The reasons for my interest in such projects stem from my reasons for getting into free software in the first place, which are largely based on control and autonomy, themselves being forms of freedom and necessary for certain types of freedom to exist. So my motivations for improving free software also cause me to look at other ways to improve technology; the more pressing ones I will describe here:
A. Anonymous low-cost online micropayments
Historically, many projects and businesses have been funded by and flourished through anonymous micropayments with no transaction fees. Examples include newspapers, local buses, used bookstores, and charities. However, while the Internet has made many aspects of our lives easier, it is still not possible to make such micropayments online.
This limits the types of “purely online” projects that are able to fund themselves, as many people would consider paying a few cents for a piece of software or a news article, but may be deterred from doing so if it took more than one click or could not be done anonymously. Projects devoted to funding other projects have sprung up, in part, as a result of this, allowing people to combine their donations to lower the per-donation transaction cost (which tends to start at $0.30). Examples include Gratipay and Snowdrift.coop.
What we really need is a mechanism for doing this without an intermediary and with minimal fees. Paying $0.30 on top of a transaction of $0.05 seems unreasonable, and would deter people from paying small amounts for a service or article that interests them. And having an intermediary whose goal is, in large part, to identify people makes it hard for people to make purchases that their government may not approve of.
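To put numbers on this, the short sketch below works through the fee overhead using only the figures mentioned here: a fixed $0.30 fee and a $0.05 payment. The function names and the batch size of 100 are illustrative assumptions, and any percentage-based portion of a real processor’s fee is ignored.

```python
def fee_overhead(payment_cents, fee_cents=30):
    """Fixed fee expressed as a multiple of the payment itself."""
    return fee_cents / payment_cents


def pooled_overhead(payment_cents, payments_per_batch, fee_cents=30):
    """Same ratio when many small payments share a single fixed fee."""
    return fee_cents / (payment_cents * payments_per_batch)


single = fee_overhead(5)          # one $0.05 payment carrying the whole fee
pooled = pooled_overhead(5, 100)  # 100 such payments combined into one

print(f"single payment: fee is {single:.0%} of the amount paid")
print(f"pooled payments: fee is {pooled:.0%} of the amount paid")
```

A lone five-cent payment carries a fee six times its own size, while pooling a hundred of them brings the fee down to a few percent, which is why pooling services help but still depend on an intermediary.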
While many have promoted Bitcoin as an option here, it does not fit the bill, as purchases are made pseudonymously, not anonymously. Furthermore, transactions can take a while to process, especially if the person making the transaction chooses to use a low transaction fee (in which case the Bitcoin network will take longer to process it as there is less incentive for a miner to do so).
There are some projects that have tackled the Bitcoin anonymity problem, such as Darkcoin (which uses a blind signature, in part, to achieve this) and Zerocash, but they both appear to be less than a year old, and so haven’t achieved wide adoption among online projects that may wish to be funded via micropayments. Another option might be Ripple, which uses existing currencies, but may be harder to use anonymously.
What we really need is an easy payment button using an anonymous payment processing option (possibly one of those listed above) that is widely-adopted by sites that are interested in receiving such payments. Furthermore, the button should offer a single-click payment where possible so that the friction involved with making payments is minimized as much as possible. The user would not leave the page, but the button would merely turn green and change the text to say something like “Successfully paid $X to Organization Z!”
An effective way of reducing this friction would be to have a browser add-on that notices such payment buttons and automatically prepopulates the user’s wallet ID (or similar) so that only a single click is required to make a payment. This add-on could also be configured to allow payments less than a certain amount of money with no additional confirmation (only a single click on the button).
Once users have an easy way to make low-cost anonymous micropayments online, I expect many new types of projects will be created, which thrive under such a payment method, but may be difficult to fund otherwise. As with the Internet, we won’t know what new innovations will be unleashed until we build it.
B. Widely-supported SLAs for dedicated channels and/or DiffServ
With most residential ISPs and broadband routers, it can be difficult to guarantee the quality and reliability of real-time communication such as video or voice. This is often because the broadband router is not great at prioritizing real-time traffic under adverse conditions (just try pinging a remote host while uploading a file to see the latency induced on inbound traffic by that simple operation, which often increases twenty-fold). But it can also be because the ISP is not adequately prioritizing the traffic to the next hop, or that some paths along the way to the destination are clogged and not prioritizing the data properly either, so packets are dropped or the latency increases significantly.
The telephone network, on the other hand, has traditionally been designed to handle real-time communication gracefully under almost all conditions. This is done by giving every user a circuit, which effectively guarantees an amount of bandwidth to the user, and then having the carrier set up an appropriate number of circuits between itself and the next carrier on the way to a call’s final destination. Sometimes the number of circuits is underestimated, in which case the user may receive an “all circuits are busy” message. While denying a user’s call is not an optimal situation, it tends to only happen in extraordinary circumstances, and could be planned for in many cases. Outside of such cases, though, the user never encounters any call quality issues as they are always guaranteed an entire circuit of bandwidth, assuming that a circuit is available for their call.
Because of the best effort nature of the Internet, sometimes not enough bandwidth is available from one endpoint to another, and so a call or other real-time communication may suffer from quality issues or be dropped altogether due to other, non-real-time traffic on the network. Since there is no agreed-upon way to prioritize the real-time traffic, the only way to solve the problem is to add capacity – in the meantime, real-time traffic will suffer, without a way for the user to resolve it.
One way to solve this would be for an ISP to guarantee its customers a “dedicated channel” of a certain amount of bandwidth (e.g. 512 kbps) for real-time communication. The user’s software or router could tag the real-time packets and as long as the bandwidth used by these packets didn’t exceed the amount the customer agreed to buy from the ISP, then they would be delivered ahead of all non-real-time traffic to the ISP, guaranteed. The ISP could have similar agreements for multiple channels with its neighbouring ISPs and could pass on the data using its own dedicated channels to reach the destination of the real-time traffic. As with the phone network, if any path along the way had insufficient guaranteed bandwidth to fulfill the request, a rejection message could be sent back, and the two ISPs on each end of the link without enough guaranteed bandwidth would receive a notification. But if the connection to the other end was successfully negotiated, the real-time communication would proceed with guaranteed bandwidth so no wayward upload or other high-traffic operation would impede the quality or reliability of the real-time communication.
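The all-or-nothing negotiation described here can be sketched as a simple admission check over the links on a path. This is only an illustration of the idea: the data layout, the function names, and the example link names are hypothetical, and a real system would need signalling between ISPs rather than a shared list in memory.

```python
def reserve_channel(path, requested_kbps):
    """Reserve bandwidth on every link of a path, or on none of them.

    `path` is a list of dicts with 'link' (a name) and 'free_kbps' keys.
    Returns (True, None) on success, or (False, link_name) naming the first
    link that lacks enough guaranteed bandwidth, so that the ISPs on either
    end of that link can be notified.
    """
    # First pass: check every link before changing anything.
    for hop in path:
        if hop["free_kbps"] < requested_kbps:
            return False, hop["link"]
    # Second pass: every link can honour the request, so commit it.
    for hop in path:
        hop["free_kbps"] -= requested_kbps
    return True, None


def release_channel(path, reserved_kbps):
    """Give the bandwidth back when the call ends."""
    for hop in path:
        hop["free_kbps"] += reserved_kbps


path = [
    {"link": "home<->ISP-A", "free_kbps": 512},
    {"link": "ISP-A<->ISP-B", "free_kbps": 2048},
    {"link": "ISP-B<->callee", "free_kbps": 512},
]
first = reserve_channel(path, 512)   # fits on every link
second = reserve_channel(path, 512)  # the first link is now full
```

The second request is refused with the name of the saturated link, which mirrors the “all circuits are busy” behaviour of the phone network.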
It appears that there is already a standard field in the IP packet header for this, called differentiated services, or DiffServ. As discussed in a paper by Scott Jordan at UC Irvine, the only real hurdle to using DiffServ is that ISPs lack agreements between each other that specify how much bandwidth is being guaranteed. This would normally be done through SLAs, which the ISPs haven’t bothered to set up. So effectively this project proposal is to get ISPs to coordinate with each other and with their residential customers to guarantee certain amounts of bandwidth for real-time communications.
This level of coordination does seem difficult, which is partly why this project isn’t higher in my list. Also, it seems that much of this would happen naturally if only there were competition in the ISP space, as the different ISPs would then compete on the sorts of SLAs they could provide to customers, and how much guaranteed bandwidth would be available. But because there are few players in the ISP space, this sort of change is unlikely.
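For reference, tagging packets on the sending side is already easy; the sketch below marks a UDP socket with the Expedited Forwarding code point (DSCP 46), which is commonly used for voice traffic. The helper names are my own, and the socket call is shown as it works on Linux; other platforms may ignore or restrict it. Marking alone guarantees nothing, since without the SLAs discussed above any network along the way is free to ignore or rewrite the field.

```python
import socket

EF_DSCP = 46  # Expedited Forwarding code point


def dscp_to_tos(dscp):
    """Place a 6-bit DSCP value in the upper six bits of the IPv4 TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must fit in 6 bits")
    return dscp << 2


def open_marked_udp_socket(dscp=EF_DSCP):
    """Create a UDP socket whose outgoing IPv4 packets carry the given DSCP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock
```

A VoIP client could call open_marked_udp_socket() for its media stream and leave its other sockets unmarked.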
C. Cell network without tracking
At Richard Stallman’s 2014-07-21 talk at NYU, after he had explained how cell phones were tracking devices so he did not use one, I asked him if there might be a way to accomplish a similar feature set but without the tracking part. He said that if the device only listened for signals, then it was possible it could be made to not track the user. The user would reveal their location when they made a call or sent a message, but until then they could be free of tracking.
This setup seems feasible to me for cell phone-like communication. The only caveat would be defining a region in which the network should broadcast messages destined for your device, which limits your privacy a little (by defining regions where you are likely to be), but not nearly as much as the current cell network, which can pinpoint you to within a few metres.
For outgoing calls and messages, the network would work the same as cell networks do today: the phone would register with the nearest cell tower and then communicate the message or call it wants to send. But for incoming calls and messages, one would have to first give the network a geographic region where messages should be sent, which would lead to a list of cell towers that would then, when a message was received for that user, repeatedly broadcast the message for a configurable amount of time. This “message” could include an “incoming call” signal so one could just pick up the phone as one does now (but at that point one’s location would be known by the network).
It seems technically feasible to me, though care must be taken to ensure the number of cell towers broadcasting messages does not become too great or that the timeout for repeated broadcasts of a message is not too long, to avoid overloading the network. But it would allow users to maintain location privacy (to a large extent) while still receiving messages.
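The trade-off between region size, repeat timeout, and network load can be sketched with a couple of small functions. This is a simplified model under stated assumptions: towers sit at (x, y) positions in kilometres on a flat plane, the region is a circle chosen by the user, and every tower in the region repeats the message at a fixed interval until the timeout. All names and numbers are hypothetical.

```python
import math


def towers_in_region(towers, center, radius_km):
    """IDs of towers within radius_km of center (flat-plane approximation)."""
    cx, cy = center
    return sorted(
        tower_id
        for tower_id, (x, y) in towers.items()
        if math.hypot(x - cx, y - cy) <= radius_km
    )


def broadcast_load(tower_count, repeat_seconds, interval_seconds):
    """Total transmissions for one message across all towers in the region."""
    repeats = max(1, repeat_seconds // interval_seconds)
    return tower_count * repeats


towers = {"a": (0, 0), "b": (3, 4), "c": (10, 0)}
selected = towers_in_region(towers, center=(0, 0), radius_km=5)
load = broadcast_load(len(selected), repeat_seconds=60, interval_seconds=10)
```

Load grows with the product of the number of towers and the number of repeats, so widening the region for better privacy has to be paid for with a shorter timeout or a longer interval between repeats.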
I’m not an RF expert so I’m not sure if there may be ways for someone to detect the location of a device that is merely “listening” for a given signal. If you know of one, leave a comment or send me a message.
There are other projects, which are broader and more long-term, that I feel are very important, but I’d rather not discuss them here until I have some more concrete ideas to share. If you’re interested in discussing them in the meantime, let me know.
I’m hopeful that all of the above projects will be completed within my lifetime, but even if only a few are completed, that will be a success. There is much work to do in promoting freedom for people everywhere (not just in software and computing, though I focus on that because it’s what I know). In the coming years, I’ll be trying my best to do my part and I hope that you can too.
If you’re curious about anything I’ve mentioned, are interested to know why I prioritized projects the way I did, want some tips for where to start on a project, or just need some clarification, please leave a comment or send me a message. I’d be happy to discuss it further.