Making distributed systems easier
A major source of complexity in implementing distributed systems today is the persistence of data. The core of any system is its ability to process information, and hence the need to persist that information is key.
Unfortunately, the majority of mechanisms for storing and processing data assume that they are the center of the universe: that moving data in and out is not important and hence should be done in bulk.
Even modern solutions such as XML databases (at least the ones that I have worked with) require reading all the data into an in-memory object and then storing it. Many years ago I worked on a data analysis system that was based on the idea of infinite object streams. This made everyone think about what the minimum object was and led them to design distributed processing objects naturally.
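To make the object-stream idea concrete, here is a minimal sketch in Python using the standard library's incremental parser, `xml.etree.ElementTree.iterparse`. It is not the system I worked on, just an illustration of handling one small object at a time instead of loading the whole document. The names `stream_records` and `total_amount`, the `record` tag and the `amount` attribute are all made up for the example.

```python
import io
import xml.etree.ElementTree as ET


def stream_records(source, tag):
    """Yield one <tag> element at a time instead of returning a whole tree."""
    # iterparse emits (event, element) pairs as the parser reads the input.
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            yield elem
            # Detach already-processed children from the root so they can be
            # garbage collected and memory does not grow with the input.
            root.clear()


def total_amount(source):
    """Hypothetical consumer: sum the 'amount' attribute across the stream."""
    return sum(int(rec.get("amount")) for rec in stream_records(source, "record"))


# Made-up sample data; in practice the source could be a file or a socket.
sample = io.BytesIO(
    b"<records>"
    b"<record amount='3'/><record amount='4'/><record amount='5'/>"
    b"</records>"
)
print(total_amount(sample))  # prints 12
```

The consumer only ever sees one record at a time, so the "minimum object" here is a single `record` element. This covers the reading side only; it does nothing for storing data in a stream-friendly way.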
Short of writing a new stream-based XML store myself, has anyone got any ideas?